[SPARK-11893] Model export/import for spark.ml: TrainValidationSplit #9971
|
Test build #46685 has finished for PR 9971 at commit
|
|
test it please |
|
Test build #49134 has finished for PR 9971 at commit
|
|
test it please |
|
Test build #49140 has finished for PR 9971 at commit
|
|
Ping @jkbradley |
|
@yinxusen Hi, sorry for the long wait. I'd like to review this now. Could you please rebase from master? |
|
Sure, will do it soon.
|
|
test it please |
|
Test build #53973 has finished for PR 9971 at commit
|
|
test it please |
|
Test build #53976 has finished for PR 9971 at commit
|
|
Ok thanks! I'll review this now |
|
Initial comment: I like the idea of testing Validators jointly, but I don't think creating a MyValidator class is the best way since it involves a lot of new code and since that new code is mimicking what is already in CrossValidator. I'd prefer to keep the original CrossValidatorSuite tests since I believe those test the same functionality. |
   * Examine the given estimator (which may be a compound estimator) and extract a mapping
   * from UIDs to corresponding [[Params]] instances.
   */
  def getUidMap(instance: Params): Map[String, Params] = {
Did you mean to use this in CrossValidator as well? There is still a copy in CrossValidator.scala.
Also, this method and its helper should probably live in ml/util/ReadWrite.scala since they apply to tuning, Pipeline, and classification (OneVsRest)
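For readers following the review, here is a minimal sketch of what a shared UID-map helper of this kind could look like. It is not the code under review: `Stage`, `Leaf`, `PipelineLike`, and `ValidatorLike` are hypothetical stand-ins for Spark's `Params` hierarchy so the example runs without Spark, and the duplicate-UID check is an assumption about desirable behavior.

```scala
// Hypothetical stand-ins for the spark.ml Params hierarchy (not the real classes).
sealed trait Stage { def uid: String }
final case class Leaf(uid: String) extends Stage
final case class PipelineLike(uid: String, stages: Seq[Stage]) extends Stage
final case class ValidatorLike(uid: String, estimator: Stage, evaluator: Stage) extends Stage

object UidMapSketch {
  // Flatten a possibly nested stage into (uid, stage) pairs, parent first.
  private def flatten(instance: Stage): List[(String, Stage)] = {
    val children: Seq[Stage] = instance match {
      case p: PipelineLike  => p.stages
      case v: ValidatorLike => Seq(v.estimator, v.evaluator)
      case _: Leaf          => Seq.empty
    }
    (instance.uid -> instance) :: children.toList.flatMap(flatten)
  }

  // Build the UID map; reject duplicate UIDs so that lookups stay unambiguous.
  // The message uses the runtime class name instead of a hard-coded one.
  def getUidMap(instance: Stage): Map[String, Stage] = {
    val pairs = flatten(instance)
    val uidMap = pairs.toMap
    require(pairs.size == uidMap.size,
      s"${instance.getClass.getName} contains duplicate UIDs")
    uidMap
  }
}
```

Because the helper only depends on how each compound type exposes its children, a single copy in a shared utility can serve tuning, pipelines, and other meta-algorithms alike.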
Sorry, that's an error introduced by merging with master. I'll fix it soon.
|
I'll hold off on a detailed review until these initial items are addressed since they will require significant code movement. |
|
Sure, I'll ping you after that.
|
|
Test build #54086 has finished for PR 9971 at commit
|
|
Ping @jkbradley, I've fixed those. You can move on now. |
|
Test build #54089 has finished for PR 9971 at commit
|
      case v: ValidatorParams => Array(v.getEstimator, v.getEvaluator)
      case ovr: OneVsRestParams =>
        // TODO: SPARK-11892: This case may require special handling.
        throw new UnsupportedOperationException("CrossValidator write will fail because it" +
"CrossValidator" --> ${instance.getClass.getName}
|
Thanks! I just made another pass. I'll check again since this will require a bit more code movement. |
|
Thanks @jkbradley I moved the validateParams into ValidatorParams, and changed the |
|
Test build #54220 has finished for PR 9971 at commit
|
|
Test build #54225 has finished for PR 9971 at commit
|
|
@jkbradley OK for another look |
|
@yinxusen Thanks for the update. It looks good, but there was one missed item + a few more I found. I'm going to send a PR to update this PR. |
|
I just sent a PR: [https://github.com/yinxusen/pull/5] |
Yinxusen spark 11893 cleanups
|
Merged! @jkbradley |
|
test it please |
|
Test build #54355 has finished for PR 9971 at commit
|
|
LGTM |
https://issues.apache.org/jira/browse/SPARK-11893
@jkbradley In order to share read/write with TrainValidationSplit, I moved SharedReadWrite out of CrossValidator into a new trait SharedReadWrite in the tuning package.
To reduce repeated tests, I moved the complex tests from CrossValidatorSuite to SharedReadWriteSuite and created a fake validator called MyValidator to test the shared code.
With SharedReadWrite, any newly added Validator can share the common read/write part and only needs to implement save/load for its extra params.
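The design described above can be illustrated with a small, self-contained sketch. Everything here is hypothetical: metadata is modeled as a plain `Map[String, String]` rather than files on disk, and `SharedReadWriteSketch`, `CrossValidatorLike`, and `TrainValidationSplitLike` are illustrative names, not the classes in this PR.

```scala
// Hypothetical shared trait: handles the fields every validator has in common.
trait SharedReadWriteSketch {
  def saveCommon(uid: String, estimatorUid: String, evaluatorUid: String): Map[String, String] =
    Map("uid" -> uid, "estimator" -> estimatorUid, "evaluator" -> evaluatorUid)

  def loadCommon(meta: Map[String, String]): (String, String, String) =
    (meta("uid"), meta("estimator"), meta("evaluator"))
}

final case class CrossValidatorLike(
    uid: String, estimatorUid: String, evaluatorUid: String, numFolds: Int)

object CrossValidatorLike extends SharedReadWriteSketch {
  // Only the extra param (numFolds) is handled here; the rest is shared.
  def save(cv: CrossValidatorLike): Map[String, String] =
    saveCommon(cv.uid, cv.estimatorUid, cv.evaluatorUid) + ("numFolds" -> cv.numFolds.toString)

  def load(meta: Map[String, String]): CrossValidatorLike = {
    val (uid, est, ev) = loadCommon(meta)
    CrossValidatorLike(uid, est, ev, meta("numFolds").toInt)
  }
}

final case class TrainValidationSplitLike(
    uid: String, estimatorUid: String, evaluatorUid: String, trainRatio: Double)

object TrainValidationSplitLike extends SharedReadWriteSketch {
  // Only the extra param (trainRatio) is handled here; the rest is shared.
  def save(tvs: TrainValidationSplitLike): Map[String, String] =
    saveCommon(tvs.uid, tvs.estimatorUid, tvs.evaluatorUid) + ("trainRatio" -> tvs.trainRatio.toString)

  def load(meta: Map[String, String]): TrainValidationSplitLike = {
    val (uid, est, ev) = loadCommon(meta)
    TrainValidationSplitLike(uid, est, ev, meta("trainRatio").toDouble)
  }
}
```

In this sketch each validator adds a single key on top of the shared metadata, which mirrors the stated goal that new validators implement save/load only for their extra params.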