[SPARK-22667][ML][WIP] Fix model-specific optimization support for ML tuning: Python API #19857

WeichenXu123 · 2017-12-01T07:00:31Z

What changes were proposed in this pull request?

This is the Python API for #19350.

Python CrossValidator/TrainValidationSplit:
With a base Estimator implemented in Scala/Java
→ Convert the base Estimator to a Scala/Java object and call the JVM fit().
With a base Estimator implemented in Python
→ Python needs the same machinery for multi-model fitting and parallelism as Scala, and we can call directly into it. New API added:

class Estimator:
  def parallelFit(self, dataset, paramMaps, threadPool, modelCallback):

Doc link
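
For illustration only, here is a minimal sketch of how a pure-Python Estimator might satisfy the `parallelFit` signature above. It does not use pyspark and is not the code in this PR. `SketchEstimator`, its toy `fit`, the `collect_models` helper, and the `modelCallback(model, index)` calling convention are all hypothetical assumptions. Only the method name and its four parameters come from the description.

```python
from multiprocessing.pool import ThreadPool


class SketchEstimator:
    """Toy stand-in for a pure-Python Estimator (hypothetical, not pyspark)."""

    def fit(self, dataset, paramMap):
        # Toy "model": the dataset mean scaled by the "scale" parameter.
        return paramMap["scale"] * sum(dataset) / len(dataset)

    def parallelFit(self, dataset, paramMaps, threadPool, modelCallback):
        # Assumed semantics: fit one model per param map on the given pool
        # and hand each model back together with its param-map index.
        def fit_one(index):
            model = self.fit(dataset, paramMaps[index])
            modelCallback(model, index)

        # map() blocks until every fitting task has finished.
        threadPool.map(fit_one, range(len(paramMaps)))


def collect_models(estimator, dataset, paramMaps, parallelism=2):
    """Hypothetical caller: gathers fitted models in param-map order."""
    models = [None] * len(paramMaps)

    def on_model(model, index):
        # Each task writes to its own slot, so no lock is needed here.
        models[index] = model

    pool = ThreadPool(processes=parallelism)
    try:
        estimator.parallelFit(dataset, paramMaps, pool, on_model)
    finally:
        pool.close()
        pool.join()
    return models
```

In this sketch the caller owns the thread pool and the callback, so a tuning class such as CrossValidator could decide how many fits run concurrently and what to do with each model (evaluate it, keep it, or discard it) without the Estimator knowing.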

Note: This PR also fixes the `# TODO: persist average/validation metrics as well` in CV/TVS, because the test suite needs to check the consistency of avgMetrics/validationMetrics.
If this needs to be backported to older Spark versions, I can split it into a separate PR.

How was this patch tested?

Existing unit tests already cover each code path that needs testing.

WeichenXu123 · 2017-12-01T07:02:04Z

@MrBago @jkbradley I think this PR needs to be reviewed and merged before #19627 is reviewed, because it changes some critical code paths.

SparkQA · 2017-12-01T07:03:47Z

Test build #84372 has finished for PR 19857 at commit 980c8ec.

This patch fails Python style tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-12-01T07:33:06Z

Test build #84373 has finished for PR 19857 at commit c6f2250.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

WeichenXu123 · 2017-12-19T02:47:55Z

The design of this issue changed. @MrBago will take this over.

init pr

980c8ec

WeichenXu123 mentioned this pull request Dec 1, 2017

[SPARK-21088][ML] CrossValidator, TrainValidationSplit support collect all models when fitting: Python API #19627

Closed

fix python style

c6f2250

WeichenXu123 changed the title ~~[SPARK-22667][ML] Fix model-specific optimization support for ML tuning: Python API~~ [SPARK-22667][ML][WIP] Fix model-specific optimization support for ML tuning: Python API Dec 13, 2017

WeichenXu123 closed this Dec 19, 2017

WeichenXu123 deleted the fix_model_spec_optim_py branch December 19, 2017 02:47
