Skip to content

Conversation

@yinxusen
Copy link
Contributor

@yinxusen yinxusen commented Dec 8, 2015

@SparkQA
Copy link

SparkQA commented Dec 8, 2015

Test build #47306 has finished for PR 10186 at commit a5e72ad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * class ChiSqSelector(JavaEstimator, HasFeaturesCol, HasOutputCol, HasLabelCol):\n * class ChiSqSelectorModel(JavaModel):\n

@SparkQA
Copy link

SparkQA commented Dec 8, 2015

Test build #47307 has finished for PR 10186 at commit f49e231.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * class ChiSqSelector(JavaEstimator, HasFeaturesCol, HasOutputCol, HasLabelCol):\n * class ChiSqSelectorModel(JavaModel):\n

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove #

@yanboliang
Copy link
Contributor

Looks good except minor issues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think we probably don't need the "#"s in the pydoc

@yinxusen
Copy link
Contributor Author

Thanks for comments @holdenk and @yanboliang. It's so strange that I cannot see comments from @yanboliang in this page. It must be a Github issue.

I don't know whether we can catch up for 1.6? If not, I'll change the tag into 1.7 later.

@SparkQA
Copy link

SparkQA commented Dec 10, 2015

Test build #47462 has finished for PR 10186 at commit 657a0d4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):\n * class ChiSqSelector(JavaEstimator, HasFeaturesCol, HasOutputCol, HasLabelCol):\n * class ChiSqSelectorModel(JavaModel):\n

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This model is loadable and saveable in Java, I don't see us doing this elsewhere in ml/ yet (although we do it in mllib/) but do we maybe want to use the JavaLoader & JavaSaveable base classes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Model persistence is important in PySpark, but there is no need to add it in this PR. @yanboliang has a JIRA for adding pipeline persistence in PySpark: https://issues.apache.org/jira/browse/SPARK-11939

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add the selectedFeatures method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

@yanboliang
Copy link
Contributor

LGTM

@thunterdb
Copy link
Contributor

LGTM cc @jkbradley

@yinxusen
Copy link
Contributor Author

Change the version to 2.2.0

@SparkQA
Copy link

SparkQA commented Jan 11, 2016

Test build #49141 has finished for PR 10186 at commit aa9d40f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indent 1 more space (this line + next line)

@jkbradley
Copy link
Member

Thanks for the PR! I only had a couple more comments.

@yinxusen
Copy link
Contributor Author

test it please

@SparkQA
Copy link

SparkQA commented Jan 12, 2016

Test build #49243 has finished for PR 10186 at commit 0bd1271.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @jkbradley

I use a javaSelectedFeatures because I find that if I use self._call_java("selectedFeatures"), it returns a array('i', [3]), which is strange since the result should be [3]. I doubt that there is something wrong in SerDe.dumps(javaObject) in Scala side then deserialize it in Python side with Scala Array.

@yinxusen
Copy link
Contributor Author

@jkbradley I also find an inconsistency returning value so I leave a JIRA here: https://issues.apache.org/jira/browse/SPARK-12780

@yinxusen
Copy link
Contributor Author

And what's more, the CountVectorizerModel.vocabulary in feature.py should return an array according to its Scala part, which is also a Scala Array. However, it returns a Python Tuple.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't needed for Java, so I'd make it a private API. But hopefully we can remove this altogether once [https://github.com//pull/10724] gets merged.

@jkbradley
Copy link
Member

@yinxusen Thanks! Let's get your other PR in first, and then update this PR.

@yinxusen
Copy link
Contributor Author

@jkbradley Yes, sure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkbradley I learn from @holdenk' PR #10085, we can transform the JavaArray directly with list in Python. So there is no need to call _call_java().

@SparkQA
Copy link

SparkQA commented Jan 15, 2016

Test build #49453 has finished for PR 10186 at commit 223fdf4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Copy link
Member

See comment on [https://github.com//pull/10724]. We'll return to this PR after [https://github.com//pull/10772] gets merged.

@yinxusen
Copy link
Contributor Author

test it please

@SparkQA
Copy link

SparkQA commented Jan 26, 2016

Test build #50112 has finished for PR 10186 at commit 3fca95e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yinxusen
Copy link
Contributor Author

@jkbradley Another PR related to #10772.

@jkbradley
Copy link
Member

LGTM
Merging with master
Thanks for this & the other fix!

@asfgit asfgit closed this in 8beab68 Jan 26, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants