Skip to content

Conversation

@yanboliang
Copy link
Contributor

Currently LogisticRegressionWithLBFGS in python/pyspark/mllib/classification.py will invoke callMLlibFunc with a wrong "regType" parameter.
It was assigned to "str(regType)" which translate None(Python) to "None"(Java/Scala). The right way should be translate None(Python) to null(Java/Scala) just as what we did at LogisticRegressionWithSGD.

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@yanboliang yanboliang changed the title correct LogisticRegressionWithLBFGS regType parameter for pyspark [SPARK-6080] [PySpark] correct LogisticRegressionWithLBFGS regType parameter for pyspark Feb 28, 2015
@mengxr
Copy link
Contributor

mengxr commented Mar 1, 2015

add to whitelist

@mengxr
Copy link
Contributor

mengxr commented Mar 1, 2015

ok to test

@SparkQA
Copy link

SparkQA commented Mar 1, 2015

Test build #28150 has started for PR 4831 at commit 12db65a.

  • This patch merges cleanly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we use str(regType) if regType else None?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think directly use regType is enough, because py4j can translate "l1","l2",None in Python to "l1","l2",null in Java/Scala smoothly.
I found the peer functions LogisticRegressionWithSGD and SVMWithSGD also directly use regType and it can work well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference is the error message. If we use str(...), the Java function call would be successful and inside the Java function, we say the input regType is not recognized.

If we don't use str(..), if users input something like 1, 2. Py4j will throw an error saying there is no Java function matching the input signature, which usually confuses users.

Anyway, this is a minor issue for regType.

@SparkQA
Copy link

SparkQA commented Mar 1, 2015

Test build #28150 has finished for PR 4831 at commit 12db65a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28150/
Test PASSed.

@mengxr
Copy link
Contributor

mengxr commented Mar 2, 2015

LGTM. Merged into master and branch-1.3. Thanks!

@asfgit asfgit closed this in af2effd Mar 2, 2015
asfgit pushed a commit that referenced this pull request Mar 2, 2015
…rameter for pyspark

Currently LogisticRegressionWithLBFGS in python/pyspark/mllib/classification.py will invoke callMLlibFunc with a wrong "regType" parameter.
It was assigned to "str(regType)" which translate None(Python) to "None"(Java/Scala). The right way should be translate None(Python) to null(Java/Scala) just as what we did at LogisticRegressionWithSGD.

Author: Yanbo Liang <[email protected]>

Closes #4831 from yanboliang/pyspark_classification and squashes the following commits:

12db65a [Yanbo Liang] correct LogisticRegressionWithLBFGS regType parameter for pyspark

(cherry picked from commit af2effd)
Signed-off-by: Xiangrui Meng <[email protected]>
@yanboliang yanboliang deleted the pyspark_classification branch April 24, 2015 10:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants