[SPARK-12630][Python][MLlib][DOC] Update param descriptions in classification.py #10598

vijaykiran · 2016-01-05T10:39:56Z

Updates the param descriptions in python mllib's classification.py to be consistent. See [SPARK-11219] for more
details.

BryanCutler · 2016-01-05T19:34:46Z

python/pyspark/mllib/classification.py

remove blank lines here, and around other "Allowed values"

BryanCutler · 2016-01-05T19:51:41Z

thanks @vijaykiran! I marked a few things for correction and I think in general we should extend the comments to the 100 character limit where applicable.

BryanCutler · 2016-01-05T23:53:08Z

python/pyspark/mllib/classification.py

nit: no period after the ending parenthesis in default line

vijaykiran · 2016-01-06T09:29:48Z

@BryanCutler Thanks for the review. I added a new commit and will update the other two PRs with the 100 fill-column as well.

Do you prefer a squashed commit ?

BryanCutler · 2016-01-07T19:02:31Z

LGTM. It's not necessary to squash commits, but could you update the PR title/description to indicate that this is for PySpark MLlib classification?

ping @jkbradley @mengxr

jkbradley · 2016-01-08T23:56:25Z

I just added a note to the parent JIRA about a formatting issue affecting all 5 PRs: [https://issues.apache.org/jira/browse/SPARK-11219?focusedCommentId=15090225&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15090225]
Could you please check it out & ping when I should review again? Thank you!

BryanCutler · 2016-01-14T22:23:09Z

Hi @vijaykiran, will you be able to update the param descriptions to stay within the 74 character limit? Sorry for the mixup, but I think this should be mergeable after that.
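A minimal sketch of how the 74 character limit could be checked locally. It assumes the limit is measured on each docstring line after the common indentation has been stripped, which is an interpretation and is not confirmed in this thread. The `train` function and the helper name `overlong_doc_lines` are hypothetical and exist only for illustration.

```python
import inspect

# Limit mentioned in the review; how it is measured is an assumption here.
MAX_DESC_WIDTH = 74


def overlong_doc_lines(obj, limit=MAX_DESC_WIDTH):
    """Return (line_number, length, text) for docstring lines longer than limit."""
    # inspect.getdoc() strips the common leading indentation and the
    # blank lines at the start and end of the docstring.
    doc = inspect.getdoc(obj) or ""
    return [(number, len(line), line)
            for number, line in enumerate(doc.splitlines(), start=1)
            if len(line) > limit]


def train(data, iterations=100):
    """
    Train a toy model (hypothetical function used only for this check).

    :param data:
      The training data.
    :param iterations:
      The number of iterations, described with a deliberately long sentence that runs past the limit.
      (default: 100)
    """


# Report every docstring line that is wider than the limit.
for number, length, text in overlong_doc_lines(train):
    print("line %d is %d characters wide" % (number, length))
```

Running the same helper over each public class and method of a module would give a quick list of descriptions that still need rewrapping.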

vijaykiran · 2016-01-22T14:09:21Z

ping @BryanCutler - Updated the param descriptions, can you take a look ?

BryanCutler · 2016-01-22T18:42:46Z

I viewed in pydoc with a window size of 80 and LGTM, cc @jkbradley to take a look. Thanks @vijaykiran!

BryanCutler · 2016-01-22T20:26:39Z

python/pyspark/mllib/classification.py

This line is over the 74 char limit

BryanCutler · 2016-01-22T20:28:25Z

@vijaykiran , I missed one line that is a bit too long, could you please fix that, otherwise LGTM

vijaykiran · 2016-01-23T07:16:19Z

ping @BryanCutler @jkbradley - can you guys please take a look again?

BryanCutler · 2016-01-25T18:37:28Z

LGTM

mengxr · 2016-01-25T19:56:19Z

@vijaykiran @BryanCutler Please run `make html` under `python/docs` and check the warning messages. This is what I got:

/Users/meng/src/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:23: ERROR: Unexpected indentation.
/Users/meng/src/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithSGD.train:26: WARNING: Block quote ends without a blank line; unexpected unindent.
/Users/meng/src/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithLBFGS.train:17: ERROR: Unexpected indentation.
/Users/meng/src/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.LogisticRegressionWithLBFGS.train:20: WARNING: Block quote ends without a blank line; unexpected unindent.
/Users/meng/src/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:23: ERROR: Unexpected indentation.
/Users/meng/src/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:26: WARNING: Block quote ends without a blank line; unexpected unindent.
/Users/meng/src/spark/python/pyspark/mllib/regression.py:docstring of pyspark.mllib.regression.LinearRegressionWithSGD:3: ERROR: Unexpected indentation.
/Users/meng/src/spark/python/pyspark/mllib/regression.py:docstring of pyspark.mllib.regression.LinearRegressionWithSGD:4: WARNING: Block quote ends without a blank line; unexpected unindent.
/Users/meng/src/spark/python/pyspark/mllib/regression.py:docstring of pyspark.mllib.regression.RidgeRegressionWithSGD:3: ERROR: Unexpected indentation.
/Users/meng/src/spark/python/pyspark/mllib/regression.py:docstring of pyspark.mllib.regression.RidgeRegressionWithSGD:4: WARNING: Block quote ends without a blank line; unexpected unindent.
/Users/meng/src/spark/python/pyspark/mllib/regression.py:docstring of pyspark.mllib.regression.LassoWithSGD:3: ERROR: Unexpected indentation.
/Users/meng/src/spark/python/pyspark/mllib/regression.py:docstring of pyspark.mllib.regression.LassoWithSGD:4: WARNING: Block quote ends without a blank line; unexpected unindent.
/Users/meng/src/spark/python/pyspark/mllib/regression.py:docstring of pyspark.mllib.regression.IsotonicRegression:7: ERROR: Unexpected indentation.
/Users/meng/src/spark/python/pyspark/mllib/regression.py:docstring of pyspark.mllib.regression.IsotonicRegression:12: ERROR: Unexpected indentation.

Please test the other two PRs as well. Thanks!

vijaykiran · 2016-01-25T20:07:25Z

@mengxr thanks, will take a look today

mengxr · 2016-01-25T20:27:23Z

@vijaykiran Thanks! Please ignore the warnings from regression.py. That will be addressed in https://issues.apache.org/jira/browse/SPARK-12986.
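To focus on the classification.py messages in a log like the one above, the output can be grouped by documented object while skipping regression.py. The following is only a sketch: the regular expression is inferred from the log lines quoted in this thread, and the helper name `group_doc_warnings` is hypothetical.

```python
import re
from collections import defaultdict

# Pattern inferred from the messages quoted in this thread, for example:
# <path>:docstring of <dotted.name>:<line>: ERROR: Unexpected indentation.
LINE_RE = re.compile(
    r"^(?P<path>.+?):docstring of (?P<obj>[\w.]+):(?P<line>\d+): "
    r"(?P<level>ERROR|WARNING): (?P<msg>.*)$"
)


def group_doc_warnings(log_text, skip_suffixes=("regression.py",)):
    """Group docstring messages by object, skipping files in skip_suffixes."""
    grouped = defaultdict(list)
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match is None or match["path"].endswith(skip_suffixes):
            continue
        grouped[match["obj"]].append(
            (int(match["line"]), match["level"], match["msg"]))
    return dict(grouped)


# Three lines copied from the log in this thread.
SAMPLE_LOG = """
/Users/meng/src/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:23: ERROR: Unexpected indentation.
/Users/meng/src/spark/python/pyspark/mllib/classification.py:docstring of pyspark.mllib.classification.SVMWithSGD.train:26: WARNING: Block quote ends without a blank line; unexpected unindent.
/Users/meng/src/spark/python/pyspark/mllib/regression.py:docstring of pyspark.mllib.regression.LassoWithSGD:3: ERROR: Unexpected indentation.
"""

for obj, messages in group_doc_warnings(SAMPLE_LOG).items():
    print(obj, messages)
```

The line numbers in each message are relative to the start of the docstring, not to the source file, so they point at the offending param description directly.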

mengxr · 2016-01-27T01:34:39Z

ok to test

SparkQA · 2016-01-27T02:01:54Z

Test build #50152 has finished for PR 10598 at commit bf8f8a0.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

mengxr · 2016-02-02T18:53:48Z

@vijaykiran Could you fix the doc warnings in classification.py? Thanks!

BryanCutler · 2016-02-02T21:55:28Z

python/pyspark/mllib/classification.py

It looks like we have a problem with the formatting of this param. If we want the html to look the same and get rid of any warnings, then I think it has to be written like this

:param regType:
  The type of regularizer used for training our model.

  :Allowed values:
    - "l1" for using L1 regularization
    - "l2" for using L2 regularization
    - None for no regularization

  (default: "l2")

which, I think, looks ugly in the code and not even that great in the html. Alternatively, we could rewrite it inline similar to how some other params have it, like this

:param regType: The type of regularizer used for training our model. Allowed values: "l1" for using L1 regularization; "l2" for using L2 regularization; None for no regularization. (default: "l2")

Not perfect, but both code and html look decent, imho.
@mengxr , would you be ok with the second inline format?

The first one looks better but no strong preference. I tried the following that also works:

:param regType:
  The type of regularizer used for training our model.
  Allowed values:

  - "l1" for using L1 regularization
  - "l2" for using L2 regularization (default)
  - None for no regularization

Note that `Allowed values` is not a Sphinx keyword.

It's not a huge deal so I'm ok with either, but my issue with the additional blank lines is that in the context of the already large param list, the extra blank lines seem like a little much and the rendered html looks sloppy to me too. Since I noticed other parts describing allowed values in a single paragraph, I thought maybe we should stick to that.

If we use :Allowed values:, it shows up as :Allowed values: in html, which is weird. We should change the existing ones in a separate PR instead of repeating the mistake. I don't think the rendered html changes due to that additional blank line. The spacing is just controlled by the top margin of a list, which could be configured via CSS. This is how it looks in html for my proposal, which looks good to me:

Let's quickly fix this one and merge this PR.

I believe that :Allowed values: would need to be preceded by a blank line for Sphinx to interpret it in bold like before, but I don't think that needs to be done here. What @mengxr proposed above looks fine with me too.
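For reference, a sketch of how the list format discussed above could look inside a complete docstring. The `train` function here is a hypothetical stand-in and not the PySpark implementation. The layout assumes the blank line before the bullet list is what lets reStructuredText parse the items as a list rather than as a continuation of the preceding paragraph.

```python
def train(data, regType="l2"):
    """
    Train a model (hypothetical stand-in, not the PySpark implementation).

    :param data:
      The training data.
    :param regType:
      The type of regularizer used for training our model.
      Allowed values:

      - "l1" for using L1 regularization
      - "l2" for using L2 regularization (default)
      - None for no regularization
    """
    # Reject anything outside the values documented above.
    if regType not in ("l1", "l2", None):
        raise ValueError("unsupported regType: %r" % (regType,))
    return regType


# Print the docstring as it is stored, to eyeball the layout.
print(train.__doc__)
```

Marking the default next to its list item avoids a trailing `(default: ...)` line, which would otherwise need its own blank line after the list.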

@vijaykiran , will you be able to make this fix soon? I could take over and finish this last little bit if you can't.

BryanCutler · 2016-02-12T18:11:58Z

Thanks for working on this @vijaykiran , I finished up the remaining fix in #11183 so please close this one up.
cc @mengxr

mengxr · 2016-02-12T22:25:51Z

@vijaykiran I merged #11183 into master. Do you mind closing this PR? Thanks!

vijaykiran changed the title ~~[SPARK-12630][DOC] Update param descriptions~~ [SPARK-12630][Python][MLib][DOC] Update param descriptions Jan 7, 2016

vijaykiran changed the title ~~[SPARK-12630][Python][MLib][DOC] Update param descriptions~~ [SPARK-12630][Python][MLlib][DOC] Update param descriptions Jan 7, 2016

vijaykiran force-pushed the master branch 2 times, most recently from 178d0b3 to 517421d Compare January 22, 2016 14:08

vijaykiran added 3 commits January 23, 2016 08:13

[SPARK-12630][DOC] Update param descriptions

e74831f

Updates the `param` descriptions consistent. See [SPARK-11219] for more details.

Style fixes

6cb46ca

- Style fixes based on review comments by @BryanCutler. - Changed fill-column to 100 instead of 80.

Limit parameter desciptions to 74 columns

cbd9d08

vijaykiran changed the title ~~[SPARK-12630][Python][MLlib][DOC] Update param descriptions~~ [SPARK-12630][Python][MLlib][DOC] Update param descriptions in classification.py Jan 23, 2016

A couple of more 74 column fixes

bf8f8a0

vijaykiran force-pushed the master branch from 517421d to bf8f8a0 Compare January 23, 2016 07:15

BryanCutler reviewed Feb 2, 2016

vijaykiran closed this Feb 13, 2016

BryanCutler mentioned this pull request Feb 23, 2016

[SPARK-12633][Python][MLlib][DOC] Update param descriptions in regression.py #10600

Closed
