@VinceShieh

What changes were proposed in this pull request?

This PR documents the changes to QuantileDiscretizer in pyspark that were made in PR #15428.

How was this patch tested?

No test needed

Signed-off-by: VinceShieh [email protected]

@VinceShieh VinceShieh changed the title [SPARK-19590][pyspark][ML] update the document for QuantileDiscretize… [SPARK-19590][pyspark][ML] Update the document for QuantileDiscretizer in pyspark Feb 14, 2017
@SparkQA
SparkQA commented Feb 14, 2017

Test build #72850 has finished for PR 16922 at commit 25bdc0f.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
SparkQA commented Feb 14, 2017

Test build #72851 has finished for PR 16922 at commit 9ce7cb8.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

…r in pyspark

This PR is to document the change on QuantileDiscretizer in pyspark for PR:
apache#15428

Signed-off-by: VinceShieh <[email protected]>
@SparkQA
SparkQA commented Feb 14, 2017

Test build #72852 has finished for PR 16922 at commit c5e46fb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@holdenk holdenk left a comment

Thanks for following through and updating the Python documentation as well :) One minor question :)

NaN handling: Note also that
QuantileDiscretizer will raise an error when it finds NaN values in the dataset, but the user
can also choose to either keep or remove NaN values within the dataset by setting
`handleInvalid`. If the user chooses to keep NaN values, they will be handled specially and
Contributor

could we maybe link this with a py attr like we did with numBuckets?

Author

yeah, sure. Thanks for pointing that out... ;)
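
For readers who don't know the `handleInvalid` behaviour documented in this thread, here is a rough plain-Python sketch of the three modes as I understand them: `error` raises on NaN, `skip` drops NaN rows, and `keep` puts NaN values into one extra bucket after the regular ones. The `bucketize` helper and its `handle_invalid` argument are hypothetical names used for illustration only. They are not part of pyspark, and this is not how Spark implements the feature.

```python
import math
from bisect import bisect_right


def bucketize(values, splits, handle_invalid="error"):
    """Map each value to a bucket index using sorted split points.

    Hypothetical helper (not a pyspark API). It only imitates the documented
    NaN behaviour of the three handleInvalid modes.
    """
    num_buckets = len(splits) - 1
    result = []
    for v in values:
        if math.isnan(v):
            if handle_invalid == "error":
                raise ValueError("NaN found; use 'skip' or 'keep' to allow it")
            if handle_invalid == "skip":
                continue  # drop the row entirely
            # "keep": NaN goes into one extra bucket after the regular ones
            result.append((v, num_buckets))
            continue
        # Regular value: bucket i covers [splits[i], splits[i + 1])
        idx = bisect_right(splits, v) - 1
        # Clamp so the upper bound falls into the last regular bucket
        result.append((v, min(idx, num_buckets - 1)))
    return result


# Assumed example data: two buckets split at 0.4, plus two NaN rows.
splits = [float("-inf"), 0.4, float("inf")]
data = [0.1, 0.4, 1.2, 1.5, float("nan"), float("nan")]

kept = bucketize(data, splits, handle_invalid="keep")
print([bucket for _, bucket in kept])  # [0, 1, 1, 1, 2, 2]

skipped = bucketize(data, splits, handle_invalid="skip")
print(len(skipped))  # 4
```

With two regular buckets, the non-NaN values land in buckets 0 and 1, and `keep` assigns the NaN rows to bucket 2. The default mode raises instead of guessing.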

Signed-off-by: VinceShieh <[email protected]>
@SparkQA
SparkQA commented Feb 15, 2017

Test build #72908 has finished for PR 16922 at commit 56e708f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk
Contributor

holdenk commented Feb 15, 2017

Thanks for doing the follow up Python work, merged to master :)

@asfgit asfgit closed this in 6eca21b Feb 15, 2017
cmonkey pushed a commit to cmonkey/spark that referenced this pull request Feb 16, 2017
…r in pyspark

## What changes were proposed in this pull request?
This PR is to document the changes on QuantileDiscretizer in pyspark for PR:
apache#15428

## How was this patch tested?
No test needed

Signed-off-by: VinceShieh <vincent.xieintel.com>

Author: VinceShieh <[email protected]>

Closes apache#16922 from VinceShieh/spark-19590.