[SPARK-19590][pyspark][ML] Update the document for QuantileDiscretizer in pyspark #16922
Test build #72850 has finished for PR 16922 at commit
25bdc0f to 9ce7cb8
Test build #72851 has finished for PR 16922 at commit
…r in pyspark This PR is to document the change on QuantileDiscretizer in pyspark for PR: apache#15428 Signed-off-by: VinceShieh <[email protected]>
9ce7cb8 to c5e46fb
Test build #72852 has finished for PR 16922 at commit
holdenk
left a comment
Thanks for following through and updating the Python documentation as well :) One minor question :)
python/pyspark/ml/feature.py
Outdated
NaN handling: Note also that
QuantileDiscretizer will raise an error when it finds NaN values in the dataset, but the user
can also choose to either keep or remove NaN values within the dataset by setting
`handleInvalid`. If the user chooses to keep NaN values, they will be handled specially and
could we maybe link this with a py attr like we did with numBuckets?
yeah, sure. Thanks for pointing that out... ;)
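For readers following the thread, the three NaN-handling modes described in the docstring under review can be sketched in plain Python. This is only an illustration of the documented semantics, not Spark's implementation: `bucketize`, `handle_invalid`, `SPLITS`, and `VALUES` are hypothetical names, the split points are supplied by hand rather than computed from quantiles, and the "keep" branch assumes NaNs go into one extra bucket after the regular ones, as the Spark documentation for QuantileDiscretizer describes.

```python
import bisect
import math

VALID_MODES = ("error", "skip", "keep")


def bucketize(values, splits, handle_invalid="error"):
    """Map each value to a bucket index; bucket i covers [splits[i], splits[i+1]).

    Hypothetical helper that mimics the documented `handleInvalid` options.
    """
    if handle_invalid not in VALID_MODES:
        raise ValueError("unknown handle_invalid: %r" % (handle_invalid,))
    num_buckets = len(splits) - 1
    buckets = []
    for v in values:
        if math.isnan(v):
            if handle_invalid == "error":
                # Default mode: fail as soon as a NaN is seen.
                raise ValueError("NaN found; use 'skip' or 'keep' to allow it")
            if handle_invalid == "keep":
                # Assumption: NaNs land in one extra bucket after the regular ones.
                buckets.append(num_buckets)
            # In 'skip' mode the NaN row is simply dropped.
            continue
        idx = bisect.bisect_right(splits, v) - 1
        # Clamp so that a value equal to the last split stays in the last bucket.
        buckets.append(min(max(idx, 0), num_buckets - 1))
    return buckets


# Four regular buckets, with open-ended outer bounds.
SPLITS = [float("-inf"), 1.0, 2.0, 3.0, float("inf")]
VALUES = [0.5, 1.5, float("nan"), 2.5, 3.5]

print(bucketize(VALUES, SPLITS, "keep"))  # [0, 1, 4, 2, 3]
print(bucketize(VALUES, SPLITS, "skip"))  # [0, 1, 2, 3]
```

With four regular buckets, "keep" sends the NaN to index 4 while "skip" drops that row, and the default "error" mode raises instead of returning anything.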
Signed-off-by: VinceShieh <[email protected]>
Test build #72908 has finished for PR 16922 at commit
Thanks for doing the follow up Python work, merged to master :)
…r in pyspark
## What changes were proposed in this pull request?
This PR is to document the changes on QuantileDiscretizer in pyspark for PR: apache#15428
## How was this patch tested?
No test needed
Signed-off-by: VinceShieh <vincent.xieintel.com>
Author: VinceShieh <[email protected]>
Closes apache#16922 from VinceShieh/spark-19590.
What changes were proposed in this pull request?
This PR documents in pyspark the changes to QuantileDiscretizer made in PR #15428.
How was this patch tested?
No test needed
Signed-off-by: VinceShieh [email protected]