SPARK-9690 Adding the possibility to set the seed of the rand in the … #7997
…CrossValidator fold
|
Can one of the admins verify this patch? |
|
that seems reasonable -- @davies ? |
|
LGTM. cc @mengxr Should we merge this into 1.5? |
Please update this line too.
|
Please make CrossValidator inherit from HasSeed. |
|
@mmenestret |
This is also not necessary if we make CrossValidator extend HasSeed.
|
Ping! |
|
Ping. Please let me know if you don't have time to push an update. |
|
I'm sorry, I'm on holiday right now. I'll be back in a week; is that OK? |
|
Oh OK no problem. Enjoy your holiday! |
|
Ping! Btw, the 1.6 code freeze is scheduled for the end of this month. |
|
@mmenestret I'm going to create a new PR based on yours. You'll still be the primary author. Please close this issue/PR, thanks. |
Extend CrossValidator with HasSeed in PySpark.
This PR replaces [#7997]
CC: yanboliang thunterdb mmenestret Would one of you mind taking a look? Thanks!
Author: Joseph K. Bradley <[email protected]>
Author: Martin MENESTRET <[email protected]>
Closes #10268 from jkbradley/pyspark-cv-seed.
The fold assignment in the ML CrossValidator depends on a rand column whose seed is hard-coded to 0, which leads the sql.functions rand to call sc._jvm.functions.rand() with no seed, so the folds can differ from run to run.
To make a cross validation unit-testable, it should be possible to set this seed so that the output of the cross validation (with featureSubsetStrategy set to "all") is always the same.
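To illustrate the problem described above, here is a minimal pure-Python sketch. It does not use Spark: `random.Random` stands in for the rand column, and `pick_rng_buggy`, `pick_rng_fixed`, and `assign_folds` are hypothetical names made up for this example. It assumes the unseeded call comes from a truthiness check on the seed (so that 0 is treated like "no seed"), and it buckets each row's random draw into one of `num_folds` equal-width ranges.

```python
import random


def pick_rng_buggy(seed=None):
    # Assumed pitfall: `if seed` is False for seed=0, so a seed of 0
    # silently falls through to the unseeded branch.
    return random.Random(seed) if seed else random.Random()


def pick_rng_fixed(seed=None):
    # Only a missing seed (None) selects the unseeded generator.
    return random.Random(seed) if seed is not None else random.Random()


def assign_folds(n_rows, num_folds, seed=None, pick_rng=pick_rng_fixed):
    """Give each row a fold index by bucketing a uniform draw in [0, 1)."""
    rng = pick_rng(seed)
    h = 1.0 / num_folds  # width of each fold's range
    # min() guards against a floating-point quotient landing on num_folds.
    return [min(int(rng.random() / h), num_folds - 1) for _ in range(n_rows)]


if __name__ == "__main__":
    same = assign_folds(200, 3, seed=0) == assign_folds(200, 3, seed=0)
    print("seed honoured, folds repeat:", same)
    buggy_same = (assign_folds(200, 3, seed=0, pick_rng=pick_rng_buggy)
                  == assign_folds(200, 3, seed=0, pick_rng=pick_rng_buggy))
    print("seed=0 dropped, folds repeat:", buggy_same)
```

With a settable seed that is actually passed through, a unit test can assert on the exact fold split rather than only on aggregate properties.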