[SPARK-9805] [MLLIB] [PYTHON] [STREAMING] Added _eventually for ml streaming pyspark tests #8087
Conversation
python/pyspark/mllib/tests.py
There is still a slight possibility that between the last time term_check() is called in _ssc_wait_checked and the next time it's called in this method, another batch may have been processed, which would fail the test unnecessarily. So a better approach would be for the _ssc_wait_checked method to return True if term_check() has succeeded within the timeout, and otherwise return False. Then there is no need to check term_check() once again.
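A minimal sketch of the pattern suggested here (the helper name follows the discussion; the poll interval is an assumption):

```python
import time

def _ssc_wait_checked(term_check, timeout, poll=0.01):
    """Sketch of the suggestion above: poll term_check() until it passes or
    the timeout expires, and report success via the return value so the
    caller never re-runs the check (avoiding the race described here)."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if term_check():
            return True
        time.sleep(poll)
    return False
```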
These tests should pass whenever all batches have been processed, so the current setup should be safe. I'm actually thinking of copying the checks so that assertions print out more useful error messages. (I don't see a great way to avoid copying the checks if I want them for both early stopping & useful error messages.)
Test build #40354 has finished for PR 8087 at commit
Test build #40357 has finished for PR 8087 at commit
Jenkins test this please
Test build #40495 has finished for PR 8087 at commit
Test build #40502 has finished for PR 8087 at commit
Yay it passed! If this looks reasonable, I'll make similar changes for the other streaming ML pyspark tests.
Nice! I think this is a solid strategy. Maybe in the next round of changes make that
I think you can make a generic equivalent of Scalatest's eventually. Then that solves the problem I alluded to earlier about a possible race condition.
@tdas Sure, I can do that. I don't think the race condition matters for ML tests (or if it does, then the test was written incorrectly), but that does clarify semantics. I guess I'll have to duplicate the check code no matter what to get nice error messages.
Actually, I'm going to switch the design instead: condition() will perform the standard unittest assertions itself, and the wait method will catch AssertionErrors and keep retrying until the timeout, re-raising the last error on failure.
That will allow us to (a) avoid copying the set of checks and (b) take advantage of the many assertion variants, including approximate equality. AFAIK, the overhead of catching errors should be negligible compared to the time for the tests. (Correct me if I'm wrong here.)
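A minimal sketch of that design, assuming a condition() that raises AssertionError on failure (the helper name and poll interval are illustrative):

```python
import time

def _wait_for_assertions(condition, timeout=30.0, poll=0.01):
    """Illustrative sketch: condition() performs unittest-style assertions.
    AssertionErrors are caught and retried until the timeout, at which point
    the most recent error is re-raised with its original message."""
    deadline = time.time() + timeout
    last_error = None
    while time.time() < deadline:
        try:
            condition()
            return  # all assertions passed; stop early
        except AssertionError as e:
            last_error = e
        time.sleep(poll)
    if last_error is not None:
        raise last_error
    raise AssertionError("condition() was never evaluated within %g sec" % timeout)
```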
Test build #40578 has finished for PR 8087 at commit
Jenkins test this please
What if condition() is called before any results have been generated?
Yeah, I should document that. I made sure to make condition() work for those cases (e.g., checking the result array's length instead of values in the result array that might not yet exist).
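For example, a condition written this way stays safe even when it fires before any output exists (the helper and variable names here are hypothetical):

```python
def make_condition(test, results, expected_batches):
    """Hypothetical helper: the length assertion runs first, so a call made
    before any batch has finished raises a clean AssertionError (which the
    wait loop retries) rather than an IndexError from results[-1]."""
    def condition():
        test.assertEqual(len(results), expected_batches)
        test.assertIsNotNone(results[-1])  # only safe after the length check
        return True
    return condition
```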
Test build #40598 has finished for PR 8087 at commit
Test build #1474 has finished for PR 8087 at commit
Working on improvements...
OK everyone, I think that should fix things...but we'll wait and see. I changed the logic of eventually to support the two types of tests: ones that have a simple condition to check and cannot stop early, and ones that can stop early once all batches have been processed.
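A sketch of that two-mode helper, consistent with the description above (the merged code in python/pyspark/mllib/tests.py may differ in detail):

```python
import time

def _eventually(condition, timeout=30.0, catch_assertions=False):
    """Wait up to `timeout` seconds for `condition` to succeed.

    - catch_assertions=False: condition() returns True when satisfied; any
      other return value means "keep waiting" and is reported on timeout.
    - catch_assertions=True: condition() raises AssertionError on failure;
      errors are caught and retried, and the last one is re-raised on timeout.
    """
    start = time.time()
    last_value = None
    while time.time() - start < timeout:
        if catch_assertions:
            try:
                last_value = condition()
            except AssertionError as e:
                last_value = e
        else:
            last_value = condition()
        if last_value is True:
            return  # stop early: condition met
        time.sleep(0.01)
    if isinstance(last_value, AssertionError):
        raise last_value
    raise AssertionError(
        "Test failed due to timeout after %g sec, with last condition value: %s"
        % (timeout, last_value))
```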
Test build #40678 has finished for PR 8087 at commit
Test build #40688 has finished for PR 8087 at commit
LGTM. @tdas Do you want to make a final pass?
Increasing timeouts in the spirit of robustness...and testing again for fun.
But yeah, @tdas, I'll wait for your final OK.
Test build #40816 has finished for PR 8087 at commit
LGTM!
OK, I'll merge this with master and branch-1.5 then. Thanks for reviewing, everyone!
Recently, PySpark ML streaming tests have been flaky, most likely because the batches are not being processed in time. Proposal: Replace the use of _ssc_wait (which waits for a fixed amount of time) with a method that waits up to a maximum amount of time but can terminate early based on a termination condition method. With this, we can extend the waiting period (to make tests less flaky) while also stopping early when possible (making tests faster on average, which I verified locally).
CC: @mengxr @tdas @freeman-lab
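As an illustration of the proposal, a test might swap a fixed sleep for the condition-based wait like this (the class, counts, and data here are hypothetical; _eventually is the helper sketched earlier in the thread):

```python
import unittest

class StreamingMLTestExample(unittest.TestCase):
    """Hypothetical before/after for the flaky-test fix described above."""

    def test_all_batches_processed(self):
        results = list(range(10))  # stand-in for output a real test collects via foreachRDD
        expected_batches = 10      # illustrative count

        # Old pattern: self._ssc_wait(...) slept for a fixed period regardless of progress.
        # New pattern: wait up to 60 sec, but return as soon as the checks pass.
        def condition():
            self.assertEqual(len(results), expected_batches)
            return True

        _eventually(condition, timeout=60.0, catch_assertions=True)
```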