[SPARK-5363] [PySpark] check ending mark in non-block way #4601
Conversation
Test build #27463 has started for PR 4601 at commit
Test build #27463 has finished for PR 4601 at commit
Test PASSed.
Typo: "Tt"
Can we add logging for the uncommon cases here? I'd add a log message for the case where the next integer is not available, and a second for the case when it's not END_OF_STREAM (that message should include the actual integer received).
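A hedged sketch of what those two requested log paths could look like; the object and helper names, the placeholder marker value, and the message wording are assumptions for illustration, not the PR's actual identifiers:

```scala
object EndingMarkLogMessages {
  // Placeholder for the expected ending-mark value; not necessarily Spark's actual constant.
  val END_OF_STREAM = -4

  // Maps what was (or was not) read from the worker to the uncommon-case log message.
  def uncommonCaseMessage(bytesAvailable: Int, markRead: => Int): Option[String] =
    if (bytesAvailable < 4)
      Some("Ending mark from the worker is not available")      // case 1: nothing buffered to read
    else markRead match {
      case END_OF_STREAM => None                                // clean ending, nothing uncommon to log
      case other => Some(s"Unexpected ending mark from the worker: $other") // case 2: include the integer
    }

  def main(args: Array[String]): Unit = {
    println(uncommonCaseMessage(0, 0))              // Some(Ending mark ... not available)
    println(uncommonCaseMessage(4, 12345))          // Some(Unexpected ending mark ... 12345)
    println(uncommonCaseMessage(4, END_OF_STREAM))  // None
  }
}
```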
Test build #27598 has started for PR 4601 at commit
Test build #27600 has started for PR 4601 at commit
The "ended cleanly" case can probably stay at logInfo (or maybe logDebug), but I think we should make this case and the other error-case into warnings so that they aren't swallowed at lower log levels.
It's a normal case that could happen when the user calls take(), so I think it should be INFO.
For the first one, either INFO or DEBUG works for me.
Minor log-level nitpicking aside, this looks good to me.
I'd like to use INFO here; then we can expect one log line per task, saying whether the worker was re-used or not.
@JoshRosen updated
Test build #27602 has started for PR 4601 at commit
Test build #27602 has finished for PR 4601 at commit
Test FAILed.
Test build #27600 has finished for PR 4601 at commit
Test FAILed.
Test build #27598 has finished for PR 4601 at commit
Test PASSed.
LGTM. I'm going to merge this into
There is a chance of deadlock where the Python process is waiting for the ending mark from the JVM, but that mark has been eaten by a corrupted stream. This PR checks the ending mark from Python in a non-blocking way, so it will not be blocked by the Python process. There is a small chance that the ending mark was sent by the Python process but is not yet available; in that case the Python process will not be re-used.

cc JoshRosen pwendell

Author: Davies Liu <[email protected]>

Closes #4601 from davies/freeze and squashes the following commits:

e15a8c3 [Davies Liu] update logging
890329c [Davies Liu] Merge branch 'freeze' of github.com:davies/spark into freeze
2bd2228 [Davies Liu] add more logging
656d544 [Davies Liu] Update PythonRDD.scala
05e1085 [Davies Liu] check ending mark in non-block way

(cherry picked from commit ac6fe67)

Signed-off-by: Josh Rosen <[email protected]>
I merged this into
@JoshRosen This does not work very well in practice; it's common to see workers that cannot be re-used. I will try to find a better solution, or should we revert this? (It seems that it did not solve the freeze problem.)
Yeah, let's revert and continue to investigate.
Great, thanks!
Reverted in
There is a chance of deadlock where the Python process is waiting for the ending mark from the JVM, but that mark has been eaten by a corrupted stream.
This PR checks the ending mark from Python in a non-blocking way, so it will not be blocked by the Python process.
There is a small chance that the ending mark was sent by the Python process but is not yet available; in that case the Python process will not be re-used.
cc @JoshRosen @pwendell
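A minimal sketch of the non-blocking ending-mark check described above, under the assumption that the JVM side reads from a buffered socket stream. The constant value, names, and log wording are illustrative stand-ins, not the PR's exact code in PythonRDD.scala:

```scala
import java.io.{ByteArrayInputStream, DataInputStream}
import java.nio.ByteBuffer

object NonBlockingEndingMarkCheck {

  // Placeholder standing in for Spark's END_OF_STREAM marker value.
  val END_OF_STREAM = -4

  // Decide whether a Python worker can be re-used, without ever blocking:
  // the 4-byte ending mark is read only if it is already buffered on the
  // stream; otherwise the worker is simply not re-used.
  def workerEndedCleanly(stream: DataInputStream): Boolean = {
    if (stream.available() >= 4) {
      val ending = stream.readInt()
      if (ending == END_OF_STREAM) {
        println("Worker ended cleanly; it can be re-used")             // INFO-level in the discussion above
        true
      } else {
        println(s"Unexpected ending mark $ending; closing the worker") // warning candidate
        false
      }
    } else {
      // The mark was not delivered in time (can legitimately happen after
      // take()). Do not block waiting for it; just skip re-use.
      println("Ending mark not available; closing the worker")
      false
    }
  }

  def main(args: Array[String]): Unit = {
    // A stream that did send the mark vs. one that sent nothing.
    val clean = new DataInputStream(new ByteArrayInputStream(
      ByteBuffer.allocate(4).putInt(END_OF_STREAM).array()))
    val silent = new DataInputStream(new ByteArrayInputStream(new Array[Byte](0)))
    assert(workerEndedCleanly(clean))    // mark present: re-use
    assert(!workerEndedCleanly(silent))  // mark missing: close, but no hang
  }
}
```

The key property is that available() never blocks: if the 4-byte mark has not arrived, the worker is closed instead of re-used, trading occasional worker churn for never hanging on a corrupted or silent stream.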