[SPARK-3463] [PySpark] aggregate and show spilled bytes in Python #2336
QA tests have started for PR 2336 at commit
QA tests have finished for PR 2336 at commit
Do we want to do something similar to what you did in #2338 here, i.e. do it only if this is local mode?
I will rebase it after #2338 is merged.
python/pyspark/worker.py
Outdated
A few lines prior to this, there was a comment:
# CloudPickler needs to be imported so that depicklers are registered using the
# copy_reg module.
If this import is no longer necessary (was it ever?), then we should delete that comment, too.
cloudpickle is already imported by serializers, so the import is not needed here. The comment has been removed.
This looks good to me.
Aggregate the number of bytes spilled to disk during aggregation or sorting, and show them in the Web UI.
This patch is blocked by SPARK-3465 (it includes a fix for that).
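
For readers unfamiliar with the idea, here is a minimal sketch of how a Python worker might tally spilled bytes so they can be reported after a task. It is not the code from this patch: the names `disk_bytes_spilled`, `spill`, and `run_task` are hypothetical, and the sketch assumes a spill is simply a pickled batch written to a temporary file.

```python
import os
import pickle
import tempfile

# Hypothetical module-level counter (name is an assumption, not from the patch).
disk_bytes_spilled = 0


def spill(records, directory):
    """Pickle records to a temp file and add the file size to the counter."""
    global disk_bytes_spilled
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(records, f)
    disk_bytes_spilled += os.path.getsize(path)
    return path


def run_task(batches, directory):
    """Reset the counter, spill every batch, and return the paths and total."""
    global disk_bytes_spilled
    disk_bytes_spilled = 0  # each task starts from zero
    paths = [spill(batch, directory) for batch in batches]
    return paths, disk_bytes_spilled
```

In this sketch the per-task total returned by `run_task` is the value a worker would hand back to the driver side for display; how that hand-off happens is outside what the sketch covers.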