[SPARK-12688][SQL] Fix spill size metric in unsafe external sorter #10634
Changes from all commits
2fd3ab6
81e3227
be67488
416d73d
d1e9f7d
4486071
d689873
4cc0862
    @@ -124,8 +124,12 @@ public UnsafeKVExternalSorter(
               inMemSorter);
     
           // reset the map, so we can re-use it to insert new records. the inMemSorter will not used
    -      // anymore, so the underline array could be used by map again.
    -      map.reset();
    +      // anymore, so the underline array could be used by map again. When the sorter spills, it
    +      // increases zero to the number of in-memory bytes spilled because the records are stored
    +      // in the map instead of the sorter. So we need update the metric after resetting the map
    +      // here.
    +      final long spillSize = map.reset();
    +      taskContext.taskMetrics().incMemoryBytesSpilled(spillSize);
Contributor
According to my understanding of spill metrics, a spill needs to update both

If this does turn out to be the right place for this spill, it would be great to add a code comment explaining the rationale for why this call must be here.
Contributor
Ping @carsonwang, do you plan to update this PR to address my comment?
Contributor
Author
Sorry for the delay, @JoshRosen. I will update this soon.
         }
       }
For SQL aggregation, the `spillSize` here is 0 because the data are stored in a map instead of this sorter, so `incMemoryBytesSpilled(spillSize)` actually increases the metric by 0. We need to update `MemoryBytesSpilled` after freeing the memory in the map.
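
The reasoning in this comment can be illustrated with a small, self-contained sketch. It is not Spark code: `Metrics`, `PageBackedMap`, and `PointerSorter` are hypothetical stand-ins invented for illustration, and the page sizes are arbitrary. The only assumption carried over from the diff is that resetting the map returns the number of bytes it freed. The sketch shows the sorter's spill contributing 0 to the metric, while the value returned by the map's reset contributes the real size.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these are Spark classes.
class SpillMetricSketch {

  /** Toy task-level counter, loosely modeled on a "memory bytes spilled" metric. */
  static final class Metrics {
    private long memoryBytesSpilled = 0L;

    void incMemoryBytesSpilled(long bytes) {
      memoryBytesSpilled += bytes;
    }

    long memoryBytesSpilled() {
      return memoryBytesSpilled;
    }
  }

  /** Toy map that owns the record pages; reset() frees them and reports the freed bytes. */
  static final class PageBackedMap {
    private final List<byte[]> pages = new ArrayList<>();

    void allocatePage(int sizeInBytes) {
      pages.add(new byte[sizeInBytes]);
    }

    long reset() {
      long freed = 0L;
      for (byte[] page : pages) {
        freed += page.length;
      }
      pages.clear();
      return freed;
    }
  }

  /** Toy sorter that only points into the map's pages, so it owns no record memory. */
  static final class PointerSorter {
    long spill() {
      return 0L; // the sorter holds no pages of its own, so it reports 0 bytes
    }
  }

  /** Returns the metric value after the sorter spill and after the map reset. */
  static long[] simulate() {
    Metrics metrics = new Metrics();
    PageBackedMap map = new PageBackedMap();
    PointerSorter sorter = new PointerSorter();

    // Records live in pages owned by the map (64 + 32 = 96 bytes in total).
    map.allocatePage(64);
    map.allocatePage(32);

    // Counting only what the sorter reports adds 0 to the metric.
    metrics.incMemoryBytesSpilled(sorter.spill());
    long afterSorterSpill = metrics.memoryBytesSpilled();

    // Counting the bytes freed by resetting the map adds the real size.
    metrics.incMemoryBytesSpilled(map.reset());
    long afterMapReset = metrics.memoryBytesSpilled();

    return new long[] {afterSorterSpill, afterMapReset};
  }

  public static void main(String[] args) {
    long[] result = simulate();
    System.out.println("after sorter spill: " + result[0]);
    System.out.println("after map reset: " + result[1]);
  }
}
```

In this toy model the counter stays at 0 until the map's freed bytes are added, which mirrors why the patch increments the metric with the value returned by `map.reset()`.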