[SPARK-22987][Core] UnsafeExternalSorter causes OOM when invoking getIterator function.
#20184
Conversation
@liutang123, can you please tell us how to reproduce your issue easily?

Hi @jerryshao, we can reproduce this issue as follows: This will cause OOM.

Thanks, let me try to reproduce it locally.

The code here should be fine for the normal case. The problem is that there are so many spill files, which requires maintaining a buffer for each file handle. Lazy buffer allocation could solve this problem, IIUC. It is not related to the queue or anything else.

I think that lazy buffer allocation cannot thoroughly solve this problem, because UnsafeSorterSpillReader has a BufferedFileInputStream, which will allocate off-heap memory.
Can you please explain more. From my understanding the off heap memory in |
b6a8645 to db138d1 (compare)

Hi @jerryshao, I tried to lazily allocate both the InputStream and the byte array in UnsafeSorterSpillReader.

cc @jiangxb1987
      baseObject = arr;
    }
-   ByteStreams.readFully(in, arr, 0, recordLength);
+   ByteStreams.readFully(getIn(), arr, 0, recordLength);
Is it fine if recordLength is greater than 1024 * 1024?
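For context on this question, here is a minimal sketch of how a reader might cope with a record longer than its initial buffer. It is not the code under review: `GrowableRecordBuffer`, `INITIAL_SIZE`, and `read` are hypothetical names, and the 1 MB starting size is taken only from the figure mentioned in the question. The idea is simply to replace the array with a larger one before reading.

```java
import java.io.DataInputStream;
import java.io.IOException;

/** Hypothetical helper for illustration; the names are not Spark's. */
class GrowableRecordBuffer {
  // Assumed initial size: 1 MB, the figure mentioned in the question.
  static final int INITIAL_SIZE = 1024 * 1024;

  private byte[] arr = new byte[INITIAL_SIZE];

  int capacity() {
    return arr.length;
  }

  /** Reads recordLength bytes, replacing the buffer first if it is too small. */
  byte[] read(DataInputStream in, int recordLength) throws IOException {
    if (recordLength > arr.length) {
      // The old array is dropped and becomes eligible for garbage collection.
      arr = new byte[recordLength];
    }
    in.readFully(arr, 0, recordLength);
    return arr;
  }
}
```

Under these assumptions, a record larger than 1 MB only costs one extra allocation; the buffer never shrinks again, so the reader keeps the larger array for the rest of its lifetime.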

I stumbled across this PR while looking through the open Spark core PRs. It sounds like the problem here is that we don't need to allocate the input stream and read buffer until it's actually time to read the spill, but we're currently doing that too early:
Given this context, lazy initialization makes sense to me. However, this PR is a bit outdated and has some merge conflicts. I would be supportive of this change if the conflicts are resolved and the PR description is updated.

Can one of the admins verify this patch?

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
UnsafeExternalSorter.ChainedIterator maintains a Queue of UnsafeSorterIterator. When the getIterator function of UnsafeExternalSorter is called, UnsafeExternalSorter passes an ArrayList of UnsafeSorterSpillReader to the constructor of ChainedIterator. However, each UnsafeSorterSpillReader maintains a byte array as a buffer, whose capacity is more than 1 MB. When spilling happens frequently, this may cause OOM.
In this PR, I try to allocate the buffer in UnsafeSorterSpillReader lazily.
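The lazy-allocation idea can be illustrated with a minimal, self-contained sketch. This is not the patch itself: `LazySpillReader`, `opener`, `isAllocated`, and `BUFFER_SIZE` are hypothetical names, the 1 MB size is an assumption based on the figure above, and a length-prefixed record format is assumed for simplicity. Only `getIn()` echoes the accessor name that appears in the diff.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.function.Supplier;

/** Hypothetical stand-in for a spill reader; not Spark's UnsafeSorterSpillReader. */
class LazySpillReader {
  // Assumed buffer size of 1 MB, based on the figure in the description.
  static final int BUFFER_SIZE = 1024 * 1024;

  private final Supplier<InputStream> opener; // opens the spill data on demand
  private DataInputStream in;                 // stays null until the first read
  private byte[] arr;                         // stays null until the first read

  LazySpillReader(Supplier<InputStream> opener) {
    // Constructing a reader allocates neither a stream nor a buffer.
    this.opener = opener;
  }

  boolean isAllocated() {
    return arr != null;
  }

  /** Opens the stream and allocates the buffer on first use only. */
  private DataInputStream getIn() {
    if (in == null) {
      in = new DataInputStream(opener.get());
      arr = new byte[BUFFER_SIZE];
    }
    return in;
  }

  /** Reads one length-prefixed record into the buffer and returns its length. */
  int loadNext() throws IOException {
    DataInputStream stream = getIn();
    int recordLength = stream.readInt();
    if (recordLength > arr.length) {
      arr = new byte[recordLength]; // grow for oversized records
    }
    stream.readFully(arr, 0, recordLength);
    return recordLength;
  }

  byte[] buffer() {
    return arr;
  }
}
```

With this shape, a queue holding many readers costs only the reader objects themselves until each one is actually consumed, so the buffer footprint depends on how many readers have been opened rather than on how many spill files exist.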
How was this patch tested?
Existing tests.