-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-21860][core]Improve memory reuse for heap memory in HeapMemoryAllocator
#19077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This and the JIRA need a better title |
srowen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't feel so qualified to review this but it does make sense. In practice, is it common to allocate arrays of size +/- 8 bytes? just wondering if we have any idea of the impact
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Factor out arraySize * 8 as alignedSize or something
f168325 to
6862403
Compare
HeapMemoryAllocator
|
Test build #81208 has finished for PR 19077 at commit
|
|
Test build #81212 has finished for PR 19077 at commit
|
6862403 to
b00b685
Compare
|
Test build #81251 has finished for PR 19077 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, from my understanding the size of MemoryBlock is always the actual size, not the aligned size, so looks like we dont need to reset the size here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, MemoryBlock is always the actual size, if we reuse the previous memory,we should use the actual size to modify the size of MemoryBlock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got it, thanks for the explanation.
|
Can you please add some unit test to verify your changes. |
|
@jerryshao Thanks,i will add unit tests. |
fc8b895 to
ba5717e
Compare
|
Test build #81272 has finished for PR 19077 at commit
|
|
Test build #81273 has finished for PR 19077 at commit
|
|
This PR generally looks fine to me, my concern is that will this change bring in subtle impact on the code which leverage it. CC @JoshRosen to take a review. |
|
Just curious: do you know where are we allocating these close-in-size chunks of memory? I understand the motivation, but just curious to know what's causing this pattern. I think the original idea here was that most allocations would come from a small set of sizes (usually the page size, or a configurable buffer size) and would not generally be arbitrary sized allocations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might be able to use ByteAraryMethods.roundNumberOfBytesToNearestWord for this, which we'e done for similar rounding elsewhere. Makes it a bit easier to spot what's happening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the type of input parameter for roundNumberOfBytesToNearestWord is int
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should make the method to tackle long values.
|
@jerryshao @JoshRosen yes, it would not generally be arbitrary sized allocations. Basically, we allocate memory in multiples of 4 or 8 bytes,even so, I think this change is also beneficial . |
ba5717e to
729df24
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe minor but some small allocations will be counted for pooling mechanism but they are not before, e.g. POOLING_THRESHOLD_BYTES - 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah,I think it's acceptable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is dangerous to reset to a invalid size. We should add a check here or put a WARNING in the method comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks,i will add a check.
729df24 to
0c6647c
Compare
|
Test build #81491 has finished for PR 19077 at commit
|
|
Test build #81492 has finished for PR 19077 at commit
|
|
retest this please |
|
Test build #81508 has finished for PR 19077 at commit
|
|
gentle ping @jerryshao for review |
jerryshao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change itself looks fine to me. However, I'm not sure if there's any potential impact on the code which relies on it, hopes someone could take another look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: why don't we inline this instead of creating a new variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The size of this line is larger than 200 bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should really inline that.
…MemoryAllocator` apache#19077 In HeapMemoryAllocator, when allocating memory from pool, and the key of pool is memory size. Actually some size of memory ,such as 1025bytes,1026bytes,......1032bytes, we can think they are the same,because we allocate memory in multiples of 8 bytes. In this case, we can improve memory reuse.
PR-apache#19077 introduced a Java style error (too long line). Quick fix.
What changes were proposed in this pull request?
In
HeapMemoryAllocator, when allocating memory from pool, and the key of pool is memory size.Actually some size of memory ,such as 1025bytes,1026bytes,......1032bytes, we can think they are the same,because we allocate memory in multiples of 8 bytes.
In this case, we can improve memory reuse.
How was this patch tested?
Existing tests and added unit tests