Stop Copying Every Http Request in Message Handler #44564
* Copying the request is not necessary here. We can simply release it once the response has been generated and save a lot of `Unpooled` allocations that way
* Relates elastic#32228
* I think the issue that prevented that PR from being merged was solved by elastic#39634, which moved the bulk index marker search to ByteBuf bulk access, so the composite buffer shouldn't require many additional bounds checks (I'd argue the bounds checks we add, we save when copying the composite buffer)
* I couldn't necessarily reproduce much of a speedup from this change, but I could reproduce a very measurable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)
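The difference between the two approaches in the description can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `PooledBuffer`, `ReleaseOnResponseSketch`, and `respond` are hypothetical stand-ins for a reference-counted network buffer and a REST handler, and no real Netty or Elasticsearch classes are used.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a pooled, reference-counted network buffer.
final class PooledBuffer {
    private final byte[] bytes;
    private final AtomicInteger refCount = new AtomicInteger(1);

    PooledBuffer(byte[] bytes) {
        this.bytes = bytes;
    }

    byte[] bytes() {
        if (refCount.get() <= 0) {
            throw new IllegalStateException("buffer already released");
        }
        return bytes;
    }

    void release() {
        if (refCount.decrementAndGet() < 0) {
            throw new IllegalStateException("buffer released twice");
        }
    }

    boolean isReleased() {
        return refCount.get() <= 0;
    }
}

final class ReleaseOnResponseSketch {
    // Copying approach: duplicate the body on the heap so the pooled
    // buffer can be released before the request is handled.
    static String handleWithCopy(PooledBuffer pooled) {
        byte[] copy = pooled.bytes().clone(); // extra allocation per request
        pooled.release();
        return respond(copy);
    }

    // Non-copying approach: read the pooled bytes directly and release
    // them only once the response has been generated.
    static String handleWithoutCopy(PooledBuffer pooled) {
        try {
            return respond(pooled.bytes());
        } finally {
            pooled.release();
        }
    }

    // Hypothetical handler that only needs the content while responding.
    static String respond(byte[] content) {
        return "received " + content.length + " bytes";
    }
}
```

Both paths release the buffer exactly once; the second simply defers the release and skips the per-request copy, which is where the reduced allocation pressure would come from.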
|
Pinging @elastic/es-distributed |
|
This PR relies on the fact that all REST actions will be done with the request content by the time they send a response. I'm not sure that this is a safe assumption. |
|
@tbrooks8 it appears there's only a single spot where this doesn't hold: https://github.com/elastic/elasticsearch/pull/44564/files#diff-05216fd3df413bda2df9486bce7a4e29R51
I would also argue that holding on to these bytes without copying is something that shouldn't be happening implicitly anywhere. Otherwise the whole memory behavior of the REST layer becomes pretty unpredictable. In the case of storing the pipeline source it's OK to use the request body for that as a special case, but if we generally allow request contents to be referenced beyond responding to the request, the whole notion of the circuit breaker becomes wrong imo. |
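The special case mentioned above (content that must outlive the response) can be illustrated with an explicit copy. This is only a sketch under assumptions: `RecyclableBuffer`, `RetainedContentSketch`, and `putPipeline` are hypothetical names, and zeroing the array on release merely simulates the pool reusing the memory for another request.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical buffer whose memory is recycled as soon as it is released.
final class RecyclableBuffer {
    final byte[] bytes;

    RecyclableBuffer(byte[] bytes) {
        this.bytes = bytes;
    }

    // Simulates the pool handing the same memory to another request.
    void release() {
        Arrays.fill(bytes, (byte) 0);
    }
}

final class RetainedContentSketch {
    // Hypothetical long-lived store, e.g. for pipeline sources.
    static final Map<String, byte[]> pipelineSources = new HashMap<>();

    // An action that keeps the body beyond the response copies it
    // explicitly, so the retention is visible at the call site.
    static void putPipeline(String id, RecyclableBuffer body) {
        try {
            pipelineSources.put(id, Arrays.copyOf(body.bytes, body.bytes.length));
        } finally {
            body.release(); // safe: the store holds its own copy
        }
    }
}
```

Storing `body.bytes` directly instead of the copy would leave the map pointing at recycled memory once the buffer is released, which is the implicit retention the comment argues against.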
|
Jenkins run elasticsearch-ci/packaging-sample |
server/src/main/java/org/elasticsearch/rest/action/ingest/RestPutPipelineAction.java
|
This probably requires a team discuss. |
|
Sure, added |
|
Putting this back into |
|
Thanks Yannick. I simplified the handling of Not sure about a test for the unpooled buffer case, given the effort required for that. Maybe we could do so (add tricky tests) in a follow-up and use the logic from #44881 that allows for more holistically optimizing the |
ywelsch
left a comment
LGTM. Let's have @tbrooks8 ok this again as well.
|
Link #49699 which shows OOM from the copies avoided here most likely |
|
Ping @tbrooks8 There shouldn't be much left to do here since the changes from the last time you looked at it are minor :) Thanks! |
Tim-Brooks
left a comment
I guess I missed this in the last review cycle, but I don't understand why we completely reverted this for NIO.
I thought you were going in the direction of always copying instead of doing the cast optimization.
I think we should still use the existing buffer when allowsUnsafeBuffers returns true. It's just that we should not attempt the non-copy path when allowsUnsafeBuffers returns false and we think the buffer is unpooled (since we do not reliably know that).
I guess we can bring NIO back in a follow-up, so I'll approve this.
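The rule proposed in this review comment could be sketched like this. Only the name `allowsUnsafeBuffers` comes from the discussion; `ContentStrategySketch`, `Strategy`, `choose`, and the `looksUnpooled` parameter are hypothetical and exist purely for illustration.

```java
final class ContentStrategySketch {
    enum Strategy { USE_EXISTING_BUFFER, COPY }

    // allowsUnsafeBuffers: the handler promises not to reference the
    // content after it has responded.
    // looksUnpooled: a guess about the buffer's origin. It is deliberately
    // ignored because, per the review comment, it cannot be known reliably.
    static Strategy choose(boolean allowsUnsafeBuffers, boolean looksUnpooled) {
        return allowsUnsafeBuffers ? Strategy.USE_EXISTING_BUFFER : Strategy.COPY;
    }
}
```

Under this rule the non-copy path is taken only when the handler has opted in, never because of a guess about how the buffer was allocated.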
|
@tbrooks8 thanks!
🤦♂️ ... I completely misread your comment on the NIO situation back in the initial review ... will open the follow-up tomorrow :) |
|
NIO version in #49819 :) |
`BytesReference` in a way that makes it referenced after the request was responded to. We can simply release it once the response has been generated and save a lot of `Unpooled` allocations that way for many if not most requests. For now, this PR makes it so search and bulk requests are not copied to `Unpooled` buffers.