Handle Bulk Requests on Write Threadpool #40866
Conversation
* Bulk requests can be thousands of items large and take more than O(10ms) to handle, so we should not process them on the transport threadpool, where they block the select loops (sketched below)
* relates elastic#39128
* closes elastic#39658
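For illustration, here is a rough sketch of the hand-off described above, not the actual diff of this PR: the bulk handling is dispatched off the network thread onto the bounded WRITE pool. `dispatchBulk` and `handleBulk` are invented names; `ThreadPool`, `ThreadPool.Names.WRITE`, and `AbstractRunnable` are real Elasticsearch types.

```java
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.common.util.concurrent.AbstractRunnable;
import org.elasticsearch.threadpool.ThreadPool;

class BulkDispatchSketch {
    // Instead of processing the (potentially huge) request inline on the transport thread,
    // hand it off to the bounded WRITE pool and return immediately.
    static void dispatchBulk(ThreadPool threadPool, BulkRequest request, ActionListener<BulkResponse> listener) {
        threadPool.executor(ThreadPool.Names.WRITE).execute(new AbstractRunnable() {
            @Override
            protected void doRun() {
                handleBulk(request, listener); // invented delegate: the per-item work that used to run inline
            }

            @Override
            public void onFailure(Exception e) {
                // If the WRITE queue is full this is an EsRejectedExecutionException,
                // i.e. the caller gets explicit backpressure instead of a stalled select loop.
                listener.onFailure(e);
            }
        });
    }

    static void handleBulk(BulkRequest request, ActionListener<BulkResponse> listener) {
        // placeholder for the actual bulk processing
    }
}
```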
Pinging @elastic/es-distributed
jaymode left a comment
As a workaround, LGTM, but I would not have this PR close the issue; we are just masking the problem. I think the @elastic/es-security team needs to evaluate that issue, investigate what can be done, and ultimately come to a decision. As it stands, nothing will catch or prevent this type of issue from cropping up in a new action, given the way security audits requests.
I'd definitely like for @ywelsch to chime in prior to merging.
@DaveCTurner this might interest you btw :) You mentioned running into issues with logging about long compute times on the IO thread. I wonder if this helps (not sure if you have some reproducer for the issues you're seeing)?
DaveCTurner left a comment
Unfortunately I don't have a reliable reproduction of the issues I occasionally see on my CI machine. But typically I was seeing things blocked for many seconds, and not obviously at times when it would have been processing enormous bulks.
If we are to use a different threadpool, it seems to me that the WRITE threadpool might be more appropriate for this?
I also think that this wants some benchmarking to back up the (IMO reasonable) argument that one more context switch can't possibly hurt here.
Hmm yea that seems unlikely, unless something in the bulk processing here can block for a long time (not sure what and I looked pretty extensively).
The reason I didn't go with the WRITE pool initially was …
This seems like a situation similar to #39286 where the benchmarking team also told us to merge and then look at the nightly performance.
I'd argue against doing this since this action is not actually writing data; the auditing action is writing logs. Ultimately the data writes will occur on the write threadpool with the TransportShardBulkAction. We're just moving off of a network thread here to avoid blocking those on I/O, and I think adding additional heavy I/O to the threads in the write threadpool could lead to unintended consequences.
@jasontedor can you please summarize the thoughts from our FixIt meeting?
I am not in favor of shuttling this kind of work over to the generic thread pool. The generic thread pool is large (it can scale up to at least 128 threads and in some cases 512), and doesn't have a bounded queue. That means there would be effectively no backpressure here. While we have a goal to remove these limitations, we don't today (it's not easy because we can deadlock ourselves today if the generic thread pool is too small), and so I would not want to see the generic thread pool used for high-throughput work like handling bulk requests.
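To make the backpressure argument concrete, here is a small self-contained sketch using plain java.util.concurrent (not Elasticsearch's actual executor construction): a bounded queue rejects excess submissions, which is exactly the signal a client can back off on, while an unbounded queue just keeps absorbing work. All names and sizes here are illustrative.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BackpressureSketch {
    public static void main(String[] args) {
        // Roughly the WRITE-pool shape: fixed threads, bounded queue, submissions rejected when full.
        ThreadPoolExecutor bounded = new ThreadPoolExecutor(4, 4, 0, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(200), new ThreadPoolExecutor.AbortPolicy());

        // Roughly the generic-pool problem: the queue is unbounded, so submissions never fail
        // and the caller gets no signal to slow down.
        ThreadPoolExecutor unbounded = new ThreadPoolExecutor(4, 4, 0, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>());

        int rejected = 0;
        for (int i = 0; i < 10_000; i++) {
            try {
                bounded.execute(BackpressureSketch::simulateBulkHandling);
            } catch (RejectedExecutionException e) {
                rejected++; // explicit backpressure the client can retry or back off on
            }
            unbounded.execute(BackpressureSketch::simulateBulkHandling); // silently queues up
        }
        System.out.println("bounded pool rejected " + rejected + " tasks; unbounded pool queued "
                + unbounded.getQueue().size() + " tasks");
        bounded.shutdownNow();
        unbounded.shutdownNow();
    }

    private static void simulateBulkHandling() {
        try {
            Thread.sleep(10); // stand-in for the O(10ms)+ of work a large bulk request takes
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```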
@jasontedor other than not using the generic threadpool, do you have other thoughts on how you'd like to see this addressed? In the case of a bulk request with security and auditing enabled, there is blocking I/O being done on a network thread.
How about we only use a different thread pool if security and auditing are enabled, and otherwise keep this as-is today? I would even be in favor of a dedicated security- or audit-specific thread pool. This will require adding some hooks.
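Purely as an illustration of the "dedicated audit thread pool" idea (this is not what was implemented), a plugin can register its own bounded pool via Plugin#getExecutorBuilders. The pool name, sizes, and the five-argument FixedExecutorBuilder constructor used here are assumptions about the API of that era.

```java
import java.util.Collections;
import java.util.List;

import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.threadpool.ExecutorBuilder;
import org.elasticsearch.threadpool.FixedExecutorBuilder;

// Hypothetical plugin registering a dedicated, bounded pool for audit work so that
// audit I/O gets its own backpressure instead of competing with indexing on WRITE.
public class AuditPoolSketchPlugin extends Plugin {
    @Override
    public List<ExecutorBuilder<?>> getExecutorBuilders(Settings settings) {
        // Fixed number of threads and a bounded queue; rejected audit work becomes an explicit signal.
        return Collections.singletonList(
                new FixedExecutorBuilder(settings, "security_audit", 4, 1_000, "thread_pool.security_audit"));
    }
}
```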
@jasontedor but wouldn't it make sense, in general, to get the looping through these potentially huge requests off of the IO threads? (with all the ingest and index-creation checks and such, large requests can take quite a bit of time to process)
I don't know, has it proven to be a problem today other than the audit logging case? |
One data point I could give here is that running e.g. Rally's PMC track with default settings results in ~100ms average runtime for handling a bulk request on the transport thread. So for large bulk requests, we're definitely introducing significant latency on the transport threads.
Okay, that is significant and I agree it is worth addressing. As I think about this some more, I have general concerns about the blocking of audit logging; in theory this blocking on the network thread could be problematic for any request. I think we have two problems to solve here then:
* which thread pool the potentially expensive bulk-request handling should be dispatched to, so it gets off the network threads
* the blocking audit logging that can stall a network thread for any request
So let's focus this pull request on the first problem and try to figure out which thread pool we should use. I continue to have strong reservations about using the generic thread pool for the reasons that I gave previously. In the fix-it meeting yesterday I said I would be fine with the write thread pool. I understand @jaymode's concerns about that, but if I understand correctly they relate to the audit logging, which I think we should explore separately.
@jasontedor sounds good :)
Already moved to that this afternoon in 1b05e50 :)
thanks @jasontedor @jaymode @DaveCTurner!
* The test fails for the retry-backoff-enabled case because the retry handler in the bulk processor hadn't been adjusted to account for elastic#40866, which can now lead to an outright rejection of the whole request instead of rejections of its items individually
* Fixed by adding retry functionality for the top-level request as well (see the sketch below)
* Also fixed the duplicate test for the HLRC, which wasn't yet handling the non-backoff case the same way the non-client IT did
* closes elastic#41324
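As a sketch of what "retry on a top-level rejection as well" means (invented class and helper names, not the actual BulkProcessor/Retry code): a retrying listener now has to treat a wholesale EsRejectedExecutionException in onFailure the same way it already treats per-item 429s in onResponse.

```java
import org.elasticsearch.ExceptionsHelper;
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.bulk.BulkItemResponse;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.common.util.concurrent.EsRejectedExecutionException;
import org.elasticsearch.rest.RestStatus;

// Invented listener illustrating the two rejection paths a retrying bulk client now has to handle.
class RetryOnRejectionListener implements ActionListener<BulkResponse> {
    private final BulkRequest request;

    RetryOnRejectionListener(BulkRequest request) {
        this.request = request;
    }

    @Override
    public void onResponse(BulkResponse response) {
        // Pre-existing case: individual items bounced with 429s while the request itself succeeded.
        for (BulkItemResponse item : response.getItems()) {
            if (item.isFailed() && item.getFailure().getStatus() == RestStatus.TOO_MANY_REQUESTS) {
                scheduleRetry(request);
                return;
            }
        }
    }

    @Override
    public void onFailure(Exception e) {
        // New case after #40866: the whole request can bounce off the full WRITE queue.
        if (ExceptionsHelper.unwrapCause(e) instanceof EsRejectedExecutionException) {
            scheduleRetry(request);
        }
    }

    private void scheduleRetry(BulkRequest request) {
        // invented backoff-and-resend hook; a real handler resends only the rejected items
    }
}
```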
* Due to elastic#40866, one of the two parallel bulk requests can randomly be rejected outright when the write queue is already full; we can catch this situation and ignore it, since we still get the rejection for the dynamic mapping update from the other request, and it's somewhat rare to run into this anyway
* Closes elastic#41363
@original-brownbear are there plans to do something similar for the request serialization/deserialization that happens on the network threads?
@Bukhtawar no, there are no plans like that. For the time being, some initial request deserialization and handling, as well as request serialization, happens on the network thread, and there are no plans to change that at this point (though if that's an issue in some configurations, one could reasonably look into doing something like that).
I have a corresponding heap dump which I could share; roughly 1/3 of the heap was transport requests, if that makes sense @original-brownbear. Edit: note the intent here might just be to reject early to protect the cluster, and may not necessarily help with the network threads.
Revisited this when profiling to find the slowest executions on the transport thread and finding that bulk request handling takes quite some time (in tests up to 200ms!) for larger requests. I'm not sure if there's a better alternative to the generic pool, but it seemed like a safe bet?
Marking this as discuss since it's just a suggestion and there could be side effects to this that I'm missing.