Remove artificial default processors limit #20874

Conversation
Today Elasticsearch limits the number of processors used in computing thread counts to 32. This dates from a time when Elasticsearch created more threads than it does now and users would run into out-of-memory errors. It appears the real cause of these out-of-memory errors was not well understood (it's often due to ulimit settings), and so users were left hitting them on boxes with high core counts. Today Elasticsearch creates fewer threads (but still a lot) and we have a bootstrap check in place to ensure that the relevant ulimit is not too low. There are still some caveats to having too many concurrent indexing threads, as it can lead to too many little segments, and it's not a magical go-faster knob if indexing is already bottlenecked by disk, but this limitation is artificial and surprising to users and so it should be removed.
This commit increases the lower bound of the max processes ulimit to prepare for a world where Elasticsearch instances might be running with more than the previous cap of 32 processors. With the current settings, Elasticsearch wants to create roughly 576 + 25 * p / 2 threads, where p is the number of processors. Add in roughly 7 * p / 8 threads for the GC threads and a fudge factor, and 4096 should cover us pretty well up to 256 cores.
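As a sanity check on that arithmetic, here is a small sketch (the function name is mine, not from the codebase) that evaluates the estimate from the description:

```python
def estimated_threads(p: int) -> int:
    """Rough thread demand per the PR description: ~576 fixed threads,
    plus 25 * p / 2 for the thread pools, plus ~7 * p / 8 for GC threads
    and a fudge factor, where p is the number of processors."""
    return 576 + (25 * p) // 2 + (7 * p) // 8

# From the old cap of 32 processors up to 256 cores:
for p in (32, 64, 128, 256):
    print(p, estimated_threads(p))
# At p = 256 the estimate is 4000, comfortably under the proposed 4096 floor.
```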
I've targeted this for 5.1.0, but I think that da92de7 should be backported to 5.0.0 or users will have a rude awakening when they upgrade to 5.1.0 if they've already adjusted their ulimit to 2048.

@mikemccand Are you interested in reviewing?

Tracing through history, it seems the original issue was about the transport client. Maybe it makes sense to set a bounded default in the client alone?

LGTM, thanks @jasontedor!
jpountz left a comment:

LGTM
How bad would it be to only push this commit to 6.0?

It just means another year of living with this limitation.

I'm not sure I was clear, but I was commenting specifically about the change to the BootstrapCheck. Said otherwise: how about removing the 32 limit in 5.1 and requiring an increase to the ulimit only in 6.0? We could potentially revisit the idea to better share thread pools in 5.1 to make the number of threads less of an issue.
I think that at this stage (the RC is out), we need to ask a different question: does it really need to go into 5.0? I think if we're honest, the answer is no.
@jpountz It definitely was not clear; I spoke with @s1monw via another channel and I think that he interpreted it the same way that I was interpreting it (to keep the limitation until 6.0.0).

It's worth considering.

@bleskes It's not clear what you're commenting on; the processor limitation removal is not targeted for 5.0.0, just the bootstrap check change, as otherwise users might have to set the ulimit twice: once on upgrade to 5.0.0, and again on upgrade to 5.1.0.
What about this: we wait with bootstrap checks until 6.0 and make the hard limit a soft limit in 5.1. Then users that have such a high CPU count can set it higher, but it's bounded by default. To support users, we can also print a warning message that they should consider raising this limit. I am not a fan of changing something like this in a minor release.
I see. Then it's good in that respect. I'm still hesitant about changing the ulimit as well - I'm not sure bootstrap checks should make everyone account for the largest use case possible. One other idea I had is to take
@bleskes Before opening this PR, I did toy with exactly this idea. It's fine, it works, but I opted for the simplest approach because the default on most systems is already quite large (total number of pages / 64); any x86 system with more than 1 GB of memory will have its defaults above the limit enforced here.
All good then!
It's a soft limit today. We discussed this during Fix-it-Friday and agreed to push this to 6.0.0 only since there is a workaround for the limit in the 5.x series already (see #20895). |
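(Editor's note: if I recall correctly, the 5.x workaround referenced above in #20895 is the `processors` setting, which caps the processor count Elasticsearch uses when sizing thread pools. A hedged sketch, with the setting name assumed from memory rather than re-verified against that PR:)

```yaml
# elasticsearch.yml -- cap the processor count used to size thread pools
processors: 32
```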
This commit adjusts the expectation for the max number of threads in the scaling thread pool configuration test. The expectation was incorrect because we removed the limitation that the number of processors maxes out at 32, instead letting it be the true number of logical processors on the machine. When we removed this limitation, this test was never adjusted to reflect the new reality, yet the discrepancy never arose since our tests were not running on machines with incredibly high core counts. Relates #20874
Today Elasticsearch limits the number of processors used in computing
thread counts to 32. This was from a time when Elasticsearch created
more threads than it does now and users would run into out of memory
errors. It appears the real cause of these out of memory errors was not
well understood (it's often due to ulimit settings) and so users were
left hitting these out of memory errors on boxes with high core
counts. Today Elasticsearch creates fewer threads (but still a lot) and
we have a bootstrap check in place to ensure that the relevant ulimit is
not too low.
There are some caveats still to having too many concurrent indexing
threads as it can lead to too many little segments, and it's not a
magical go faster knob if indexing is already bottlenecked by disk, but
this limitation is artificial and surprising to users and so it should
be removed.
Closes #20828
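For operators following along: the bootstrap check discussed above concerns the max user processes ulimit, which threads count against on Linux. A minimal sketch of inspecting it, with the 4096 figure taken from the discussion above:

```shell
# Print the current soft limit on user processes (threads count against it)
ulimit -u

# To raise the soft limit for this shell before starting Elasticsearch
# (requires the hard limit to be at least this high):
#   ulimit -u 4096
```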