-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Remove node-level canAllocate override #59389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove node-level canAllocate override #59389
Conversation
Today there is a node-level `canAllocate` override which the balancer uses to ignore certain nodes to which it is certain no more shards can be allocated. In fact this override only ignores nodes which have hit the rarely-used `cluster.routing.allocation.total_shards_per_node` limit, so this optimization doesn't have a meaningful impact on real clusters. This commit removes this unnecessary fast path from the balancer, and also removes all the machinery needed to support it.
|
Pinging @elastic/es-distributed (:Distributed/Allocation) |
|
@zuketo do we have any data that can show how rarely |
|
I couldn't find any good data sources for this (other than adding to telemetry). This setting is also not whitelisted by cloud, so no data points there. Could we deprecate first and then look at removal? |
|
Thanks for confirming, Jason. TBC we're not talking about removing the setting itself, only the 100 lines of code that treats this setting as a special case in the shard allocator. Let's see what the team discussion brings. |
|
Absent a better source of data I looked through as many user interactions as I could find and only encountered 7 mentions of this setting in the last 90 days, and I don't think any of them would have benefitted from this optimisation. We discussed this today as well and agreed to proceed. |
ywelsch
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Today there is a node-level `canAllocate` override which the balancer uses to ignore certain nodes to which it is certain no more shards can be allocated. In fact this override only ignores nodes which have hit the rarely-used `cluster.routing.allocation.total_shards_per_node` limit, so this optimization doesn't have a meaningful impact on real clusters. This commit removes this unnecessary fast path from the balancer, and also removes all the machinery needed to support it.
Today there is a node-level
canAllocateoverride which the balanceruses to ignore certain nodes to which it is certain no more shards can
be allocated. In fact this override only ignores nodes which have hit
the rarely-used
cluster.routing.allocation.total_shards_per_nodelimit, so this optimization doesn't have a meaningful impact on real
clusters.
This commit removes this unnecessary fast path from the balancer, and
also removes all the machinery needed to support it.