Control Cluster Shards Balancing #42739
Pinging @elastic/es-distributed
ywelsch
left a comment
Thank you for your interest in contributing to ES. Unfortunately, I don't understand what this change is trying to achieve. Can you provide some more explanation? In particular, why does it remove the code that allows index-level settings to override the cluster-level ones?
If any index is allowed to rebalance, the check "if (allocation.deciders().canRebalance(allocation).type() != Type.YES)" evaluates to false and execution continues past it. The balanceByWeights() method then traverses all indices and rebalances them by weight, so setting cluster.routing.rebalance.enable: "none" has no effect.
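The control flow described in the comment above can be sketched as a toy model. This is not the actual allocator code: `RebalanceGateSketch`, its map-based input, and the simplified `canRebalance` and `balanceByWeights` stand-ins are all hypothetical, and the sketch only reproduces the claim as stated (one permitted index opens the gate, after which every index is visited).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of the flow described in the comment; all names are illustrative assumptions.
final class RebalanceGateSketch {
    enum Type { YES, NO }

    // Hypothetical cluster-wide gate: YES as soon as any single index permits rebalancing.
    static Type canRebalance(Map<String, Boolean> indexAllowsRebalance) {
        return indexAllowsRebalance.containsValue(true) ? Type.YES : Type.NO;
    }

    // Hypothetical stand-in for balanceByWeights(): visits every index,
    // not only the ones that permit rebalancing.
    static List<String> balanceByWeights(Map<String, Boolean> indexAllowsRebalance) {
        return new ArrayList<>(indexAllowsRebalance.keySet());
    }

    // Returns the indices that end up being considered for rebalancing.
    static List<String> balance(Map<String, Boolean> indexAllowsRebalance) {
        if (canRebalance(indexAllowsRebalance) != Type.YES) {
            return new ArrayList<>(); // gate closed: nothing is visited
        }
        return balanceByWeights(indexAllowsRebalance);
    }

    public static void main(String[] args) {
        Map<String, Boolean> indices = new LinkedHashMap<>();
        indices.put("logs", false);   // rebalancing not permitted for this index
        indices.put("metrics", true); // rebalancing permitted for this index
        System.out.println(balance(indices)); // prints [logs, metrics]
    }
}
```

In this model, the index that does not permit rebalancing is still visited once the gate opens, which is the effect the comment describes. Any per-shard checks that might apply after the gate are deliberately left out of the sketch.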
ywelsch
left a comment
I'm not sure I understand what the purpose of this PR is, especially as it's disabling existing ES functionality. Is it to address a scaling issue? You mention a cluster with 300,000 shards on another PR, so I assume this is related. Why is this cluster making use of index-level rebalancing (i.e. the index.routing.rebalance.enable setting) if that is turning out to be a scaling bottleneck?
return allocation.decision(Decision.NO, NAME, "none rebalance are not allowed");
case PRIMARIES:
if (allocation.routingNodes().hasInactivePrimaries()) {
return allocation.decision(Decision.NO, NAME,
Why does this globally disable rebalancing when there are some inactive primaries of an unrelated index?
return allocation.decision(Decision.YES, NAME, "all primary shards is active and rebalance are allowed");
case REPLICAS:
if (allocation.routingNodes().hasInactiveShards()) {
return allocation.decision(Decision.NO, NAME,
Why does this globally disable rebalancing when there are inactive shards of some unrelated index?
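The scoping concern raised in the two review comments above can be illustrated with a small model. It is only a sketch under stated assumptions: `InactiveShardScopeSketch`, the map of per-index shard flags, and both helper methods are hypothetical and do not correspond to real Elasticsearch classes.

```java
import java.util.List;
import java.util.Map;

// Toy model: index name -> one flag per primary shard (true = active).
// All names here are illustrative assumptions.
final class InactiveShardScopeSketch {

    // Cluster-wide check, in the spirit of the diff: a single inactive primary
    // anywhere blocks rebalancing for every index.
    static boolean globalAllows(Map<String, List<Boolean>> primaries) {
        return primaries.values().stream().noneMatch(flags -> flags.contains(false));
    }

    // Index-scoped alternative: only the index's own primaries are considered.
    static boolean scopedAllows(Map<String, List<Boolean>> primaries, String index) {
        return !primaries.getOrDefault(index, List.of()).contains(false);
    }

    public static void main(String[] args) {
        Map<String, List<Boolean>> primaries = Map.of(
            "healthy", List.of(true, true),
            "recovering", List.of(true, false));
        System.out.println(globalAllows(primaries));               // prints false
        System.out.println(scopedAllows(primaries, "healthy"));    // prints true
        System.out.println(scopedAllows(primaries, "recovering")); // prints false
    }
}
```

With the cluster-wide check, the "healthy" index cannot rebalance because of an unrelated index; with the scoped check it can.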
case ALL:
return allocation.decision(Decision.YES, NAME, "all rebalance are allowed");
case NONE:
return allocation.decision(Decision.NO, NAME, "none rebalance are not allowed");
As said earlier, this changes the behavior of ES not to take the index-level property into account anymore when the cluster-level property is set.
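The precedence this comment refers to, where an index-level value takes priority over the cluster-level one, could be modeled roughly as follows. This is a minimal sketch, not the decider's real code: `RebalanceSettingPrecedenceSketch` and its `effective` helper are hypothetical, and only the enum constants mirror the cases visible in the diff.

```java
import java.util.Optional;

// Toy model of setting precedence; class and method names are illustrative assumptions.
final class RebalanceSettingPrecedenceSketch {
    enum Rebalance { ALL, PRIMARIES, REPLICAS, NONE }

    // The index-level value, when explicitly set, wins over the cluster-level value.
    static Rebalance effective(Optional<Rebalance> indexLevel, Rebalance clusterLevel) {
        return indexLevel.orElse(clusterLevel);
    }

    public static void main(String[] args) {
        // Cluster-level says NONE, but the index explicitly opts in.
        System.out.println(effective(Optional.of(Rebalance.ALL), Rebalance.NONE)); // prints ALL
        // No index-level value: the cluster-level value applies.
        System.out.println(effective(Optional.empty(), Rebalance.NONE));           // prints NONE
    }
}
```

A change that consults only the cluster-level value would return NONE in both cases, which is the behavior change the comment objects to.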
The previous implementation was well thought out, but allowing a single index to rebalance causes all indices to be rebalanced together.