Open
Labels
:Distributed Coordination/Allocation: All issues relating to the decision making around placing a shard (both master logic & on the nodes)
>enhancement
Team:Distributed (Obsolete): Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.
Description
The cluster.routing.allocation.cluster_concurrent_rebalance setting limits the number of shards that can be rebalanced simultaneously. The default value is 2, which is reasonable for a small number of shards but becomes a bottleneck for bigger clusters (10+ nodes).
Since the new desired balance shard allocator is not affected by #87279 (effectively resolved by #93977), I believe we should change the default to allow big clusters to rebalance quicker.
The new default could be set to:
- 10 (or any other higher arbitrary number). This will not resolve the issue completely but will move the bottleneck a little further.
- Make it dependent on the cluster size (for example, allow 1 concurrent rebalance per every 2 nodes in the cluster, or introduce a new setting such as cluster.routing.allocation.node_concurrent_recoveries_per_node). This approach would scale the limit with the cluster size.
- -1 (or unlimited). This way the bottleneck would be defined by the number of incoming/outgoing recoveries a node can sustain: cluster.routing.allocation.node_concurrent_incoming_recoveries/cluster.routing.allocation.node_concurrent_outgoing_recoveries. This is the most aggressive option, and it may delay necessary shard movements (such as hot->warm tier migration) because of already ongoing rebalances.
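To make the three options easier to compare, here is a rough Java sketch of how each one could translate into an effective limit. It is illustrative only: the class and method names are hypothetical and do not exist in Elasticsearch, the 1-per-2-nodes ratio is just the example from this issue, and flooring the scaled limit at 1 is my own assumption so that very small clusters can still rebalance.

```java
// Hypothetical sketch: none of these names exist in Elasticsearch.
final class RebalanceLimitSketch {

    /** Sentinel for "no cluster-wide limit" (option 3 in this issue). */
    static final int UNLIMITED = -1;

    /** Option 1: a fixed, higher default (10 is the example value from this issue). */
    static int fixedLimit() {
        return 10;
    }

    /**
     * Option 2: scale with cluster size, e.g. one concurrent rebalance
     * per nodesPerRebalance nodes. The floor of 1 is an assumption.
     */
    static int scaledLimit(int nodeCount, int nodesPerRebalance) {
        if (nodeCount < 1 || nodesPerRebalance < 1) {
            throw new IllegalArgumentException("both arguments must be >= 1");
        }
        return Math.max(1, nodeCount / nodesPerRebalance);
    }

    /**
     * Whether one more rebalance may start under the given cluster-wide limit.
     * With UNLIMITED, only the per-node recovery limits would throttle movement
     * (those are not modelled here).
     */
    static boolean canStartRebalance(int limit, int ongoingRebalances) {
        return limit == UNLIMITED || ongoingRebalances < limit;
    }

    public static void main(String[] args) {
        for (int nodes : new int[] {3, 10, 50}) {
            System.out.println(nodes + " nodes -> scaled limit " + scaledLimit(nodes, 2));
        }
    }
}
```

With the 1-per-2-nodes ratio, a 10-node cluster would get a limit of 5 and a 50-node cluster a limit of 25, whereas the current default stays at 2 regardless of cluster size.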