Description
#18467 was intended to limit how many times shard allocation is attempted before giving up, instead of retrying indefinitely. For example, if an analyzer requires a local synonyms file that is missing, we should stop trying until the situation is resolved.
Unfortunately, it doesn't work for unassigned primary shards because of the way allocation works today. For instance, a primary shard might be sitting on a disk that is over the high watermark, or it may have allocation filtering that prevents it from being allocated to the node where it already exists. Today, we just go ahead and try to assign the primary regardless of the allocation deciders (which means that we also don't limit retries to 5).
Instead, we could add extra logic to the appropriate deciders to say "always return YES if the shard in question is a primary which already exists on this node". The decision returned would be YES, but the explanation provided by the reroute or cluster allocation explain APIs could include the reason this decider was ignored.
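
A rough sketch of what this could look like for a disk-watermark style decider. Everything here (`PrimaryOverrideSketch`, `Shard`, `Node`, `Decision`, `canAllocate`) is a hypothetical stand-in invented for illustration, not the actual Elasticsearch decider API; it only shows the "YES, but explain why the decider was ignored" idea.

```java
import java.util.Set;

// Hypothetical, simplified model; not the real Elasticsearch classes.
public class PrimaryOverrideSketch {

    public enum Type { YES, NO }

    // A decision carries both the verdict and a human-readable explanation.
    public static final class Decision {
        public final Type type;
        public final String explanation;

        public Decision(Type type, String explanation) {
            this.type = type;
            this.explanation = explanation;
        }
    }

    public static final class Shard {
        public final String id;
        public final boolean primary;

        public Shard(String id, boolean primary) {
            this.id = id;
            this.primary = primary;
        }
    }

    public static final class Node {
        public final String name;
        public final Set<String> localShardIds; // shard copies already on this node's disk
        public final boolean overHighWatermark;

        public Node(String name, Set<String> localShardIds, boolean overHighWatermark) {
            this.name = name;
            this.localShardIds = localShardIds;
            this.overHighWatermark = overHighWatermark;
        }
    }

    // Disk-watermark style check with the proposed primary override.
    public static Decision canAllocate(Shard shard, Node node) {
        if (!node.overHighWatermark) {
            return new Decision(Type.YES, "disk usage is below the high watermark");
        }
        // Proposed extra logic: a primary that already exists on this node is
        // always allowed, and the explanation records that the decider was ignored.
        if (shard.primary && node.localShardIds.contains(shard.id)) {
            return new Decision(Type.YES,
                "decider ignored: node is over the high watermark, "
                    + "but the primary already exists on this node");
        }
        return new Decision(Type.NO, "disk usage is over the high watermark");
    }
}
```

With the override living inside the decider, primary allocation would no longer need to bypass the deciders altogether, so the retry limit from #18467 could apply to primaries as well.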
Related to #18321