Closed
Labels
:Distributed Coordination/Allocation (all issues relating to the decision making around placing a shard, both master logic and on the nodes), >enhancement, high hanging fruit
Description
When you have an index with index.auto_expand_replicas=0-all running on 3 nodes and you bring down one node, the master reduces the number of replicas from 2 to 1. When the node that went down comes back up, the following happens:
- Elasticsearch on that node starts up, notices that the number of replicas for that index is 1, and promptly drops its own data as redundant.
- The master notices that it has a new node in the cluster and sets the number of replicas to 2.
- The node that just dropped its data has that same data re-synced to it.
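The replica counts in this sequence follow from the 0-all range: one copy is the primary, so at most (nodes - 1) replicas can be placed. A minimal sketch of that arithmetic, assuming a simple clamp; `expanded_replicas` is a hypothetical helper for illustration, not Elasticsearch's actual code:

```python
def expanded_replicas(setting, data_nodes):
    """Replica count implied by an auto_expand_replicas range such as '0-all'.

    Hypothetical helper: assumes the count is (data_nodes - 1) clamped to the range.
    """
    low, high = setting.split("-")
    lower = int(low)
    # 'all' means "as many replicas as there are other data nodes".
    upper = data_nodes - 1 if high == "all" else int(high)
    # One copy is the primary, so at most data_nodes - 1 replicas can be placed.
    return max(lower, min(upper, data_nodes - 1))


print(expanded_replicas("0-all", 3))  # all three nodes up
print(expanded_replicas("0-all", 2))  # one node down
```

With three nodes the range expands to 2 replicas, and with two nodes to 1, which is the window in which the returning node sees its copy as redundant.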
Instead, Elasticsearch should:
- Start up and wait for the master to adjust the number of replicas, if needed.
- Only after that's done, drop anything that is still redundant.
- Not re-sync any data, since the node didn't drop it in the brief interim while the master was adjusting the number of replicas from 1 to 2.
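The difference between the two orderings can be sketched as a toy model. This is an illustration of the event sequence described in this issue only; `rejoin` and its event strings are hypothetical and do not correspond to any Elasticsearch API:

```python
def rejoin(wait_for_master):
    """Ordered events when the third node rejoins the cluster (toy model)."""
    events = []
    has_local_copy = True  # the node still has its shard data on disk

    if not wait_for_master:
        # Current behaviour: replicas is still 1, so the local copy looks redundant.
        events.append("node drops local shard copy")
        has_local_copy = False

    # The master notices the new node and expands replicas from 1 to 2.
    events.append("master sets replicas to 2")

    if has_local_copy:
        events.append("node reuses local shard copy")
    else:
        events.append("shard data re-synced to node")
    return events


print(rejoin(wait_for_master=False))
print(rejoin(wait_for_master=True))
```

In this model, waiting for the master removes both the drop and the re-sync. Whether the on-disk copy can actually be reused depends on it still being in sync, which the sketch assumes.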
This would improve recovery time in setups where a relatively small index is kept on all the nodes for capacity reasons and a newly started node should serve search requests right away.