Closed
Labels
:Data Management/ILM+SLM (Index and Snapshot lifecycle management), >bug
Description
Consider the following scenario:
An index with at least 1 replica is just about to start its Shrink step, so it does the following:
1. sets the index to read-only
2. sets the index to be allocated only on `node_id:123XYZ`
3. waits for a copy of each shard on `node_id:123XYZ`
4. performs the shrink step
5. etc
If the user restarts the cluster after step 2 has completed but before step 3 is done, then once the cluster comes back up the replicas for the index cannot be allocated, because of the `_id` allocation filtering applied in step 2. As a result, step 3 never passes, due to the check at:
Lines 56 to 60 in ec53288
```java
if (ActiveShardCount.ALL.enoughShardsActive(clusterState, index.getName()) == false) {
    logger.debug("[{}] shrink action for [{}] cannot make progress because not all shards are active",
        getKey().getAction(), index.getName());
    return new Result(false, new CheckShrinkReadyStep.Info("", expectedShardCount, -1));
}
```
The index then stays in this step indefinitely, reporting:
```
"test-000039" : {
  "step" : "check-shrink-allocation",
  "step_time" : "2018-11-06T22:54:39.805Z",
  "step_time_millis" : 1541544879805,
  "step_info" : {
    "message" : "Waiting for all shards to become active",
    "node_id" : "",
    "shards_left_to_allocate" : -1,
    "expected_shards" : 2
  }
},
```

Since shrink does not require all copies of the shard to be active, we should remove this check.
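To make the difference concrete, here is a minimal sketch contrasting the strict precondition with a relaxed one that only asks for one active copy of each shard on the target node. It is not the actual Elasticsearch code: `ShrinkReadySketch`, `ShardCopy`, `allCopiesActive`, and `oneCopyPerShardOnNode` are hypothetical names, and the cluster state is reduced to a plain list of shard copies.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model for illustration; these are not Elasticsearch classes.
final class ShrinkReadySketch {

    // One copy (primary or replica) of a shard. nodeId is null when the copy is unassigned.
    static final class ShardCopy {
        final int shardId;
        final String nodeId;
        final boolean active;

        ShardCopy(int shardId, String nodeId, boolean active) {
            this.shardId = shardId;
            this.nodeId = nodeId;
            this.active = active;
        }
    }

    // Strict precondition, analogous to the quoted check: every copy must be active.
    static boolean allCopiesActive(List<ShardCopy> copies) {
        return copies.stream().allMatch(c -> c.active);
    }

    // Relaxed condition: one active copy of each shard on the target node is enough.
    static boolean oneCopyPerShardOnNode(List<ShardCopy> copies, String targetNode, int numberOfShards) {
        Set<Integer> shardsOnTarget = new HashSet<>();
        for (ShardCopy c : copies) {
            if (c.active && targetNode.equals(c.nodeId)) {
                shardsOnTarget.add(c.shardId);
            }
        }
        return shardsOnTarget.size() == numberOfShards;
    }
}
```

For the scenario above (two shards, both primaries active on `123XYZ`, both replicas unassigned after the restart), `allCopiesActive` returns false while `oneCopyPerShardOnNode(copies, "123XYZ", 2)` returns true, which is why dropping the strict check would let the shrink proceed.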