Waiting for all shards to be active after a cluster restart may never be possible for a shrink step #35321

Consider the following scenario:

An index with at least 1 replica is just about to start its Shrink step, so it does the following:

1. sets the index to read-only
2. sets the index to be allocated only on `node_id:123XYZ`
3. waits for a copy of each shard on `node_id:123XYZ`
4. performs the shrink step
5. etc

If the user restarts the cluster after step 2 completes but before step 3 finishes, then when the cluster comes back up the replicas for the index cannot be allocated. The `_id` filter set in step 2 restricts the index to a single node, and a replica cannot be allocated to the same node as its primary. As a result, step 3 never passes, because of the check at:

https://github.com/elastic/elasticsearch/blob/ec53288fc0c94c4f514b24b2230671c7ec0316ed/x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/indexlifecycle/CheckShrinkReadyStep.java#L56-L60

The index then stays in this step indefinitely, reporting the following step info:

```json
    "test-000039" : {
      "step" : "check-shrink-allocation",
      "step_time" : "2018-11-06T22:54:39.805Z",
      "step_time_millis" : 1541544879805,
      "step_info" : {
        "message" : "Waiting for all shards to become active",
        "node_id" : "",
        "shards_left_to_allocate" : -1,
        "expected_shards" : 2
      }
    },
```
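To make the stuck state concrete, here is a minimal, self-contained simulation. It is not Elasticsearch code: `ShardCopy`, `allocate`, and `allCopiesActive` are hypothetical names invented for this sketch, and the second node ID `456ABC` is made up. It assumes one shard with one primary and one replica, an `_id` filter that allows only `123XYZ`, and the rule that two copies of the same shard cannot share a node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ShrinkDeadlockSketch {
    /** One copy of a shard; nodeId stays null while the copy is unassigned. */
    static final class ShardCopy {
        final int shardId;
        final boolean primary;
        String nodeId;

        ShardCopy(int shardId, boolean primary) {
            this.shardId = shardId;
            this.primary = primary;
        }

        boolean isActive() {
            return nodeId != null;
        }
    }

    /** Assigns every copy it can, honoring the _id filter and the same-shard rule. */
    static void allocate(List<ShardCopy> copies, List<String> nodes, String requiredNodeId) {
        for (ShardCopy copy : copies) {
            for (String node : nodes) {
                // The filter from step 2 permits exactly one node.
                boolean allowedByFilter = node.equals(requiredNodeId);
                // Another copy of the same shard already lives on this node.
                boolean sameShardOnNode = copies.stream()
                    .anyMatch(c -> c != copy && c.shardId == copy.shardId && node.equals(c.nodeId));
                if (allowedByFilter && !sameShardOnNode) {
                    copy.nodeId = node;
                    break;
                }
            }
        }
    }

    /** Stand-in for an "all copies active" condition. */
    static boolean allCopiesActive(List<ShardCopy> copies) {
        return copies.stream().allMatch(ShardCopy::isActive);
    }

    public static void main(String[] args) {
        List<ShardCopy> copies = new ArrayList<>();
        copies.add(new ShardCopy(0, true));   // primary
        copies.add(new ShardCopy(0, false));  // replica
        allocate(copies, Arrays.asList("123XYZ", "456ABC"), "123XYZ");
        for (ShardCopy copy : copies) {
            System.out.println((copy.primary ? "primary" : "replica") + " on node: " + copy.nodeId);
        }
        System.out.println("all copies active: " + allCopiesActive(copies));
    }
}
```

Re-running `allocate` any number of times leaves the replica unassigned, which is why a condition that requires every copy to be active cannot be met while the filter is in place.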

Since shrink does not require all copies of each shard to be active, we should remove this check:

	if (ActiveShardCount.ALL.enoughShardsActive(clusterState, index.getName()) == false) {
	    logger.debug("[{}] shrink action for [{}] cannot make progress because not all shards are active",
	        getKey().getAction(), index.getName());
	    return new Result(false, new CheckShrinkReadyStep.Info("", expectedShardCount, -1));
	}
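One way to picture the relaxed condition is sketched below. This is an illustration under stated assumptions, not the actual `CheckShrinkReadyStep` logic: `ShardCopy`, `shardsLeftToAllocate`, and `readyToShrink` are hypothetical names, and the sketch assumes that the only requirement is one active copy of each shard on the target node.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RelaxedShrinkCheckSketch {
    /** One copy of a shard; nodeId is null when the copy is unassigned. */
    static final class ShardCopy {
        final int shardId;
        final String nodeId;
        final boolean active;

        ShardCopy(int shardId, String nodeId, boolean active) {
            this.shardId = shardId;
            this.nodeId = nodeId;
            this.active = active;
        }
    }

    /** Number of shards that still lack an active copy on the target node. */
    static int shardsLeftToAllocate(List<ShardCopy> copies, String targetNodeId, int expectedShards) {
        Set<Integer> shardsOnTarget = new HashSet<>();
        for (ShardCopy copy : copies) {
            if (copy.active && targetNodeId.equals(copy.nodeId)) {
                shardsOnTarget.add(copy.shardId);
            }
        }
        return expectedShards - shardsOnTarget.size();
    }

    /** Ready once every shard has an active copy on the target node; other copies are ignored. */
    static boolean readyToShrink(List<ShardCopy> copies, String targetNodeId, int expectedShards) {
        return shardsLeftToAllocate(copies, targetNodeId, expectedShards) == 0;
    }

    public static void main(String[] args) {
        // Two shards: both primaries are on the target node, both replicas are unassigned.
        List<ShardCopy> copies = Arrays.asList(
            new ShardCopy(0, "123XYZ", true),
            new ShardCopy(0, null, false),
            new ShardCopy(1, "123XYZ", true),
            new ShardCopy(1, null, false));
        System.out.println("ready to shrink: " + readyToShrink(copies, "123XYZ", 2));
    }
}
```

Under this condition, unassigned replicas no longer block progress, and the count of shards still missing from the target node gives a more informative value than `-1` for `shards_left_to_allocate`.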
