Cluster state delay can cause endless index request loop #12573

When a primary is relocating from `node_1` to `node_2`, there can be a short window in which the old primary has already been removed from `node_1` (closed, not deleted) while the new primary on `node_2` is still in `POST_RECOVERY`. During this window, indexing requests can be forwarded back and forth between `node_1` and `node_2` endlessly.

Course of events: 
1. primary (`[index][0]`) relocates from `node_1` to `node_2`
2. `node_2` is done recovering, moves its shard to `IndexShardState.POST_RECOVERY` and sends a message to master that the shard is `ShardRoutingState.STARTED` 
   
   ```
   Cluster state 1: 
   node_1: [index][0] RELOCATING (ShardRoutingState), (STARTED from IndexShardState perspective on node_1) 
   node_2: [index][0] INITIALIZING (ShardRoutingState), (at this point already POST_RECOVERY from IndexShardState perspective on node_2) 
   ```
3. master receives the shard started message and updates the cluster state to:
   
   ```
   Cluster state 2: 
   node_1: [index][0] no shard 
   node_2: [index][0] STARTED (ShardRoutingState), (at this point still in POST_RECOVERY from IndexShardState perspective on node_2) 
   ```
   
   master sends this to `node_1` and `node_2`
4. `node_1` receives the new cluster state and removes its shard because it is not allocated on `node_1` anymore 
5. index a document 

At this point `node_1` has already applied cluster state 2 and no longer has the shard, so it forwards the request to `node_2`. But `node_2` is behind on cluster state processing: it is still on cluster state 1, has the shard in `IndexShardState.POST_RECOVERY`, and thinks `node_1` holds the primary, so it sends the request back to `node_1`. This goes on until either `node_2` catches up and processes cluster state 2, or both nodes run out of memory (OOM).
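To make the ping-pong concrete, here is a minimal, self-contained sketch of the routing decision described above. It is not Elasticsearch code: `RoutingLoopSketch`, `Node`, `simulate`, `catchUpAfter` and `maxForwards` are hypothetical names, and the model assumes each node simply forwards the request to whichever node its locally applied cluster state names as the primary.

```java
import java.util.HashMap;
import java.util.Map;

public class RoutingLoopSketch {

    /** A node's local view, reduced to the cluster state version it has applied. */
    static final class Node {
        final String name;
        int appliedClusterState; // 1 or 2, matching the states listed in the issue

        Node(String name, int appliedClusterState) {
            this.name = name;
            this.appliedClusterState = appliedClusterState;
        }

        /** Where this node believes the primary of [index][0] lives. */
        String primaryNode() {
            // Cluster state 1: primary still on node_1. Cluster state 2: primary on node_2.
            return appliedClusterState == 1 ? "node_1" : "node_2";
        }
    }

    /**
     * Routes one index request that first arrives at node_1.
     * node_2 applies cluster state 2 once it has already received the request
     * catchUpAfter times. Returns the number of forwards before the request is
     * executed on the primary, or -1 if maxForwards is reached first.
     */
    static int simulate(int catchUpAfter, int maxForwards) {
        Map<String, Node> nodes = new HashMap<>();
        nodes.put("node_1", new Node("node_1", 2)); // already removed its shard
        nodes.put("node_2", new Node("node_2", 1)); // lagging, shard in POST_RECOVERY

        Node current = nodes.get("node_1");
        int forwards = 0;
        int seenByNode2 = 0;
        while (forwards < maxForwards) {
            if (current.name.equals("node_2")) {
                if (seenByNode2 >= catchUpAfter) {
                    current.appliedClusterState = 2; // node_2 finally catches up
                }
                seenByNode2++;
            }
            String target = current.primaryNode();
            if (target.equals(current.name)) {
                return forwards; // this node believes it is the primary and executes
            }
            current = nodes.get(target); // otherwise forward to the believed primary
            forwards++;
        }
        return -1; // still bouncing: the endless loop from the issue
    }

    public static void main(String[] args) {
        System.out.println(simulate(0, 1000));                 // prints 1
        System.out.println(simulate(3, 1000));                 // prints 7
        System.out.println(simulate(Integer.MAX_VALUE, 1000)); // prints -1
    }
}
```

In this model every extra round that `node_2` spends on cluster state 1 costs two more forwards, and nothing bounds the loop except `node_2` catching up. The `maxForwards` cap exists only so the sketch terminates.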

I will open a pull request with a test shortly.

