Cluster state delay can cause endless index request loop #12573

@brwe

When a primary is relocating from node_1 to node_2, there can be a short window in which the old primary has already been removed from node_1 (closed, not deleted) but the new primary on node_2 is still in POST_RECOVERY. In this state, indexing requests can be sent back and forth between node_1 and node_2 endlessly.

Course of events:

  1. primary ([index][0]) relocates from node_1 to node_2

  2. node_2 is done recovering, moves its shard to IndexShardState.POST_RECOVERY and sends a message to master that the shard is ShardRoutingState.STARTED

    Cluster state 1: 
    node_1: [index][0] RELOCATING (ShardRoutingState), (STARTED from IndexShardState perspective on node_1) 
    node_2: [index][0] INITIALIZING (ShardRoutingState), (at this point already POST_RECOVERY from IndexShardState perspective on node_2) 
    
  3. master receives shard started and updates cluster state to:

    Cluster state 2: 
    node_1: [index][0] no shard 
    node_2: [index][0] STARTED (ShardRoutingState), (at this point still in POST_RECOVERY from IndexShardState perspective on node_2) 
    

    master sends this to node_1 and node_2

  4. node_1 receives the new cluster state and removes its shard because it is not allocated on node_1 anymore

  5. index a document

At this point node_1 is already on cluster state 2 and no longer has the shard, so it forwards the request to node_2. But node_2 is behind on cluster state processing: it is still on cluster state 1, so it has the shard in IndexShardState.POST_RECOVERY and thinks node_1 holds the primary. It therefore sends the request back to node_1. This goes on until either node_2 catches up and processes cluster state 2, or both nodes run out of memory (OOM).
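
The back-and-forth can be illustrated with a toy model. This is only a sketch and not Elasticsearch code: `Node`, `PRIMARY_BY_STATE`, `route`, `make_nodes`, and `index_request` are hypothetical names made up for the illustration, and a "hop" simply stands for one forwarding of the request. Each node routes according to the cluster state version it has applied so far, so the request bounces until node_2 applies cluster state 2.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical toy model: which node holds the primary according to
# each cluster state version from the course of events above.
PRIMARY_BY_STATE = {1: "node_1", 2: "node_2"}


@dataclass
class Node:
    name: str
    state_version: int  # last cluster state this node has applied


def route(node: Node) -> str:
    """Return the node that `node` believes holds the primary."""
    return PRIMARY_BY_STATE[node.state_version]


def make_nodes() -> Dict[str, Node]:
    """node_1 has applied cluster state 2, node_2 is still on cluster state 1."""
    return {
        "node_1": Node("node_1", state_version=2),
        "node_2": Node("node_2", state_version=1),
    }


def index_request(
    nodes: Dict[str, Node],
    start: str,
    node_2_catches_up_after: Optional[int],
    max_hops: int = 1000,
) -> Optional[int]:
    """Forward a request until some node accepts it as primary.

    Returns the number of hops taken, or None if the request is still
    bouncing after `max_hops` forwards (the model's stand-in for the
    endless loop). `node_2_catches_up_after=None` means node_2 never
    applies cluster state 2.
    """
    current = start
    hops = 0
    while hops < max_hops:
        if hops == node_2_catches_up_after:
            nodes["node_2"].state_version = 2  # node_2 processes cluster state 2
        target = route(nodes[current])
        if target == current:
            return hops  # this node believes it holds the primary and indexes
        current = target  # otherwise forward to where it believes the primary is
        hops += 1
    return None


if __name__ == "__main__":
    print(index_request(make_nodes(), "node_1", node_2_catches_up_after=5))
    print(index_request(make_nodes(), "node_1", node_2_catches_up_after=None))
```

In this model the only thing that ends the loop is node_2 applying the newer cluster state; the `max_hops` bound exists purely so the simulation terminates, whereas the real nodes have no such bound.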

I will open a pull request with a test shortly.

Labels: :Distributed Indexing/Recovery (Anything around constructing a new shard, either from a local or a remote source.), >bug
