
Primary shard recovery time slower in 6.3.2 than 5.6.7 #33198

@ahadadi

Description

Elasticsearch version: 6.3.2

Description of the problem including expected versus actual behavior:
After upgrading from 5.6.7 to 6.3.2, we noticed that relocating a primary shard takes longer.
It seems that in 6.3.2, when relocating a primary shard to a different node, translog operations are replayed. This happens even if the shard on the source node was successfully flushed, meaning the translog contains no operations that are not already in the files being copied to the target node.
In 5.6.7, the translog is emptied when a flush takes place, so translog operations are not replayed during relocation.

The expected behavior is that recovering a flushed shard to an empty target node entails only copying files, with no translog replay.

Steps to reproduce:

  1. Create an index with a single primary shard and no replicas.
  2. Index 1M documents.
  3. Flush the index.
  4. Relocate the index to a different node, e.g. by setting "index.routing.allocation.require._name" to a different node name (see the sketch after this list).
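
As a concrete sketch, the steps above map to REST calls along these lines (test_index and node_td1 are placeholder names; in practice step 2 would index 1M documents via the _bulk API):

# 1. Create an index with a single primary shard and no replicas
curl -XPUT 'localhost:9200/test_index' -H 'Content-Type: application/json' -d'
{
  "settings": { "number_of_shards": 1, "number_of_replicas": 0 }
}'

# 2. Index documents (repeat, or use _bulk, until ~1M docs)
curl -XPOST 'localhost:9200/test_index/_doc' -H 'Content-Type: application/json' -d'
{"field": "value"}'

# 3. Flush, so the translog holds no operations missing from the segment files
curl -XPOST 'localhost:9200/test_index/_flush'

# 4. Relocate the shard by requiring allocation on a different node
curl -XPUT 'localhost:9200/test_index/_settings' -H 'Content-Type: application/json' -d'
{ "index.routing.allocation.require._name": "node_td1" }'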

If you set the org.elasticsearch.indices.recovery logging level to TRACE, you will see that a file-based recovery takes place, with the files transferred and the translog then sent and replayed on the target node:
[2018-08-28T13:10:31,099][TRACE][o.e.i.r.RecoverySourceHandler] [node_td2] [index][0][recover to node_td1] sent batch of [10083][512kb] (total: [1000000]) translog operations
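
For reference, the TRACE logging mentioned above can be enabled with a transient cluster settings update (the standard 6.x logger setting, not specific to this issue):

curl -XPUT 'localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d'
{
  "transient": { "logger.org.elasticsearch.indices.recovery": "TRACE" }
}'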

Labels: :Distributed Indexing/Recovery