Description
Elasticsearch version (6.3.2)
Description of the problem including expected versus actual behavior:
After upgrading from 5.6.7 to 6.3.2, we noticed that relocating a primary shard takes longer.
It seems that in 6.3.2, when a primary shard is relocated to a different node, translog operations are replayed on the target. This happens even if the shard on the source node was successfully flushed, in which case the translog contains no operation that is not already in the files being copied to the target node.
In 5.6.7 the translog is emptied when a flush takes place, so no translog operations are replayed during relocation.
The expected behavior is that recovering a flushed shard to an empty target node should only copy files and should not entail any translog replay.
Steps to reproduce:
- Create an index with a single primary shard and no replicas.
- Index 1M documents.
- Flush the index.
- Relocate the shard to a different node, e.g. by setting "index.routing.allocation.require._name" to the name of a different node.
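The steps above can be sketched as a sequence of REST requests. This is only a minimal sketch: the index name "index" and target node "node_td1" are taken from the log line in this report, while the document type "doc", the document body, and the batch size are assumptions made for illustration. The functions only build the requests; sending them with an HTTP client of your choice is left out.

```python
import json
from typing import Iterator, Optional, Tuple

# (HTTP method, path, body or None)
Request = Tuple[str, str, Optional[str]]


def bulk_body(index: str, start: int, count: int, doc_type: str = "doc") -> str:
    """Build an NDJSON _bulk body for `count` small documents.

    The "doc" type and the {"value": i} document are placeholders, not
    taken from the original report.
    """
    lines = []
    for i in range(start, start + count):
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type, "_id": str(i)}}))
        lines.append(json.dumps({"value": i}))
    # The bulk API requires a trailing newline.
    return "\n".join(lines) + "\n"


def repro_requests(
    index: str = "index",
    target_node: str = "node_td1",
    doc_count: int = 1_000_000,
    batch_size: int = 10_000,  # assumed batch size
) -> Iterator[Request]:
    """Yield the requests for the four reproduction steps, in order."""
    # 1. Single primary shard, no replicas.
    yield ("PUT", f"/{index}", json.dumps(
        {"settings": {"number_of_shards": 1, "number_of_replicas": 0}}))
    # 2. Index the documents in batches.
    for start in range(0, doc_count, batch_size):
        count = min(batch_size, doc_count - start)
        yield ("POST", "/_bulk", bulk_body(index, start, count))
    # 3. Flush, so that everything is committed to segment files.
    yield ("POST", f"/{index}/_flush", None)
    # 4. Force the shard to move by requiring a different node name.
    yield ("PUT", f"/{index}/_settings", json.dumps(
        {"index.routing.allocation.require._name": target_node}))
```

Bulk requests need the "Content-Type: application/x-ndjson" header; the other requests with a body use "application/json".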
If you set the org.elasticsearch.indices.recovery logging level to TRACE, you will see that a file-based recovery takes place: the files are transferred, and then the translog is sent and replayed on the target node:
[2018-08-28T13:10:31,099][TRACE][o.e.i.r.RecoverySourceHandler] [node_td2] [index][0][recover to node_td1] sent batch of [10083][512kb] (total: [1000000]) translog operations
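For anyone trying to confirm this on their own cluster, here is a rough sketch of how the TRACE level could be enabled through the cluster settings API and how the "sent batch" lines could be picked out of the log. The helper names and the regular expression are illustrative assumptions; the pattern is derived only from the single log line quoted above and may not match other versions.

```python
import json
import re
from typing import Dict, Optional, Tuple, Union


def trace_logging_request(level: str = "TRACE") -> Tuple[str, str, str]:
    """Build the cluster settings request that changes the recovery logger level."""
    body = {"transient": {"logger.org.elasticsearch.indices.recovery": level}}
    return ("PUT", "/_cluster/settings", json.dumps(body))


# Pattern inferred from the one log line in this report (an assumption).
BATCH_RE = re.compile(
    r"sent batch of \[(\d+)\]\[([^\]]+)\] \(total: \[(\d+)\]\) translog operations"
)


def parse_translog_batch(line: str) -> Optional[Dict[str, Union[int, str]]]:
    """Extract batch operation count, batch size, and total from a log line.

    Returns None when the line is not a translog batch message.
    """
    match = BATCH_RE.search(line)
    if match is None:
        return None
    return {
        "batch_ops": int(match.group(1)),
        "batch_size": match.group(2),
        "total_ops": int(match.group(3)),
    }
```

A non-zero total on a freshly flushed shard is the symptom described in this report.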