@jasontedor
Member

When a primary is relocated from an old node to a new node, it can have ops in its translog that do not have a sequence number assigned. When a file-based recovery is started, this can lead to skipping these ops when replaying the translog due to a bug in the recovery logic. This commit addresses this bug and adds a test in the BWC tests.

Relates #22484
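
To make the failure mode above concrete, here is a minimal sketch of a translog replay filter. It is not the actual recovery code: the class name, the method, and the sentinel value `UNASSIGNED_SEQ_NO` are hypothetical. It also assumes that a file-based recovery is signalled by an unassigned starting sequence number. The point it illustrates is that a file-based recovery should replay every op, including ops from an old primary that carry no sequence number.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not the real recovery logic.
final class TranslogReplaySketch {
    // Hypothetical sentinel meaning "no sequence number assigned".
    static final long UNASSIGNED_SEQ_NO = -2L;

    // Returns the sequence numbers of the ops that would be replayed.
    // Assumption: startingSeqNo >= 0 means a sequence-number-based recovery,
    // anything else means a file-based recovery.
    static List<Long> opsToReplay(List<Long> translogSeqNos, long startingSeqNo) {
        final boolean seqNoBased = startingSeqNo >= 0;
        final List<Long> replayed = new ArrayList<>();
        for (long seqNo : translogSeqNos) {
            // Only a sequence-number-based recovery may skip ops: those without
            // a sequence number and those below the starting point.
            if (seqNoBased && (seqNo == UNASSIGNED_SEQ_NO || seqNo < startingSeqNo)) {
                continue;
            }
            // A file-based recovery keeps everything, unsequenced ops included.
            replayed.add(seqNo);
        }
        return replayed;
    }
}
```

Calling `opsToReplay` with a mix of unsequenced and sequenced ops and an unassigned starting sequence number returns every op, which is the behaviour the BWC test is meant to protect.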

Contributor

@bleskes bleskes left a comment

This LGTM. It would be great to have tests added in RecoverySourceHandlerTests.

assertOK(response);
final InputStream content = response.getEntity().getContent();
final int actualCount =
    Integer.parseInt(XContentHelper.convertToMap(JsonXContent.jsonXContent, content, false).get("count").toString());
Contributor

Any chance to use something like ObjectPath.evaluate(shard, "seq_no.local_checkpoint")?
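
For readers unfamiliar with the idea behind this suggestion, the following is a minimal sketch of a dotted-path lookup over a parsed response map. It is a plain-Java stand-in, not the `ObjectPath` utility itself: the class `DottedPathSketch` and its `evaluate` method are hypothetical and only show why a path such as `seq_no.local_checkpoint` reads more clearly than chained `get(...)` calls.

```java
import java.util.Map;

// Illustrative stand-in for a dotted-path helper; not the ObjectPath API.
final class DottedPathSketch {
    // Walks nested maps one key at a time; returns null if any step is
    // missing or is not a map.
    @SuppressWarnings("unchecked")
    static Object evaluate(Map<String, Object> root, String path) {
        Object current = root;
        for (String key : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<String, Object>) current).get(key);
        }
        return current;
    }
}
```

With a map shaped like `{"seq_no": {"local_checkpoint": 41}}`, `evaluate(shard, "seq_no.local_checkpoint")` yields the nested value without intermediate casts at the call site.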

Member Author

I pushed 4430caa.


logger.info("allowing shards on all nodes");
updateIndexSetting(index, Settings.builder().putNull("index.routing.allocation.include._name"));
ensureGreen();
Contributor

Shall we assert the counts here too?

Member Author

I pushed 0112f90.

@jasontedor
Member Author

jasontedor commented Feb 2, 2017

@bleskes I pushed a unit test in bb8884a.

@jasontedor
Member Author

Thanks @bleskes. I've pushed commits in response to your comments.

Contributor

@bleskes bleskes left a comment

awesome

@bleskes
Contributor

bleskes commented Feb 3, 2017

test this please

@jasontedor jasontedor merged commit 6e99402 into elastic:master Feb 3, 2017
@jasontedor jasontedor deleted the file-based-lost-ops branch February 3, 2017 13:12
@jasontedor
Member Author

Thanks @bleskes.

@clintongormley clintongormley added the :Distributed Indexing/Engine label and removed the :Sequence IDs label on Feb 14, 2018

Labels

>bug, :Distributed Indexing/Engine (Anything around managing Lucene and the Translog in an open shard.), v6.0.0-alpha1
