Fill missing sequence IDs up to max sequence ID when recovering from store #24238

s1monw · 2017-04-21T09:29:11Z

Today we might promote a primary and recover from store in a state where,
after translog recovery, the local checkpoint is still behind the maximum
sequence ID seen. To fill the holes in the sequence ID history, this PR adds
a utility method that fills all missing sequence IDs up to the maximum seen
sequence ID with no-ops.
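
For illustration only, here is a minimal, self-contained sketch of the gap-filling idea. `ToySeqNoTracker`, `markProcessed` and `fillGaps` are hypothetical names made up for this example; they are not the Elasticsearch `SequenceNumbersService` or `InternalEngine` API, and the loop increment is an assumption based on the "leap-frogging" comment in the diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy model of a sequence-number tracker (hypothetical, not the Elasticsearch class). */
final class ToySeqNoTracker {
    // Sequence numbers that were processed but sit above a hole.
    private final TreeSet<Long> pending = new TreeSet<>();
    // Highest sequence number such that it and everything below it was processed.
    private long localCheckpoint = -1;
    private long maxSeqNo = -1;

    long localCheckpoint() { return localCheckpoint; }
    long maxSeqNo() { return maxSeqNo; }

    void markProcessed(long seqNo) {
        maxSeqNo = Math.max(maxSeqNo, seqNo);
        if (seqNo <= localCheckpoint) {
            return; // already covered by the checkpoint
        }
        pending.add(seqNo);
        // Advance the checkpoint across any contiguous run of processed numbers.
        while (pending.remove(localCheckpoint + 1)) {
            localCheckpoint++;
        }
    }

    /** Fills every hole in (localCheckpoint, maxSeqNo] and returns the sequence numbers of the no-ops. */
    List<Long> fillGaps() {
        final List<Long> noOps = new ArrayList<>();
        final long max = maxSeqNo;
        // Re-read the checkpoint each iteration: it may jump past numbers that were
        // already processed, so the loop leap-frogs instead of visiting every value.
        for (long seqNo = localCheckpoint + 1; seqNo <= max; seqNo = localCheckpoint + 1) {
            markProcessed(seqNo); // stands in for indexing a no-op with this sequence number
            noOps.add(seqNo);
        }
        return noOps;
    }
}
```

With sequence numbers 0, 1, 3 and 6 processed, the local checkpoint is 1 and the maximum is 6; `fillGaps()` adds no-ops for 2, 4 and 5, after which the checkpoint equals the maximum.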

Relates to #10708
I am still working on a test for store recovery to ensure it's called, but I think it's ready for review.


bleskes

Looks great. I have one minor request for the test.

bleskes · 2017-04-21T11:59:38Z

core/src/main/java/org/elasticsearch/index/engine/InternalEngine.java

+            final long maxSeqId = seqNoService.getMaxSeqNo();
+            int numNoOpsAdded = 0;
+            for (long i = localCheckpoint + 1; i <= maxSeqId;
+                 // the local checkpoint might have been advanced so we are leap-frogging


the local checkpoint must have advanced by at least one. We can assert on that after the noop was indexed.
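
A hedged sketch of what such a check could look like, written against plain functional interfaces rather than the real engine. `CheckpointAdvanceCheck`, `fillWithCheck` and its parameters are hypothetical names for this example; the actual change may use a Java `assert` inside `InternalEngine` instead of an exception.

```java
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;

/** Hypothetical helper: fills gaps and verifies the checkpoint moves after every no-op. */
final class CheckpointAdvanceCheck {
    static int fillWithCheck(LongSupplier localCheckpoint, long maxSeqNo, LongConsumer indexNoOp) {
        int numNoOpsAdded = 0;
        for (long seqNo = localCheckpoint.getAsLong() + 1;
             seqNo <= maxSeqNo;
             seqNo = localCheckpoint.getAsLong() + 1) {
            indexNoOp.accept(seqNo); // stands in for indexing a no-op at seqNo
            numNoOpsAdded++;
            // The no-op filled the first hole, so the checkpoint must now be at least seqNo.
            if (localCheckpoint.getAsLong() < seqNo) {
                throw new IllegalStateException(
                    "local checkpoint did not advance after no-op at seq no " + seqNo);
            }
        }
        return numNoOpsAdded;
    }
}
```

In this sketch the check also guards the loop itself: if the checkpoint never moved, the loop variable would never change and the loop would not terminate.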

bleskes · 2017-04-21T12:20:42Z

core/src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java

+                Engine.Index primaryResponse = indexForDoc(doc);
+                Engine.IndexResult indexResult = engine.index(primaryResponse);
+                if (randomBoolean()) {
+                    doc.updateSeqID(indexResult.getSeqNo(), 1);


why is this needed? doesn't the engine take care of that?

bleskes · 2017-04-21T12:23:53Z

core/src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java

+            assertEquals((maxSeqIDOnReplica+1) - numDocsOnReplica, recoveringEngine.fillSequenceNumberHistory(2));
+            assertEquals(maxSeqIDOnReplica, recoveringEngine.seqNoService().getMaxSeqNo());
+            assertEquals(maxSeqIDOnReplica, recoveringEngine.seqNoService().getLocalCheckpoint());
+            if ((flushed = randomBoolean())) {


can we snapshot the translog and assert that the noops have the right primary term?

ah I had that but removed it... good catch...

bleskes

Still LGTM. Left a suggestion for the new test.

bleskes · 2017-04-21T15:33:54Z

core/src/test/java/org/elasticsearch/index/shard/IndexShardTests.java

+        // start a replica shard and index the second doc
+        final IndexShard otherShard = newStartedShard(false);
+        test = otherShard.prepareIndexOnReplica(
+            SourceToParse.source(SourceToParse.Origin.PRIMARY, shard.shardId().getIndexName(), test.type(), test.id(), test.source(),


Origin should be REPLICA

bleskes · 2017-04-21T15:37:30Z

core/src/test/java/org/elasticsearch/index/shard/IndexShardTests.java


+    /* This test just verifies that we fill up local checkpoint up to max seen seqID on primary recovery */
+    public void testRecoverFromStoreWithNoOps() throws IOException {
+        final IndexShard shard = newStartedShard(true);


I think we can introduce a variant of indexDoc called indexDocOnReplica which takes a seq# as a parameter. This will remove the need for the extra shard. wdyt?

I can do that in a separate PR


s1monw added :Engine >enhancement v6.0.0-alpha1 labels Apr 21, 2017

s1monw requested review from bleskes and jasontedor April 21, 2017 09:29

fix compile errors

4f8b4c5

bleskes suggested changes Apr 21, 2017


apply feedback

09fff06

bleskes approved these changes Apr 21, 2017


add a test to check if we fill up on store recovery

28edc5c

bleskes approved these changes Apr 21, 2017


apply feedback

1beb757

s1monw merged commit 2ca7072 into elastic:master Apr 21, 2017

s1monw deleted the fill_gaps_on_tlog_recovery branch April 21, 2017 18:28

s1monw removed the review label Apr 21, 2017

jasontedor mentioned this pull request May 7, 2017

Remove gap skipping when opening engine #24535

Merged

bleskes mentioned this pull request May 9, 2017

Add Sequence Numbers to write operations #10708

Closed

64 tasks
