Replicate max seq_no of updates to replicas #33967

dnhatn · 2018-09-22T12:22:02Z

We started tracking max_seq_no_of_updates on the primary in #33842. This commit replicates that value from a primary to its replicas, either in replication requests or during the translog phase of peer recovery.

With this change, we guarantee that the value of max_seq_no_of_updates on a replica when any index/delete operation is performed is at least the max_seq_no_of_updates on the primary when that operation was executed.
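The guarantee can be illustrated with a minimal sketch. This is a toy model, not the code in this PR: the names `MsuReplicationSketch`, `ReplicaState`, and `applyFromPrimary` are hypothetical, and the sketch assumes only that each replicated operation carries the primary's max_seq_no_of_updates and that the replica advances its own value monotonically before applying the operation.

```java
/** Toy model, not Elasticsearch code: a replica that receives the primary's MSU with each operation. */
class MsuReplicationSketch {

    /** Replica-side state; the tracked value only ever moves forward. */
    static final class ReplicaState {
        private long maxSeqNoOfUpdates = -1L; // assumed starting point: no updates seen yet

        /** Advance to the value carried by the replication request, never backwards. */
        void advance(long msuFromPrimary) {
            maxSeqNoOfUpdates = Math.max(maxSeqNoOfUpdates, msuFromPrimary);
        }

        long maxSeqNoOfUpdates() {
            return maxSeqNoOfUpdates;
        }
    }

    /** Applies one replicated operation and returns the replica MSU in effect while it is processed. */
    static long applyFromPrimary(ReplicaState replica, long msuOnPrimary) {
        replica.advance(msuOnPrimary); // advance before the operation itself is applied
        return replica.maxSeqNoOfUpdates();
    }

    public static void main(String[] args) {
        ReplicaState replica = new ReplicaState();
        // MSU on the primary when each operation was executed; delivery order may differ.
        long[] msuPerOperation = { -1L, 2L, 1L };
        for (long msuOnPrimary : msuPerOperation) {
            long observed = applyFromPrimary(replica, msuOnPrimary);
            System.out.println("primary msu=" + msuOnPrimary + " replica msu=" + observed);
        }
    }
}
```

Because the replica takes the maximum of its own value and the replicated one, a request that arrives late with a smaller value cannot move the replica's value backwards.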

Relates #33656

elasticmachine · 2018-09-22T12:22:03Z

Pinging @elastic/es-distributed

bleskes

Looks great. I left some comments

bleskes · 2018-09-22T17:58:43Z

server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java

+    private boolean assertMaxSeqNoOfUpdatesIsPropagated(Index index, IndexingStrategy plan) {
+        final long maxSeqNoOfUpdates = getMaxSeqNoOfUpdatesOrDeletes();
+        final Version indexVersion = config().getIndexSettings().getIndexVersionCreated();
+        assert plan.useLuceneUpdateDocument == false


I'm not sure what this buys us. It just validates the logic in the plan as non-primary and has nothing to do with the replication code. What am I missing?

Yeah, but consider our classic example: index-1, delete-2, and index-3 on the primary, while the order on a replica is index-1 (msu=-1), index-3 (msu=2), and delete-2 (msu=2). index-3 is an update on the replica, but its msu is only 2.

Sorry, I don't follow. I'll reach out Monday to discuss.
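The index-1, delete-2, index-3 scenario from this thread can be replayed with a small toy model. Everything here is hypothetical (`OutOfOrderReplicaSketch`, `ToyShard`, and its methods are not Elasticsearch classes), and it assumes a single document id, that an index operation counts as an update only when a live copy exists, and that a delete always advances the MSU.

```java
/** Toy model of the index-1, delete-2, index-3 scenario; not Elasticsearch code. */
class OutOfOrderReplicaSketch {

    /** Tracks one document id plus the max seq_no of updates/deletes (MSU). */
    static final class ToyShard {
        Long liveSeqNo = null; // seq_no of the live copy, or null if absent or deleted
        long lastSeqNo = -1L;  // highest seq_no applied to this document
        long msu = -1L;

        /** Primary-side index: an update only if a live copy exists. Returns the MSU to replicate. */
        long indexOnPrimary(long seqNo) {
            if (liveSeqNo != null) {
                msu = Math.max(msu, seqNo);
            }
            liveSeqNo = seqNo;
            lastSeqNo = seqNo;
            return msu;
        }

        /** Primary-side delete: always advances the MSU. Returns the MSU to replicate. */
        long deleteOnPrimary(long seqNo) {
            msu = Math.max(msu, seqNo);
            liveSeqNo = null;
            lastSeqNo = seqNo;
            return msu;
        }

        /**
         * Replica-side index. Returns true if the operation overwrites a live copy
         * while its seq_no is above the replicated MSU.
         */
        boolean indexOnReplica(long seqNo, long msuFromPrimary) {
            msu = Math.max(msu, msuFromPrimary);
            if (seqNo < lastSeqNo) {
                return false; // stale operation, nothing is overwritten
            }
            boolean overwritesLiveCopy = liveSeqNo != null;
            liveSeqNo = seqNo;
            lastSeqNo = seqNo;
            return overwritesLiveCopy && seqNo > msu;
        }
    }

    public static void main(String[] args) {
        ToyShard primary = new ToyShard();
        long msu1 = primary.indexOnPrimary(1);  // append, MSU stays -1
        long msu2 = primary.deleteOnPrimary(2); // MSU becomes 2
        long msu3 = primary.indexOnPrimary(3);  // append after delete, MSU stays 2
        System.out.println("primary msu per op: " + msu1 + ", " + msu2 + ", " + msu3);

        ToyShard replica = new ToyShard();
        // The replica receives index-1 and index-3 before delete-2.
        System.out.println("index-1 overwrites above MSU: " + replica.indexOnReplica(1, msu1));
        System.out.println("index-3 overwrites above MSU: " + replica.indexOnReplica(3, msu3));
    }
}
```

In this model index-3 overwrites the copy written by index-1 on the replica even though its seq_no (3) is above the replicated MSU (2), which is the case the reply above describes.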

bleskes · 2018-09-22T18:00:28Z

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

+                                    // If the old primary was on an old version, this promoting primary (was replica before)
+                                    // does not have max_seq_no_of_updates. We need to bootstrap it manually from its local history.
+                                    assert indexSettings.getIndexVersionCreated().before(Version.V_7_0_0_alpha1);
                                    engine.initializeMaxSeqNoOfUpdatesOrDeletes();


I think we need max_seq_no here? Maybe the translog wasn't fsynced (if people disable the per-request fsync)?

Yes, initializeMaxSeqNoOfUpdatesOrDeletes uses the max_seq_no from the local tracker and translog.
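A rough sketch of the bootstrap idea discussed here, under stated assumptions: `PromotionBootstrapSketch` and `bootstrapOnPromotion` are hypothetical names, and the sentinel `-2` is assumed to mirror `SequenceNumbers.UNASSIGNED_SEQ_NO`. If the promoted shard never received a value from the old primary, it falls back to its local max_seq_no, which conservatively treats every operation it has seen as a possible update or delete.

```java
/** Toy sketch of bootstrapping the MSU on promotion; not the IndexShard implementation. */
class PromotionBootstrapSketch {
    // Assumed sentinel for "no value received yet".
    static final long UNASSIGNED_SEQ_NO = -2L;

    /**
     * Returns the MSU the new primary should start with.
     * An unassigned value is replaced by the local max_seq_no; an assigned value is kept.
     */
    static long bootstrapOnPromotion(long currentMsu, long localMaxSeqNo) {
        if (currentMsu == UNASSIGNED_SEQ_NO) {
            return localMaxSeqNo; // conservative upper bound from local history
        }
        return currentMsu;
    }

    public static void main(String[] args) {
        System.out.println(bootstrapOnPromotion(UNASSIGNED_SEQ_NO, 10L));
        System.out.println(bootstrapOnPromotion(5L, 10L));
    }
}
```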

bleskes · 2018-09-22T18:02:29Z

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

-                // assert indexSettings.getIndexVersionCreated().before(Version.V_7_0_0_alpha1) : indexSettings.getIndexVersionCreated();
+                // If the old primary was on an old version, this promoting primary (was replica before)
+                // does not have max_seq_no_of_updates. We need to bootstrap it manually from its local history.
+                assert indexSettings.getIndexVersionCreated().before(Version.V_7_0_0_alpha1);


same comment

bleskes · 2018-09-22T18:04:27Z

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

                                             opPrimaryTerm, currentGlobalCheckpoint, maxSeqNo);
                                if (currentGlobalCheckpoint < maxSeqNo) {
-                                    resetEngineToGlobalCheckpoint();
+                                    resetEngineToGlobalCheckpoint(maxSeqNoOfUpdatesOrDeletes);


Why do we need to set the maxSeqNoOfUpdatesOrDeletes here? Can't we let the engine handle it itself? I think it's also risky because we recover from the translog (and need to make sure the optimization doesn't create duplicates in that process).

Yes, we can let the engine take care of it itself. I used the max_seq_no_of_updates from the primary so that, at the end of a test, all replicas and their primary have the same max_seq_no_of_updates. I will update this.

bleskes · 2018-09-22T18:09:54Z

test/framework/src/main/java/org/elasticsearch/index/engine/InternalTestEngine.java

+            });
+        }
+        final IndexResult result = super.index(index);
+        if (index.seqNo() == SequenceNumbers.UNASSIGNED_SEQ_NO && result.getFailure() == null && result.isCreated() == false) {


I think we can inline this into internal engine as a standard assertion no?

bleskes · 2018-09-22T18:10:24Z

test/framework/src/main/java/org/elasticsearch/index/engine/InternalTestEngine.java

+            advanceMaxSeqNoOfUpdatesOrDeletes(maxSeqNo);
+        }
+        final DeleteResult result = super.delete(delete);
+        if (delete.seqNo() == SequenceNumbers.UNASSIGNED_SEQ_NO && result.getFailure() == null) {


Same comment: can this be in the engine itself?

bleskes · 2018-09-22T18:11:58Z

test/framework/src/main/java/org/elasticsearch/index/engine/InternalTestEngine.java

+ * An alternative of {@link InternalEngine} that allows tweaking internals to reduce noise in engine tests.
+ */
+class InternalTestEngine extends InternalEngine {
+    private volatile boolean autoAdjustMaxSeqNoOfUpdatesOrDeletes = true;


It doesn't look like anyone changes this. Is this for a follow-up?

dnhatn · 2018-09-22T22:18:09Z

@bleskes Could you please have another look? Thank you!

bleskes

Looks good. I left some nits. I'm waiting with LGTM for our discussion

bleskes · 2018-09-23T11:01:45Z

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

+                                    // If the old primary was on an old version, this promoting primary (was replica before)
+                                    // does not have max_seq_no_of_updates. We need to bootstrap it manually from its local history.
+                                    assert indexSettings.getIndexVersionCreated().before(Version.V_7_0_0_alpha1);
+                                    engine.advanceMaxSeqNoOfUpdatesOrDeletes(seqNoStats().getMaxSeqNo());


Reads better, thanks. I think this comment will be more intuitive if you change "old primary was on an old version" to "old primary was on an old version that didn't yet replicate the MSU, we need to bootstrap it..."

bleskes · 2018-09-23T11:04:20Z

server/src/main/java/org/elasticsearch/index/shard/IndexShard.java

            newEngine = createNewEngine(newEngineConfig());
            active.set(true);
        }
+        newEngine.advanceMaxSeqNoOfUpdatesOrDeletes(seqNoStats.getMaxSeqNo());


I think it will be easier to understand if we use seqNoStats.getGlobalCheckpoint() which is the upper bound for the local recovery. The rest we don't care about here, right?

bleskes · 2018-09-23T11:07:08Z

server/src/main/java/org/elasticsearch/index/engine/InternalEngine.java

                } else if (plan.indexIntoLucene || plan.addStaleOpToLucene) {
                    indexResult = indexIntoLucene(index, plan);
+                    assert (index.seqNo() != SequenceNumbers.UNASSIGNED_SEQ_NO || indexResult.isCreated() || indexResult.getFailure() != null)
+                        || indexResult.getSeqNo() <= getMaxSeqNoOfUpdatesOrDeletes() : indexResult.getSeqNo() + " > " + getMaxSeqNoOfUpdatesOrDeletes();


we need to allow for the msu to be unset here, no?
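One way such an assertion could tolerate an unset value is sketched below. This is only an illustration of the reviewer's point, not the assertion that was eventually merged: `MsuAssertionSketch` and `updateIsCoveredByMsu` are hypothetical, and the sentinel `-2` is assumed to mirror `SequenceNumbers.UNASSIGNED_SEQ_NO`.

```java
/** Toy predicate showing an MSU check that accepts an unset value; not the merged assertion. */
class MsuAssertionSketch {
    // Assumed sentinel for "MSU not set yet".
    static final long UNASSIGNED_SEQ_NO = -2L;

    /** True if the MSU is unset, or if the operation's seq_no does not exceed it. */
    static boolean updateIsCoveredByMsu(long opSeqNo, long msu) {
        return msu == UNASSIGNED_SEQ_NO || opSeqNo <= msu;
    }

    public static void main(String[] args) {
        System.out.println(updateIsCoveredByMsu(7L, UNASSIGNED_SEQ_NO)); // unset MSU is tolerated
        System.out.println(updateIsCoveredByMsu(7L, 9L));                // covered by the MSU
        System.out.println(updateIsCoveredByMsu(7L, 5L));                // above the MSU
    }
}
```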

dnhatn · 2018-09-25T01:08:10Z

@bleskes I've updated the assertion as we discussed. Would you please give it another look? Thank you.

bleskes

Lgtm

s1monw

LGTM 2 thanks @dnhatn

dnhatn · 2018-09-25T12:07:23Z

Thanks @bleskes and @s1monw.

dnhatn added the >enhancement, :Distributed Indexing/Recovery, v7.0.0, and v6.5.0 labels Sep 22, 2018

dnhatn requested review from bleskes, s1monw and ywelsch September 22, 2018 12:22

bleskes suggested changes Sep 22, 2018

dnhatn added 2 commits September 22, 2018 18:15

boaz’s feedback

84257ae

Merge branch 'master' into replica-msu

18c7eac

dnhatn requested a review from bleskes September 22, 2018 22:18

dnhatn added the review label Sep 22, 2018

bleskes reviewed Sep 23, 2018

dnhatn added 6 commits September 23, 2018 22:26

wording

16ae5f1

Merge branch 'master' into replica-msu

18813cc

fix test

e380684

Merge branch 'master' into replica-msu

9c3f902

assert when update or delete doc

c6fa2dc

relax engine assertion during promotion

f6a6242

dnhatn requested a review from bleskes September 25, 2018 01:08

bleskes approved these changes Sep 25, 2018

s1monw approved these changes Sep 25, 2018

dnhatn merged commit 5166dd0 into elastic:master Sep 25, 2018

dnhatn deleted the replica-msu branch September 25, 2018 12:08

dnhatn added the backport pending label Sep 25, 2018

dnhatn removed the backport pending label Sep 27, 2018

dnhatn added a commit that referenced this pull request Sep 27, 2018

Adjust bwc version for max_seq_no_of_updates

12d94e4

Relates #33967 Relates #33842

dnhatn mentioned this pull request Sep 29, 2018

Uses auto generated timestamp with soft-deletes #33656

Closed

jasontedor added v7.0.0 v6.5.0 and removed v6.5.0 v7.0.0 labels Sep 29, 2018

tomcallahan added the >non-issue label Sep 29, 2018

kcm pushed a commit that referenced this pull request Oct 30, 2018

Adjust bwc version for max_seq_no_of_updates

9ab84a0

Relates #33967 Relates #33842

kcm pushed a commit that referenced this pull request Oct 30, 2018

TEST: Add engine is closed as expected failure msg

69e2aef

This commit adds "engine is closed" as an expected failure message. This change is due to #33967 in which we might access a closed engine on promotion. Relates #33967

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

7 participants