Use sequence numbers to identify out of order delivery in replicas & recovery #23543

bleskes · 2017-03-11T17:44:40Z

Internal indexing requests in Elasticsearch may be processed out of order and more than once. This matters during recovery and because requests are replicated concurrently from the primary to its replicas. As such, a replica or recovering shard needs to be able to identify that an incoming request contains old information and thus need not be processed. The current logic is based on the external version, which is sadly not sufficient. This PR moves the logic to rely on sequence numbers and primary terms, which give the semantics we need.
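As a rough illustration of the kind of check described above, here is a minimal sketch of a replica that drops stale or duplicate operations. It assumes that an operation with a higher sequence number supersedes one with a lower sequence number, and that ties are broken by primary term; that ordering, and every name in the block (`SeqNoStalenessSketch`, `OpId`, `isStale`, `applyOnReplica`), is a hypothetical stand-in and not taken from the PR's code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; names and ordering rule are illustrative, not the PR's code. */
final class SeqNoStalenessSketch {

    /** Sequence number and primary term of an operation. */
    static final class OpId {
        final long seqNo;
        final long primaryTerm;

        OpId(long seqNo, long primaryTerm) {
            this.seqNo = seqNo;
            this.primaryTerm = primaryTerm;
        }
    }

    /** Last operation applied per document id (stand-in for what the shard has indexed). */
    private final Map<String, OpId> lastApplied = new HashMap<>();

    /**
     * Assumed ordering: a higher seqNo wins; on equal seqNo a higher primary term wins.
     * An operation that does not win against the existing one is treated as stale.
     */
    static boolean isStale(OpId incoming, OpId existing) {
        if (existing == null) {
            return false; // nothing indexed yet for this document
        }
        if (incoming.seqNo != existing.seqNo) {
            return incoming.seqNo < existing.seqNo;
        }
        return incoming.primaryTerm <= existing.primaryTerm;
    }

    /** Applies the operation unless it is stale; returns true if it was applied. */
    boolean applyOnReplica(String docId, OpId incoming) {
        OpId existing = lastApplied.get(docId);
        if (isStale(incoming, existing)) {
            return false; // old or duplicate information, safe to skip
        }
        lastApplied.put(docId, incoming);
        return true;
    }
}
```

With this sketch, delivering sequence number 5 and then sequence number 3 for the same document leaves the document at sequence number 5, and redelivering sequence number 5 is a no-op.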

The change also refactors InternalEngine.index and InternalEngine.delete. The current implementation tries to share as much code as possible between the different execution paths (primary vs. replica, versioned vs. unversioned, etc.), but the end result is hard to read and complex to reason about. The PR proposes a slightly more verbose version in which the code flows are clearer (IMO) and rely on immutable variables, which makes it easier to reason about guarantees.

The PR also beefs up all the versioning tests in InternalEngineTests, as the current tests are not sufficient (and didn't expose some subtle but minor existing bugs).

Relates to #10708

s1monw

I did a first pass and added a higher-level recommendation before I go and review it in more depth.

s1monw · 2017-03-20T14:59:44Z

core/src/main/java/org/elasticsearch/common/lucene/uid/PerThreadIDAndVersionSeqNoLookup.java

        Fields fields = reader.fields();
-        if (fields != null) {


empty readers are not possible anymore?

s1monw · 2017-03-20T15:22:33Z

core/src/main/java/org/elasticsearch/common/lucene/uid/VersionsAndSeqNoResolver.java

+/** Utility class to resolve the Lucene doc ID, version, seqNo and primaryTerms for a given uid. */
+public class VersionsAndSeqNoResolver {
+
+    static final ConcurrentMap<Object, CloseableThreadLocal<PerThreadIDAndVersionSeqNoLookup>> lookupStates =


Now that this class has multiple purposes, I think we should rethink the way it works. Today we have to resolve the document multiple times: first to look up the version, then to look up the sequence ID. I think it's dangerous to do it this way because:

we might use a different searcher / reader (it should work, but I don't like it from an API perspective)

we duplicate code where it's not necessary

we look up the same ID in 2*N segments, which is costly

That said, I wonder if we can refactor things to acquire a searcher globally and obtain a per-thread lookup state that is closeable. We could use try-with-resources in delete and index, and use that state to find the doc ID and its subreader (this might simply be PerThreadIDAndVersionSeqNoLookup). That way we hold on to the reader for the time being but have fast access and clear access patterns for the different values (version, seq ID). It will likely change the engine a bit, but I think the semantics will be clear then.
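The suggestion above could look roughly like the following sketch: a closeable lookup state is acquired once, the document is resolved once against it, and version, sequence number and primary term are all read from that single result. This is only an illustration under stated assumptions. A plain map stands in for the Lucene reader, and `DocLookupSketch`, `LookupState`, `DocState`, `resolve` and `describe` are hypothetical names, not Elasticsearch or Lucene APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a closeable, resolve-once lookup; not the PR's API. */
final class DocLookupSketch {

    /** Everything known about one document, read from a single point-in-time view. */
    static final class DocState {
        final long version;
        final long seqNo;
        final long primaryTerm;

        DocState(long version, long seqNo, long primaryTerm) {
            this.version = version;
            this.seqNo = seqNo;
            this.primaryTerm = primaryTerm;
        }
    }

    /** Stand-in for an acquired searcher: a snapshot that is released on close. */
    static final class LookupState implements AutoCloseable {
        private final Map<String, DocState> snapshot;
        private boolean closed = false;

        LookupState(Map<String, DocState> liveDocs) {
            this.snapshot = new HashMap<>(liveDocs); // point-in-time copy
        }

        /** Resolves the id once; returns null if the document is not found. */
        DocState resolve(String id) {
            if (closed) {
                throw new IllegalStateException("lookup state already closed");
            }
            return snapshot.get(id);
        }

        @Override
        public void close() {
            closed = true; // a real implementation would release the searcher here
        }
    }

    /** Reads all values for a document from one resolution against one snapshot. */
    static String describe(Map<String, DocState> liveDocs, String id) {
        try (LookupState state = new LookupState(liveDocs)) {
            DocState doc = state.resolve(id); // single resolution, no second lookup
            if (doc == null) {
                return id + ": not found";
            }
            return id + ": version=" + doc.version + " seqNo=" + doc.seqNo + " term=" + doc.primaryTerm;
        }
    }
}
```

The point of the design is that the version and the sequence number can never come from two different readers, and the per-segment search for the ID happens once instead of twice.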

bleskes · 2017-04-12T06:49:11Z

superseded by #24060

bleskes added 15 commits March 7, 2017 15:20

initial version

f2211c1

fix testAppendWhileRecovering

baa7e51

lingering reference to SortedNumericDocValuesField

14b4b4b

currectly load sequence numbers if the index doesn't have the right doc values

2b835f7

Merge remote-tracking branch 'upstream/master' into seq_no_as_version

4a2f350

share more version look up code

6b9e3ca

remove no commit

cde619c

minor tweaks

d81bc8c

internal versioning test for primary. Fix found flag on version conflict in deletes

a27352d

fix delete error handling

7b58f1d

add primary external versioning test

b626e95

added testVersioningPromotedReplica

eca7872

add concurrency tests

0e4a2f6

clean duplicate tests

ca138f1

added testConcurrentGetAndSetOnPrimary

c0343ea

bleskes added :Engine >enhancement v6.0.0-alpha1 labels Mar 11, 2017

bleskes requested review from jasontedor and s1monw March 11, 2017 17:44

s1monw suggested changes Mar 20, 2017

bleskes mentioned this pull request Mar 23, 2017

Refactor InternalEngine's index/delete flow for better clarity #23711

Merged

bleskes closed this Apr 12, 2017

bleskes removed the v6.0.0-alpha1 label Apr 12, 2017

bleskes deleted the seq_no_as_version branch April 12, 2017 06:49

clintongormley added :Distributed Indexing/Distributed :Distributed Indexing/Engine and removed :Engine labels Feb 13, 2018

clintongormley added :Distributed Indexing/Engine and removed :Distributed Indexing/Distributed :Sequence IDs labels Feb 13, 2018
