-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Use sequence numbers to identify out of order delivery in replicas & recovery #23543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
s1monw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a first pass and added a higher level recommendation before I go and review it deeper
| Fields fields = reader.fields(); | ||
| if (fields != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty readers are not possible anymore?
| /** Utility class to resolve the Lucene doc ID, version, seqNo and primaryTerms for a given uid. */ | ||
| public class VersionsAndSeqNoResolver { | ||
|
|
||
| static final ConcurrentMap<Object, CloseableThreadLocal<PerThreadIDAndVersionSeqNoLookup>> lookupStates = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now that this class has multiple purposes I think we should rethink the way it works. Today we have to resolve the document multiple times, first to lookup the version then to lookup the sequence ID. I think it's dangerous to do it this way since we are:
- we might use a different searcher / reader (it should work but I don't like it from an API perspective
- we duplicate code where it's not necessary
- we lookup the same ID in 2*N Segments which is costly
now that said I wonder if we can refactor stuff to acquire a searcher globally to gain a per-thread lookup state that is closeable (we can use try / with in delete and index and we use that state to find the doc ID and it's subreader (this might be simply PerThreadIDAndVersionSeqNoLookup) That way we hold on to the reader for the time being but have fast access and clear access patterns on the different values (version, seq ID). It will likely change the engine a bit but I think the semantics will be clear then.
|
superseded by #24060 |
Internal indexing requests in Elasticsearch may be processed out of order and repeatedly. This is important during recovery and due to concurrency in replicating requests between primary and replicas. As such, a replica/recovering shard needs to be able to identify that an incoming request contains information that is old and thus need not be processed. The current logic is based on external version. This is sadly not sufficient. This PR moves the logic to rely on sequences numbers and primary terms which give the semantics we need.
The change also refactors
InternalEngine.indexandInternalEngine.delete. The current implementations tries to share as much code as possible between the different execution paths (primary vs replica, versioned vs unversioned etc.) but the end result is hard to read and is complex to reason about. The PR proposes a slightly more verbose version but where the code flows are clearer (IMO) and rely on immutable variables which makes easier to reason about guarantees.The PR also beefed up all the versioning tests in
InternalEngineTestsas the current tests are not sufficient (and didn't expose some subtle but minor existing bugs).Relates to #10708