-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Refactor InternalEngine's index/delete flow for better clarity #23711
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@jasontedor This is the same code you already reviewed but without the seq no logic. Since you already LGTMed so I didn't ask for your review. Of course, feel free to review anyway if you want. |
s1monw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments. I have to admit I am not sure it improves the readability of the engine. It rather feels like it make it more complicated with more methods without clear naming.
|
|
||
| this.versions = versions; | ||
| this.termsEnum = termsEnum; | ||
| Terms terms = fields.terms(UidFieldMapper.NAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you recall why this null check on fields was here? is there a chance that this is called on an empty reader?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question. The class was introduced that way with no explanation. I thought that it had to do with BWC where we made transitions into how we store UIDs. I wonder when we can end up with an empty reader. I'll discuss this with @jpountz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I double checked with Adrien and this is OK (and is something he wanted to do for a long time).
| /** Return null if id is not found. */ | ||
| public DocIdAndVersion lookup(BytesRef id, Bits liveDocs, LeafReaderContext context) throws IOException { | ||
| public DocIdAndVersion lookupVersion(BytesRef id, Bits liveDocs, LeafReaderContext context) throws IOException { | ||
| assert context.reader().getCoreCacheKey().equals(readerKey); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we get messages for the asserts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added.
| /** Reused for iteration (when the term exists) */ | ||
| private PostingsEnum docsEnum; | ||
|
|
||
| private final Object readerKey; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this reader key is only used for asserts. can we make sure it's null if asserts are not enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good one.
| assert incrementVersionLookup(); | ||
| VersionValue versionValue = versionMap.getUnderLock(op.uid()); | ||
| if (versionValue == null) { | ||
| assert incrementIndexVersionLookup(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this code is supposed to run only under assertions. I'll add comments.
| op.type(), | ||
| op.id(), | ||
| op.versionType().explainConflictForWrites(currentVersion, expectedVersion, deleted)); | ||
| enum LuceneOpStatus { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the enum values make no sense in the context of the name. I wonder if we should call it OperationAge or something like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name comes from the following line of thought - maybe it helps explain it (I'm fine with any other names) - operations on replicas (where this is relevant) are always stored in the translog. They are added to lucene only if the document version in lucene is older than the incoming one. The enum is supposed to reflect the status of the document in lucene. Does this help? any suggestion as to how to name it to reflect it only relates the lucene index?
| : Optional.empty(); | ||
| } catch (IllegalArgumentException | VersionConflictEngineException ex) { | ||
| resultOnVersionConflict = Optional.of(new IndexResult(ex, currentVersion, index.seqNo())); | ||
| } else if (canOptimizeAddDocument && mayHaveBeenIndexedBefore(index) == false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we fold this into an else { and start again with if in there with a comment that we are now on a replica? it would read cleaner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure.
| assert index.versionType().versionTypeForReplicationAndRecovery() == index.versionType() : | ||
| "resolving out of order delivery based on versioning but version type isn't fit for it. got [" | ||
| + index.versionType() + "]"; | ||
| final LuceneOpStatus luceneOpStatus = checkLuceneOpStatusBasedOnVersions(index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the semantics of this method are weird. I pass in an operation and it returns OLDER if the given operation has a higher version? I would have expected the opposite. This semantics also make the rest of this method hard to read since it has to negate the return values in 2/3 of the cases. I think you should flip it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about renaming the method and enum to lucene doc status? I'll change that and see if it makes more sense for you.
| private IndexResult indexIntoLucene(Index index, long seqNo, long newVersion, boolean markDocAsCreated, | ||
| boolean useLuceneUpdateDocument) | ||
| throws IOException { | ||
| assertSequenceNumberBeforeIndexing(index.origin(), seqNo); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only execute this if assertions are enabled?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops. thx.
| * non-document failure | ||
| */ | ||
| return new IndexResult(ex, currentVersion, index.seqNo()); | ||
| return new IndexResult(ex, Versions.MATCH_ANY, index.seqNo()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh why did you change this to MATCH_ANY?... worth a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have that information anymore here, and in the case of failure, we don't use it anyway. I'll comment.
|
|
||
| private boolean isForceUpdateDocument(Index index) { | ||
| boolean forceUpdateDocument; | ||
| private boolean mayHaveBeenIndexedBefore(Index index) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a javadoc for this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
I didn't review it, I saw it while it was a work in progress. I have not LGTMed the previous PR, I would like to review this PR. |
@jasontedor I'm very sorry and have no idea what made me think you did. As said before, your review is more than welcome (and it seems needed as it didn't do it before). |
|
@s1monw I pushed a commit that address most of your feedback. re
Fair enough. This is subjective. I think things will be clearer with the follow up change we intended to make (move some the long code into methods that return a struct). Maybe I should do this now within this change so we can see how it looks? @jasontedor indicated he would also prefer to see the end result rather than review this intermediate step (the diff is too big anyway). |
++ |
|
@jasontedor @s1monw I pushed ahead and added helper methods. I think it looks much better but this is subjective. I will probably do another run and polish things more but I think it's ready for you. LMKWYT |
s1monw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly nit picks looks great
| throw new IllegalArgumentException("reader misses the [" + VersionFieldMapper.NAME + | ||
| "] field"); | ||
| } | ||
| boolean assertionsOn = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe:
Object readerKey = null;
assert (readerKey = reader.getCoreCacheKey()) != null;
this.readerKey = readerKey;There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much better. Thanks.
| } | ||
| } | ||
|
|
||
| private static final class IndexingPlan { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about the name, maybe IndexingStrategy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
| final LuceneDocStatus luceneOpStatus = checkLuceneDocStatusBasedOnVersions(index); | ||
| if (luceneOpStatus == LuceneDocStatus.NEWER_OR_EQUAL) { | ||
| plan = IndexingPlan.processButSkipLucene( | ||
| luceneOpStatus == LuceneDocStatus.NOT_FOUND, index.seqNo(), index.version()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
luceneOpStatus == LuceneDocStatus.NOT_FOUND is false here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes! IntelliJ agrees too.
| // unlike the primary, replicas don't really care to about creation status of documents | ||
| // this allows to ignore the case where a document was found in the live version maps in | ||
| // a delete state and return false for the created flag in favor of code simplicity | ||
| final LuceneDocStatus luceneOpStatus = checkLuceneDocStatusBasedOnVersions(index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so if this returns LuceneDocStatus.NEWER_OR_EQUAL then the existing doc is newer or equal and not the given doc? that is very confusing, I mentioned this before I think. I'd never expect that. The opposite is intuitive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I flipped it around.
|
@s1monw I addressed your latest feedback (thx). Can you take another look please? |
s1monw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks for the iterations
jasontedor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
|
Thank you @s1monw @jasontedor. This was a tough one. |
|
PS. I will let this bake for a day or two before back porting. |
The refactoring in elastic#23711 hardcoded version logic for replica to assume monotonic versions. Sadly that's wrong for `FORCE` and `VERSION_GTE`. Instead we should use the methods in VersionType to detect conflicts. Note - once replicas use sequence numbers for out of order delivery, this logic goes away.
The refactoring in #23711 hardcoded version logic for replica to assume monotonic versions. Sadly that's wrong for `FORCE` and `VERSION_GTE`. Instead we should use the methods in VersionType to detect conflicts. Note - once replicas use sequence numbers for out of order delivery, this logic goes away.
The InternalEngine Index/Delete methods (plus satellites like version loading from Lucene) have accumulated some cruft over the years making it hard to clearly the code flows for various use cases (primary indexing/recovery/replicas etc). This PR refactors those methods for better readability. The methods are broken up into smaller sub methods, albeit at the price of less code I reused. To support the refactoring I have considerably beefed up the versioning tests.
The refactoring in #23711 hardcoded version logic for replica to assume monotonic versions. Sadly that's wrong for `FORCE` and `VERSION_GTE`. Instead we should use the methods in VersionType to detect conflicts.
The InternalEngine Index/Delete methods (plus satellites like version loading from Lucene) have accumulated some cruft over the years making it hard to clearly the code flows for various use cases (primary indexing/recovery/replicas etc). This PR refactors those methods for better readability. As a follow up we intend to take certain parts of these method and extract them to another help methods to improve things even more. This will be done as a follow up.
To support the refactoring I have considerably beefed up the versioning tests.
This PR is a spin-off from #23543 , which made it clear this is needed.