Contributor

@bleskes bleskes commented Mar 23, 2017

The InternalEngine Index/Delete methods (plus satellites like version loading from Lucene) have accumulated some cruft over the years, making it hard to clearly see the code flows for various use cases (primary indexing, recovery, replicas, etc.). This PR refactors those methods for better readability. As a follow-up, we intend to extract certain parts of these methods into helper methods to improve things even more.

To support the refactoring I have considerably beefed up the versioning tests.

This PR is a spin-off from #23543, which made it clear this is needed.

Contributor Author

bleskes commented Mar 23, 2017

@jasontedor This is the same code you already reviewed but without the seq no logic. Since you already LGTMed it, I didn't ask for your review. Of course, feel free to review anyway if you want.

Contributor

@s1monw s1monw left a comment

I left some comments. I have to admit I am not sure it improves the readability of the engine. It rather feels like it makes it more complicated with more methods without clear naming.


this.versions = versions;
this.termsEnum = termsEnum;
Terms terms = fields.terms(UidFieldMapper.NAME);
Contributor

do you recall why this null check on fields was here? is there a chance that this is called on an empty reader?

Contributor Author

That's a good question. The class was introduced that way with no explanation. I thought that it had to do with BWC where we made transitions in how we store UIDs. I wonder when we can end up with an empty reader. I'll discuss this with @jpountz.

Contributor Author

I double checked with Adrien and this is OK (and is something he wanted to do for a long time).

/** Return null if id is not found. */
public DocIdAndVersion lookup(BytesRef id, Bits liveDocs, LeafReaderContext context) throws IOException {
public DocIdAndVersion lookupVersion(BytesRef id, Bits liveDocs, LeafReaderContext context) throws IOException {
assert context.reader().getCoreCacheKey().equals(readerKey);
Contributor

can we get messages for the asserts?

Contributor Author

added.

/** Reused for iteration (when the term exists) */
private PostingsEnum docsEnum;

private final Object readerKey;
Contributor

this reader key is only used for asserts. can we make sure it's null if asserts are not enabled?

Contributor Author

good one.

assert incrementVersionLookup();
VersionValue versionValue = versionMap.getUnderLock(op.uid());
if (versionValue == null) {
assert incrementIndexVersionLookup();
Contributor

message?

Contributor Author

this code is supposed to run only under assertions. I'll add comments.
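
The "runs only under assertions" idiom discussed here can be sketched as follows. This is a minimal illustration, not the engine's code: the class `AssertOnlyCounter` and its members are hypothetical names. The counting method always returns `true`, so it can serve as the condition of an `assert` and is simply never evaluated when the JVM runs without `-ea`.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: bookkeeping that only happens when assertions are enabled.
public class AssertOnlyCounter {
    // Only ever modified from inside an assert statement.
    private final AtomicLong lookups = new AtomicLong();

    // Always returns true so that it can be used as an assert condition.
    private boolean incrementLookup() {
        lookups.incrementAndGet();
        return true;
    }

    public void lookup() {
        // Without -ea the JVM skips this expression entirely, so the counter stays at 0.
        assert incrementLookup();
        // ... the actual lookup work would go here ...
    }

    public long lookupCount() {
        return lookups.get();
    }

    // Detects whether assertions are enabled, using an intentional side effect.
    public static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true;
        return enabled;
    }

    public static void main(String[] args) {
        AssertOnlyCounter counter = new AssertOnlyCounter();
        counter.lookup();
        counter.lookup();
        System.out.println("assertions=" + assertionsEnabled() + " count=" + counter.lookupCount());
    }
}
```

Because such a call has no failure condition, a message adds little; a comment explaining the side effect is the more useful documentation.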

op.type(),
op.id(),
op.versionType().explainConflictForWrites(currentVersion, expectedVersion, deleted));
enum LuceneOpStatus {
Contributor

the enum values make no sense in the context of the name. I wonder if we should call it OperationAge or something like this.

Contributor Author

The name comes from the following line of thought; maybe it helps explain it (I'm fine with any other names): operations on replicas (where this is relevant) are always stored in the translog. They are added to Lucene only if the document version in Lucene is older than the incoming one. The enum is supposed to reflect the status of the document in Lucene. Does this help? Any suggestion as to how to name it to reflect that it only relates to the Lucene index?

: Optional.empty();
} catch (IllegalArgumentException | VersionConflictEngineException ex) {
resultOnVersionConflict = Optional.of(new IndexResult(ex, currentVersion, index.seqNo()));
} else if (canOptimizeAddDocument && mayHaveBeenIndexedBefore(index) == false) {
Contributor

can we fold this into an else { and start again with if in there with a comment that we are now on a replica? it would read cleaner

Contributor Author

sure.

assert index.versionType().versionTypeForReplicationAndRecovery() == index.versionType() :
"resolving out of order delivery based on versioning but version type isn't fit for it. got ["
+ index.versionType() + "]";
final LuceneOpStatus luceneOpStatus = checkLuceneOpStatusBasedOnVersions(index);
Contributor

the semantics of this method are weird. I pass in an operation and it returns OLDER if the given operation has a higher version? I would have expected the opposite. These semantics also make the rest of this method hard to read, since it has to negate the return values in 2/3 of the cases. I think you should flip it.

Contributor Author

How about renaming the method and enum to lucene doc status? I'll change that and see if it makes more sense for you.

private IndexResult indexIntoLucene(Index index, long seqNo, long newVersion, boolean markDocAsCreated,
boolean useLuceneUpdateDocument)
throws IOException {
assertSequenceNumberBeforeIndexing(index.origin(), seqNo);
Contributor

only execute this if assertions are enabled?!

Contributor Author

oops. thx.

* non-document failure
*/
return new IndexResult(ex, currentVersion, index.seqNo());
return new IndexResult(ex, Versions.MATCH_ANY, index.seqNo());
Contributor

oh why did you change this to MATCH_ANY?... worth a comment?

Contributor Author

I don't have that information anymore here, and in the case of failure, we don't use it anyway. I'll comment.


private boolean isForceUpdateDocument(Index index) {
boolean forceUpdateDocument;
private boolean mayHaveBeenIndexedBefore(Index index) {
Contributor

maybe a javadoc for this method?

Contributor Author

added

@jasontedor
Member

This is the same code you already reviewed but without the seq no logic. Since you already LGTMed it, I didn't ask for your review.

I didn't review it, I saw it while it was a work in progress. I have not LGTMed the previous PR, I would like to review this PR.

@jasontedor jasontedor self-requested a review March 27, 2017 21:09
Contributor Author

bleskes commented Mar 28, 2017

I didn't review it, I saw it while it was a work in progress. I have not LGTMed the previous PR, I would like to review this PR.

@jasontedor I'm very sorry and have no idea what made me think you did. As I said before, your review is more than welcome (and it seems needed, as it didn't happen before).

Contributor Author

bleskes commented Mar 29, 2017

@s1monw I pushed a commit that addresses most of your feedback.

re

I have to admit I am not sure it improves the readability of the engine. It rather feels like it makes it more complicated with more methods without clear naming.

Fair enough. This is subjective. I think things will be clearer with the follow-up change we intended to make (move some of the long code into methods that return a struct). Maybe I should do this now within this change so we can see how it looks? @jasontedor indicated he would also prefer to see the end result rather than review this intermediate step (the diff is too big anyway).

Contributor

s1monw commented Mar 30, 2017

@jasontedor indicated he would also prefer to see the end result rather than review this intermediate step (the diff is too big anyway).

++

Contributor Author

bleskes commented Mar 30, 2017

@jasontedor @s1monw I pushed ahead and added helper methods. I think it looks much better but this is subjective. I will probably do another run and polish things more but I think it's ready for you. LMKWYT

@bleskes bleskes requested a review from s1monw March 30, 2017 15:50
Contributor

@s1monw s1monw left a comment

mostly nitpicks, looks great

throw new IllegalArgumentException("reader misses the [" + VersionFieldMapper.NAME +
"] field");
}
boolean assertionsOn = false;
Contributor

maybe:

Object readerKey = null;
assert (readerKey = reader.getCoreCacheKey()) != null;
this.readerKey = readerKey;

Contributor Author

much better. Thanks.
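
For readers unfamiliar with the idiom in the suggestion above, here is a self-contained sketch of it, which also shows an assertion carrying a message as requested earlier in the review. The class `AssertOnlyKey` and its methods are hypothetical and stand in for the lookup class under review; a plain `Object` stands in for the reader's core cache key.

```java
// Hypothetical sketch: a field that is only populated when assertions are enabled.
public class AssertOnlyKey {
    // Stays null unless the JVM runs with -ea, so no reference is retained in production.
    private final Object readerKey;

    public AssertOnlyKey(Object coreCacheKey) {
        Object key = null;
        // The assignment is a side effect of the assert condition and is skipped without -ea.
        assert (key = coreCacheKey) != null : "core cache key must not be null";
        this.readerKey = key;
    }

    public void lookup(Object currentKey) {
        // The message makes a failure self-explanatory in test logs.
        assert currentKey.equals(readerKey)
            : "lookup used with a different reader; expected [" + readerKey + "] but got [" + currentKey + "]";
        // ... the actual lookup work would go here ...
    }

    public Object readerKey() {
        return readerKey;
    }

    // Detects whether assertions are enabled, using the same side-effect trick.
    public static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true;
        return enabled;
    }
}
```

The local variable is needed because a `final` field cannot be assigned conditionally inside the `assert` expression and then again unconditionally.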

}
}

private static final class IndexingPlan {
Contributor

I am not sure about the name, maybe IndexingStrategy?

Contributor Author

Sure.
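
To make the discussion concrete, a plan/strategy object of this kind might look roughly like the sketch below. Only the name `processButSkipLucene` appears in the diff; the fields, the other factory methods, and the simplified parameter lists (no sequence number) are assumptions made for illustration and are not the merged code.

```java
// Hypothetical sketch of an immutable "what to do with this index operation" value.
public final class IndexingStrategy {
    public final boolean currentNotFoundOrDeleted; // no live copy of the document exists
    public final boolean useLuceneUpdateDocument;  // update (delete + add) rather than a plain add
    public final boolean indexIntoLucene;          // false means the operation only goes to the translog
    public final long versionForIndexing;

    private IndexingStrategy(boolean currentNotFoundOrDeleted, boolean useLuceneUpdateDocument,
                             boolean indexIntoLucene, long versionForIndexing) {
        this.currentNotFoundOrDeleted = currentNotFoundOrDeleted;
        this.useLuceneUpdateDocument = useLuceneUpdateDocument;
        this.indexIntoLucene = indexIntoLucene;
        this.versionForIndexing = versionForIndexing;
    }

    // Append-only fast path: the document cannot exist yet, so a plain add is enough.
    public static IndexingStrategy optimizedAppendOnly(long version) {
        return new IndexingStrategy(true, false, true, version);
    }

    // Regular path: use an update only when a live copy may already be in Lucene.
    public static IndexingStrategy processNormally(boolean currentNotFoundOrDeleted, long version) {
        return new IndexingStrategy(currentNotFoundOrDeleted, currentNotFoundOrDeleted == false, true, version);
    }

    // Stale operation on a replica: record it, but leave Lucene untouched.
    public static IndexingStrategy processButSkipLucene(boolean currentNotFoundOrDeleted, long version) {
        return new IndexingStrategy(currentNotFoundOrDeleted, false, false, version);
    }
}
```

Named factory methods keep the call sites readable: the branch that decides what to do states its intent, and the code that executes the decision only reads flags.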

final LuceneDocStatus luceneOpStatus = checkLuceneDocStatusBasedOnVersions(index);
if (luceneOpStatus == LuceneDocStatus.NEWER_OR_EQUAL) {
plan = IndexingPlan.processButSkipLucene(
luceneOpStatus == LuceneDocStatus.NOT_FOUND, index.seqNo(), index.version());
Contributor

luceneOpStatus == LuceneDocStatus.NOT_FOUND is false here?

Contributor Author

yes! IntelliJ agrees too.

// unlike the primary, replicas don't really care about creation status of documents
// this allows to ignore the case where a document was found in the live version maps in
// a delete state and return false for the created flag in favor of code simplicity
final LuceneDocStatus luceneOpStatus = checkLuceneDocStatusBasedOnVersions(index);
Contributor

so if this returns LuceneDocStatus.NEWER_OR_EQUAL then the existing doc is newer or equal and not the given doc? that is very confusing, I mentioned this before I think. I'd never expect that. The opposite is intuitive?

Contributor Author

Ok. I flipped it around.
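
One way to read the "flipped" semantics is that the result describes the incoming operation relative to the document in Lucene, so callers no longer need to negate it. The sketch below illustrates that reading; the enum constants, the `NOT_FOUND` sentinel, and the method names are hypothetical. It also assumes plain monotonically increasing versions, which the follow-up commit referenced later in this thread notes is not valid for `FORCE` and `VERSION_GTE`.

```java
// Hypothetical sketch: status expressed from the point of view of the incoming operation.
public class DocStatusSketch {
    public enum OpVsLuceneDocStatus { OP_NEWER, OP_STALE_OR_EQUAL, LUCENE_DOC_NOT_FOUND }

    // Assumed sentinel meaning "no document with this id in Lucene".
    public static final long NOT_FOUND = -1L;

    public static OpVsLuceneDocStatus compare(long opVersion, long docVersionInLucene) {
        if (docVersionInLucene == NOT_FOUND) {
            return OpVsLuceneDocStatus.LUCENE_DOC_NOT_FOUND;
        }
        // Simplification: a higher version always wins.
        return opVersion > docVersionInLucene
            ? OpVsLuceneDocStatus.OP_NEWER
            : OpVsLuceneDocStatus.OP_STALE_OR_EQUAL;
    }

    // On a replica, a stale operation is still processed but does not touch Lucene.
    public static boolean shouldIndexIntoLucene(OpVsLuceneDocStatus status) {
        return status != OpVsLuceneDocStatus.OP_STALE_OR_EQUAL;
    }

    public static void main(String[] args) {
        System.out.println(compare(5L, 3L));
        System.out.println(compare(3L, 3L));
        System.out.println(compare(1L, NOT_FOUND));
    }
}
```

With this orientation, the stale branch can never also be the not-found case, which is the redundancy pointed out in the earlier comment about `luceneOpStatus == LuceneDocStatus.NOT_FOUND`.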

Contributor Author

bleskes commented Mar 31, 2017

@s1monw I addressed your latest feedback (thx). Can you take another look please?

@bleskes bleskes requested a review from s1monw March 31, 2017 14:31
Contributor

@s1monw s1monw left a comment

LGTM thanks for the iterations

Member

@jasontedor jasontedor left a comment

LGTM.

@bleskes bleskes merged commit 75b4f40 into elastic:master Apr 5, 2017
@bleskes bleskes deleted the engine_clearer_flow branch April 5, 2017 12:43
Contributor Author

bleskes commented Apr 5, 2017

Thank you @s1monw @jasontedor. This was a tough one.

Contributor Author

bleskes commented Apr 5, 2017

PS. I will let this bake for a day or two before back porting.

bleskes added a commit to bleskes/elasticsearch that referenced this pull request Apr 9, 2017
The refactoring in elastic#23711 hardcoded version logic for replica to assume monotonic versions. Sadly that's
wrong for `FORCE` and `VERSION_GTE`. Instead we should use the methods in VersionType to detect conflicts.

Note - once replicas use sequence numbers for out of order delivery, this logic goes away.
bleskes added a commit that referenced this pull request Apr 9, 2017
The refactoring in #23711 hardcoded version logic for replica to assume monotonic versions. Sadly that's wrong for `FORCE` and `VERSION_GTE`. Instead we should use the methods in VersionType to detect conflicts.

Note - once replicas use sequence numbers for out of order delivery, this logic goes away.
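
The point of this commit message can be illustrated with a simplified model in which each version type decides for itself what counts as a conflict, instead of the caller hardcoding a "higher version wins" comparison. The enum below is a hypothetical sketch, not Elasticsearch's `VersionType`: the constant names, the `NOT_FOUND` sentinel, and the `isConflict` signature are assumptions, and the three constants only loosely model strictly increasing, greater-or-equal, and force-style behaviour.

```java
// Hypothetical sketch: conflict detection delegated to the version type.
public enum SketchVersionType {
    // The incoming version must be strictly greater than the current one.
    STRICTLY_INCREASING {
        @Override
        public boolean isConflict(long currentVersion, long incomingVersion) {
            return currentVersion != NOT_FOUND && currentVersion >= incomingVersion;
        }
    },
    // The incoming version may equal the current one.
    GREATER_OR_EQUAL {
        @Override
        public boolean isConflict(long currentVersion, long incomingVersion) {
            return currentVersion != NOT_FOUND && currentVersion > incomingVersion;
        }
    },
    // The incoming operation always wins, even with a lower version.
    FORCE {
        @Override
        public boolean isConflict(long currentVersion, long incomingVersion) {
            return false;
        }
    };

    // Assumed sentinel meaning "no current version exists".
    public static final long NOT_FOUND = -1L;

    public abstract boolean isConflict(long currentVersion, long incomingVersion);
}
```

In this model a caller that assumes monotonic versions would wrongly drop an equal-version write under `GREATER_OR_EQUAL` and any lower-version write under `FORCE`, which is the class of bug the commit message describes.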
bleskes added a commit that referenced this pull request Apr 10, 2017
The InternalEngine Index/Delete methods (plus satellites like version loading from Lucene) have accumulated some cruft over the years, making it hard to clearly see the code flows for various use cases (primary indexing, recovery, replicas, etc.). This PR refactors those methods for better readability. The methods are broken up into smaller sub methods, albeit at the price of less code reuse.

To support the refactoring I have considerably beefed up the versioning tests.
bleskes added a commit that referenced this pull request Apr 10, 2017
The refactoring in #23711 hardcoded version logic for replica to assume monotonic versions. Sadly that's wrong for `FORCE` and `VERSION_GTE`. Instead we should use the methods in VersionType to detect conflicts.
@clintongormley clintongormley added :Distributed Indexing/Distributed, :Distributed Indexing/Engine and removed :Engine, :Distributed Indexing/Distributed labels Feb 13, 2018

Labels

:Distributed Indexing/Engine, >non-issue, v5.4.0, v6.0.0-alpha1

4 participants