Skip to content

Conversation

@bleskes
Copy link
Contributor

@bleskes bleskes commented Dec 11, 2018

This PR add support to engine operations for resolving and verifying the sequence number and primary term of the last modification to a document before performing an operation. This is infrastructure to move our optimistic concurrency control API to use sequence numbers instead of internal versioning.

Relates #36148
Relates #10708

PS there are still discussions about the parameter naming. I'll rename these once we have settled on something.

@bleskes bleskes added >enhancement v7.0.0 :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. v6.6.0 labels Dec 11, 2018
@bleskes bleskes requested review from dnhatn, s1monw and ywelsch December 11, 2018 09:04
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

Copy link
Contributor

@s1monw s1monw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - I am not happy with the naming. I think cas should not be in there. Can we just name them expectedSeqId expectedPrimaryTerm for now? That's what for instance Integer.compareAndSet uses as it's parameter names.

final long seqNo;
final long term;
if (loadSeqNo) {
NumericDocValues seqNos = context.reader().getNumericDocValues(SeqNoFieldMapper.NAME);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should clean this up in master once backported. can you put a comment in there?

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@bleskes
Copy link
Contributor Author

bleskes commented Dec 12, 2018

Thanks @s1monw and @dnhatn. I incorporated your suggestion.

@bleskes
Copy link
Contributor Author

bleskes commented Dec 12, 2018

run gradle build tests 1

@bleskes
Copy link
Contributor Author

bleskes commented Dec 12, 2018

The build failure was regarding refresh listeners. I don't think it's a related but probably signals a concurrency bug. I tried to reproduce this without success (I run ~300 time). Capturing the info here in case it's needed.

15:42:40 FAILURE 18.0s J0 | RefreshListenersTests.testConcurrentRefresh <<< FAILURES!
15:42:40    > Throwable #1: java.lang.AssertionError
15:42:40    > 	at __randomizedtesting.SeedInfo.seed([2848E94A583B5A6E:EF0293D9D997F804]:0)
15:42:40    > 	at org.elasticsearch.index.shard.RefreshListenersTests.lambda$testConcurrentRefresh$8(RefreshListenersTests.java:282)
15:42:40    > 	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:847)
15:42:40    > 	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:821)
15:42:40    > 	at org.elasticsearch.index.shard.RefreshListenersTests.testConcurrentRefresh(RefreshListenersTests.java:282)
./gradlew :server:test -Dtests.seed=2848E94A583B5A6E -Dtests.class=org.elasticsearch.index.shard.RefreshListenersTests -Dtests.method="testConcurrentRefresh" -Dtests.security.manager=true -Dtests.locale=pt-PT -Dtests.timezone=UCT -Dcompiler.java=11 -Druntime.java=8

@ywelsch
Copy link
Contributor

ywelsch commented Dec 12, 2018

The build failure was regarding refresh listeners. I don't think it's a related

Similar assertion recently failed on master (#24418 (comment)), so looks to be unrelated to your changes here

@bleskes
Copy link
Contributor Author

bleskes commented Dec 12, 2018

run gradle build tests 1

@bleskes bleskes merged commit f6b5d7e into elastic:master Dec 13, 2018
@bleskes bleskes deleted the cas_seq_no_engine branch December 13, 2018 07:08
bleskes added a commit to bleskes/elasticsearch that referenced this pull request Dec 13, 2018
…Engine (elastic#36467)

This commit add support to engine operations for resolving and verifying the sequence number and
primary term of the last modification to a document before performing an operation. This is
infrastructure to move our (optimistic concurrency control)[http://en.wikipedia.org/wiki/Optimistic_concurrency_control] API to use sequence numbers instead of internal versioning.

Relates elastic#36148
Relates elastic#10708
bleskes added a commit that referenced this pull request Dec 13, 2018
…Engine (#36467)

This commit add support to engine operations for resolving and verifying the sequence number and
primary term of the last modification to a document before performing an operation. This is
infrastructure to move our (optimistic concurrency control)[http://en.wikipedia.org/wiki/Optimistic_concurrency_control] API to use sequence numbers instead of internal versioning.

Relates #36148
Relates #10708
tlrx added a commit that referenced this pull request Jan 9, 2019
tlrx added a commit that referenced this pull request Jan 10, 2019
tlrx added a commit that referenced this pull request Jan 11, 2019
tlrx added a commit that referenced this pull request Jan 15, 2019
tlrx added a commit that referenced this pull request Jan 22, 2019
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jan 23, 2019
The assertion `assertOpsOnPrimary` does not store seq_no and primary
term of successful deletes to the `lastOpSeqNo` and `lastOpTerm`. This
leads to failures of the subsequence CAS deletes or indexes with seq_no
and term. Moreover, this assertion trips a translog assertion because it
bumps the primary term of some operations but not the primary term of
the engine.

Relates elastic#36467
Closes elastic#37684
dnhatn added a commit that referenced this pull request Jan 24, 2019
The assertion `assertOpsOnPrimary` does not store seq_no and primary
term of successful deletes to the `lastOpSeqNo` and `lastOpTerm`. This
leads to failures of the subsequence CAS deletes or indexes with seq_no
and term. Moreover, this assertion trips a translog assertion because it
bumps the primary term of some operations but not the primary term of
the engine.

Relates #36467
Closes #37684
dnhatn added a commit that referenced this pull request Jan 28, 2019
The assertion `assertOpsOnPrimary` does not store seq_no and primary
term of successful deletes to the `lastOpSeqNo` and `lastOpTerm`. This
leads to failures of the subsequence CAS deletes or indexes with seq_no
and term. Moreover, this assertion trips a translog assertion because it
bumps the primary term of some operations but not the primary term of
the engine.

Relates #36467
Closes #37684
tlrx added a commit that referenced this pull request Jan 29, 2019
tlrx added a commit that referenced this pull request Jan 29, 2019
tlrx added a commit that referenced this pull request Jan 30, 2019
talevy pushed a commit to talevy/elasticsearch that referenced this pull request Feb 15, 2019
The assertion `assertOpsOnPrimary` does not store seq_no and primary
term of successful deletes to the `lastOpSeqNo` and `lastOpTerm`. This
leads to failures of the subsequence CAS deletes or indexes with seq_no
and term. Moreover, this assertion trips a translog assertion because it
bumps the primary term of some operations but not the primary term of
the engine.

Relates elastic#36467
Closes elastic#37684
talevy added a commit that referenced this pull request Feb 16, 2019
The assertion `assertOpsOnPrimary` does not store seq_no and primary
term of successful deletes to the `lastOpSeqNo` and `lastOpTerm`. This
leads to failures of the subsequence CAS deletes or indexes with seq_no
and term. Moreover, this assertion trips a translog assertion because it
bumps the primary term of some operations but not the primary term of
the engine.

Relates #36467
Closes #37684
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Mar 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v6.6.0 v7.0.0-beta1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants