CloseWhileRelocatingShardsIT.testCloseWhileRelocatingShards failure on master due to AssertionError in production code #39588

Example CI links:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+matrix-java-periodic/ES_BUILD_JAVA=java11,ES_RUNTIME_JAVA=zulu8,nodes=immutable&&linux&&docker/272/console
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/2323/console

I believe this may be related to replicated closed indices, as this particular failure first appeared (according to build stats) on the `replicated-closed-indices` branch on Feb. 8.

The assertion appears to come from [this check](https://github.com/elastic/elasticsearch/blob/11b58eb4c0bbf0e478aa495e395386a717b0d41d/server/src/main/java/org/elasticsearch/index/engine/ReadOnlyEngine.java#L115-L118).
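For readers who don't want to follow the link, here is a minimal sketch of the invariant that check seems to enforce. It is inferred only from the method name `assertMaxSeqNoEqualsToGlobalCheckpoint` and the assertion message in the stack trace; it is not the Elasticsearch source. The class `SeqNoInvariantSketch` and the helper `describeMismatch` are hypothetical names. It is also an assumption that the first bracketed value in the message is the max sequence number and the second is the global checkpoint.

```java
// Hypothetical, self-contained sketch; NOT the actual ReadOnlyEngine code.
public class SeqNoInvariantSketch {

    /**
     * Returns null when the two values agree, otherwise a message shaped like
     * the one in the reported AssertionError.
     * Assumption: the first bracket holds the max seq. no. and the second the
     * global checkpoint.
     */
    public static String describeMismatch(long maxSeqNo, long globalCheckpoint) {
        if (maxSeqNo == globalCheckpoint) {
            return null; // invariant holds, nothing to report
        }
        return "max seq. no. [" + maxSeqNo + "] does not match [" + globalCheckpoint + "]";
    }

    public static void main(String[] args) {
        // Matching values: the invariant holds.
        System.out.println(describeMismatch(531L, 531L));
        // The pair of values seen in this failure.
        System.out.println(describeMismatch(-1L, 531L));
    }
}
```

If that reading is right, a max sequence number of -1 would mean the engine was opened on a store that reported no operations while the other value was 531. That would be consistent with a shard being closed while a relocation was still in flight, but this is speculation rather than a diagnosis.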

Reproduce line (the failure does not reproduce locally):
```
./gradlew :server:integTest \
  -Dtests.seed=BB15E4FDA1CABDD9 \
  -Dtests.class=org.elasticsearch.indices.state.CloseWhileRelocatingShardsIT \
  -Dtests.method="testCloseWhileRelocatingShards" \
  -Dtests.security.manager=true \
  -Dtests.locale=de-DE \
  -Dtests.timezone=America/Lower_Princes \
  -Dcompiler.java=11 \
  -Druntime.java=8
```

Stack trace:
```
com.carrotsearch.randomizedtesting.UncaughtExceptionError: Captured an uncaught exception in thread: Thread[id=5128, name=elasticsearch[node_sd2][generic][T#1], state=RUNNABLE, group=TGRP-CloseWhileRelocatingShardsIT]
	at __randomizedtesting.SeedInfo.seed([B020647676D038FA:9FAB7AE3594E6E40]:0)
Caused by: java.lang.AssertionError: max seq. no. [-1] does not match [531]
	at __randomizedtesting.SeedInfo.seed([B020647676D038FA]:0)
	at org.elasticsearch.index.engine.ReadOnlyEngine.assertMaxSeqNoEqualsToGlobalCheckpoint(ReadOnlyEngine.java:142)
	at org.elasticsearch.index.engine.ReadOnlyEngine.<init>(ReadOnlyEngine.java:116)
	at org.elasticsearch.index.engine.NoOpEngine.<init>(NoOpEngine.java:40)
	at org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:1442)
	at org.elasticsearch.index.shard.IndexShard.openEngineAndRecoverFromTranslog(IndexShard.java:1395)
	at org.elasticsearch.index.shard.StoreRecovery.internalRecoverFromStore(StoreRecovery.java:424)
	at org.elasticsearch.index.shard.StoreRecovery.lambda$recoverFromStore$0(StoreRecovery.java:95)
	at org.elasticsearch.index.shard.StoreRecovery.executeRecovery(StoreRecovery.java:302)
	at org.elasticsearch.index.shard.StoreRecovery.recoverFromStore(StoreRecovery.java:93)
	at org.elasticsearch.index.shard.IndexShard.recoverFromStore(IndexShard.java:1681)
	at org.elasticsearch.index.shard.IndexShard.lambda$startRecovery$9(IndexShard.java:2318)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:681)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
```

I'm going to mute this test on master, as it is failing a few times per day and looks like a legitimate failure.
