-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Closed
Labels
:Distributed Indexing/EngineAnything around managing Lucene and the Translog in an open shard.Anything around managing Lucene and the Translog in an open shard.>test-failureTriaged test failures from CITriaged test failures from CI
Description
A test failure occurred in https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+7.x+intake/99/console, however, the root cause was that one of the nodes in the cluster suffered a fatal exception:
[2019-02-14T11:46:31,144][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [node-1] fatal error in thread [elasticsearch[node-1][generic][T#2]], exiting
java.lang.AssertionError: unexpected failure while replicating translog entry: org.apache.lucene.store.AlreadyClosedException: this IndexWriter is closed
at org.elasticsearch.indices.recovery.RecoveryTarget.lambda$indexTranslogOperations$2(RecoveryTarget.java:362) ~[elasticsearch-7.1.0-SNAPSHOT.jar:7.1.0-SNAPSHOT]
at org.elasticsearch.action.ActionListener.completeWith(ActionListener.java:191) ~[elasticsearch-7.1.0-SNAPSHOT.jar:7.1.0-SNAPSHOT]
at org.elasticsearch.indices.recovery.RecoveryTarget.indexTranslogOperations(RecoveryTarget.java:333) ~[elasticsearch-7.1.0-SNAPSHOT.jar:7.1.0-SNAPSHOT]
at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:521) ~[elasticsearch-7.1.0-SNAPSHOT.jar:7.1.0-SNAPSHOT]
at org.elasticsearch.indices.recovery.PeerRecoveryTargetService$TranslogOperationsRequestHandler.messageReceived(PeerRecoveryTargetService.java:480) ~[elasticsearch-7.1.0-SNAPSHOT.jar:7.1.0-SNAPSHOT]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) ~[elasticsearch-7.1.0-SNAPSHOT.jar:7.1.0-SNAPSHOT]
at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1076) ~[elasticsearch-7.1.0-SNAPSHOT.jar:7.1.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) ~[elasticsearch-7.1.0-SNAPSHOT.jar:7.1.0-SNAPSHOT]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.1.0-SNAPSHOT.jar:7.1.0-SNAPSHOT]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
The test that was running when this happened was:
./gradlew :qa:smoke-test-multinode:integTestRunner \
-Dtests.seed=666DFE2D5892C30E \
-Dtests.class=org.elasticsearch.smoketest.SmokeTestMultiNodeClientYamlTestSuiteIT \
-Dtests.method="test {yaml=smoke_test_multinode/10_basic/cluster health basic test, wait for both nodes to join}" \
-Dtests.security.manager=true \
-Dtests.locale=mk \
-Dtests.timezone=Europe/Malta \
-Dcompiler.java=11 \
-Druntime.java=8
Since an almost identical test immediately before succeeded I doubt there is anything wrong with that test. (The "stash dump on failure" in the Jenkins log is very confusing too as it contains the result of the previous successful test.)
cluster_logs.zip contains the logs from the nodes in the test cluster that died with the fatal error.
Metadata
Metadata
Assignees
Labels
:Distributed Indexing/EngineAnything around managing Lucene and the Translog in an open shard.Anything around managing Lucene and the Translog in an open shard.>test-failureTriaged test failures from CITriaged test failures from CI