[CI] Failure in DiscoveryDisruptionIT.testClusterFormingWithASlowNode #33251

@cbuescher

Build: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+matrix-java-periodic/ES_BUILD_JAVA=java10,ES_RUNTIME_JAVA=java8fips,nodes=virtual&&linux/268/console

Unfortunately, this does not reproduce locally with:

./gradlew :server:integTest \
  -Dtests.seed=C95E04440D83BEC0 \
  -Dtests.class=org.elasticsearch.discovery.DiscoveryDisruptionIT \
  -Dtests.method="testClusterFormingWithASlowNode" \
  -Dtests.security.manager=true \
  -Dtests.locale=sr-Latn \
  -Dtests.timezone=America/Danmarkshavn \
  -Dcompiler.java=10 \
  -Druntime.java=8FIPS \
  -Djavax.net.ssl.keyStorePassword=password \
  -Djavax.net.ssl.trustStorePassword=password

Errors:

There are lots of NodeNotConnectedExceptions in the log that look like this:

18:20:38   1> [2018-08-29T16:19:38,289][DEBUG][o.e.t.t.MockTransportService] [node_t0] Exception while sending request, handler likely already notified due to timeout
18:20:38   1> org.elasticsearch.transport.NodeNotConnectedException: [node_t1][127.0.0.1:30101] connection already closed
18:20:38   1> 	at org.elasticsearch.transport.TcpTransport$NodeChannels.sendRequest(TcpTransport.java:410) ~[main/:?]
18:20:38   1> 	at org.elasticsearch.test.transport.MockTransportService.lambda$addFailToSendNoConnectRule$3(MockTransportService.java:223) ~[framework-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
18:20:38   1> 	at org.elasticsearch.test.transport.StubbableTransport$WrappedConnection.sendRequest(StubbableTransport.java:209) ~[framework-7.0.0-alpha1-SNAPSHOT.jar:7.0.0-alpha1-SNAPSHOT]
18:20:38   1> 	at org.elasticsearch.transport.TransportService.sendRequestInternal(TransportService.java:658) ~[main/:?]
18:20:38   1> 	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:573) [main/:?]
18:20:38   1> 	at org.elasticsearch.transport.TransportService.sendRequest(TransportService.java:561) [main/:?]
18:20:38   1> 	at org.elasticsearch.discovery.zen.MasterFaultDetection$MasterPinger.run(MasterFaultDetection.java:225) [main/:?]
18:20:38   1> 	at org.elasticsearch.threadpool.ThreadPool$LoggingRunnable.run(ThreadPool.java:445) [main/:?]
18:20:38   1> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
18:20:38   1> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
18:20:38   1> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_181]
18:20:38   1> 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_181]
18:20:38   1> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
18:20:38   1> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
18:20:38   1> 	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]

After that, masters leave and apparently cannot be re-elected:

18:20:41   1> [2018-08-29T16:20:00,602][WARN ][o.e.t.d.TestZenDiscovery ] [node_t3] not enough master nodes discovered during pinging (found [[Candidate{node={node_t3}{cEU2AwQvT0GLx_oft7MEXQ}{7Ys6d08GQVimIOAHpOfGwQ}{127.0.0.1}
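
For context on the warning above, here is a minimal sketch of the kind of quorum check that I assume sits behind it: a node keeps pinging until the number of discovered master-eligible candidates reaches the configured minimum. The class, the method names, and the three-node cluster size are hypothetical and only for illustration; this is not the actual Zen discovery code, and the real node count for this test is not shown in the truncated log line.

```java
import java.util.Arrays;
import java.util.List;

public class QuorumSketch {

    // Hypothetical helper: a strict majority of the master-eligible nodes.
    public static int majority(int masterEligibleNodes) {
        return masterEligibleNodes / 2 + 1;
    }

    // Hypothetical check: an election can only proceed once at least
    // minimumMasterNodes candidates have been discovered.
    public static boolean hasEnoughCandidates(List<String> discovered, int minimumMasterNodes) {
        return discovered.size() >= minimumMasterNodes;
    }

    public static void main(String[] args) {
        // Assumed cluster size of 3 master-eligible nodes (illustrative only).
        int minimumMasterNodes = majority(3);

        // Only the local node was found, as in the warning logged by node_t3.
        List<String> discovered = Arrays.asList("node_t3");

        if (!hasEnoughCandidates(discovered, minimumMasterNodes)) {
            System.out.println("not enough candidates: found " + discovered.size()
                + ", needed " + minimumMasterNodes);
        }
    }
}
```

If the real check works along these lines, a node that can only see itself never reaches the minimum, which would fit a cluster that stays without a master and ends up reporting the "state not recovered / initialized" block.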

Finally:

ClusterBlockException[blocked by: [SERVICE_UNAVAILABLE/1/state not recovered / initialized];]
	at __randomizedtesting.SeedInfo.seed([C95E04440D83BEC0:256621FFA6166148]:0)
	at org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:166)
	at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.checkGlobalBlock(TransportIndicesStatsAction.java:71)
	at org.elasticsearch.action.admin.indices.stats.TransportIndicesStatsAction.checkGlobalBlock(TransportIndicesStatsAction.java:48)
	at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.<init>(TransportBroadcastByNodeAction.java:248)
	at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction.doExecute(TransportBroadcastByNodeAction.java:226)
	at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction.doExecute(TransportBroadcastByNodeAction.java:78)
	at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:143)
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:119)
	at org.elasticsearch.action.support.TransportAction.execute(TransportAction.java:62)
	at org.elasticsearch.client.node.NodeClient.executeLocally(NodeClient.java:83)
	at org.elasticsearch.client.node.NodeClient.doExecute(NodeClient.java:72)
	at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:388)
	at org.elasticsearch.client.FilterClient.doExecute(FilterClient.java:65)
	at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:388)
	at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:377)
	at org.elasticsearch.client.support.AbstractClient$IndicesAdmin.execute(AbstractClient.java:1230)
	at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:45)
	at org.elasticsearch.action.ActionRequestBuilder.get(ActionRequestBuilder.java:52)
	at org.elasticsearch.test.ESIntegTestCase.lambda$assertSeqNos$7(ESIntegTestCase.java:2331)
	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:836)
	at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:822)
	at org.elasticsearch.test.ESIntegTestCase.assertSeqNos(ESIntegTestCase.java:2330)
	at org.elasticsearch.discovery.AbstractDisruptionTestCase.beforeIndexDeletion(AbstractDisruptionTestCase.java:112)
	at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:588)
	at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2186)
	at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:965)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
	at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
	at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
	at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.lang.Thread.run(Thread.java:748)

Labels

:Distributed Coordination/Cluster Coordination (Cluster formation and cluster state publication, including cluster membership and fault detection.)
>test-failure (Triaged test failures from CI)
