Closed
Labels: :Distributed Coordination/Network, >test, >test-failure
Description
This appears to be a very rare failure (roughly once a month).
HEARTBEAT J0 PID([email protected]): 2017-05-02T20:05:14, stalled for 10.7s at: Netty4TransportMultiPortIntegrationIT.testThatTransportClientCanConnect
HEARTBEAT J0 PID([email protected]): 2017-05-02T20:05:24, stalled for 20.7s at: Netty4TransportMultiPortIntegrationIT.testThatTransportClientCanConnect
HEARTBEAT J0 PID([email protected]): 2017-05-02T20:05:34, stalled for 30.7s at: Netty4TransportMultiPortIntegrationIT.testThatTransportClientCanConnect
HEARTBEAT J0 PID([email protected]): 2017-05-02T20:05:44, stalled for 40.7s at: Netty4TransportMultiPortIntegrationIT.testThatTransportClientCanConnect
HEARTBEAT J0 PID([email protected]): 2017-05-02T20:05:54, stalled for 50.7s at: Netty4TransportMultiPortIntegrationIT.testThatTransportClientCanConnect
Suite: org.elasticsearch.transport.netty4.Netty4TransportMultiPortIntegrationIT
1> [2017-05-03T04:05:04,043][INFO ][o.e.t.n.Netty4TransportMultiPortIntegrationIT] [testThatTransportClientCanConnect]: before test
1> [2017-05-03T04:05:04,045][INFO ][o.e.t.n.Netty4TransportMultiPortIntegrationIT] [Netty4TransportMultiPortIntegrationIT#testThatTransportClientCanConnect]: setting up test
1> [2017-05-03T04:05:04,047][INFO ][o.e.t.InternalTestCluster] Setup InternalTestCluster [SUITE-CHILD_VM=[0]-CLUSTER_SEED=[-8360166899008743675]-HASH=[7A680CFBE5624]-cluster] with seed [8BFAB87FD99AD705] using [0] dedicated masters, [1] (data) nodes and [0] coord only nodes (min_master_nodes are [auto-managed])
1> [2017-05-03T04:05:04,052][INFO ][o.e.n.Node ] [node_s0] initializing ...
1> [2017-05-03T04:05:04,056][INFO ][o.e.e.NodeEnvironment ] [node_s0] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [368.4gb], net total_space [492gb], spins? [unknown], types [rootfs]
1> [2017-05-03T04:05:04,056][INFO ][o.e.e.NodeEnvironment ] [node_s0] heap size [491mb], compressed ordinary object pointers [true]
1> [2017-05-03T04:05:04,057][INFO ][o.e.n.Node ] [node_s0] node name [node_s0], node ID [r1bYfe5XTdesisdNlVnssA]
1> [2017-05-03T04:05:04,057][INFO ][o.e.n.Node ] [node_s0] version[6.0.0-alpha1-SNAPSHOT], pid[2364], build[6fcd24d/2017-05-02T19:35:56.330Z], OS[Linux/3.16.0-4-amd64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_121/25.121-b13]
1> [2017-05-03T04:05:04,057][WARN ][o.e.n.Node ] [node_s0] version [6.0.0-alpha1-SNAPSHOT] is a pre-release version of Elasticsearch and is not suitable for production
1> [2017-05-03T04:05:04,058][INFO ][o.e.p.PluginsService ] [node_s0] no modules loaded
1> [2017-05-03T04:05:04,058][INFO ][o.e.p.PluginsService ] [node_s0] loaded plugin [org.elasticsearch.index.MockEngineFactoryPlugin]
1> [2017-05-03T04:05:04,058][INFO ][o.e.p.PluginsService ] [node_s0] loaded plugin [org.elasticsearch.node.NodeMocksPlugin]
1> [2017-05-03T04:05:04,058][INFO ][o.e.p.PluginsService ] [node_s0] loaded plugin [org.elasticsearch.test.ESIntegTestCase$TestSeedPlugin]
1> [2017-05-03T04:05:04,058][INFO ][o.e.p.PluginsService ] [node_s0] loaded plugin [org.elasticsearch.test.discovery.TestZenDiscovery$TestPlugin]
1> [2017-05-03T04:05:04,058][INFO ][o.e.p.PluginsService ] [node_s0] loaded plugin [org.elasticsearch.transport.Netty4Plugin]
1> [2017-05-03T04:05:04,083][INFO ][o.e.d.DiscoveryModule ] [node_s0] using discovery type [test-zen]
1> [2017-05-03T04:05:04,147][INFO ][o.e.n.Node ] [node_s0] initialized
1> [2017-05-03T04:05:04,148][INFO ][o.e.n.Node ] [node_s0] starting ...
1> [2017-05-03T04:05:04,159][INFO ][o.e.t.TransportService ] [node_s0] publish_address {127.0.0.1:9420}, bound_addresses {127.0.0.1:9420}
1> [2017-05-03T04:05:04,159][INFO ][o.e.t.TransportService ] [node_s0] profile [client1]: publish_address {127.0.0.7:4321}, bound_addresses {127.0.0.1:49842}
1> [2017-05-03T04:05:04,161][INFO ][o.e.t.d.MockZenPing ] [node_s0] pinging using mock zen ping
1> [2017-05-03T04:05:04,168][INFO ][o.e.c.s.MasterService ] [node_s0] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {node_s0}{r1bYfe5XTdesisdNlVnssA}{fG9GAd_uRBKQKSnUH8ubUw}{127.0.0.1}{127.0.0.1:9420}
1> [2017-05-03T04:05:04,168][INFO ][o.e.c.s.ClusterApplierService] [node_s0] new_master {node_s0}{r1bYfe5XTdesisdNlVnssA}{fG9GAd_uRBKQKSnUH8ubUw}{127.0.0.1}{127.0.0.1:9420}, reason: apply cluster state (from master [master {node_s0}{r1bYfe5XTdesisdNlVnssA}{fG9GAd_uRBKQKSnUH8ubUw}{127.0.0.1}{127.0.0.1:9420} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
1> [2017-05-03T04:05:04,170][INFO ][o.e.n.Node ] [node_s0] started
1> [2017-05-03T04:05:04,172][INFO ][o.e.p.PluginsService ] [transport_client_node_s0] no modules loaded
1> [2017-05-03T04:05:04,172][INFO ][o.e.p.PluginsService ] [transport_client_node_s0] loaded plugin [org.elasticsearch.transport.MockTcpTransportPlugin]
1> [2017-05-03T04:05:04,172][INFO ][o.e.p.PluginsService ] [transport_client_node_s0] loaded plugin [org.elasticsearch.transport.Netty4Plugin]
1> [2017-05-03T04:05:04,185][INFO ][o.e.g.GatewayService ] [node_s0] recovered [0] indices into cluster_state
1> [2017-05-03T04:05:04,232][INFO ][o.e.t.n.Netty4TransportMultiPortIntegrationIT] test using _default_ mappings: [{"_default_":{}}]
1> [2017-05-03T04:05:04,251][INFO ][o.e.t.n.Netty4TransportMultiPortIntegrationIT] [Netty4TransportMultiPortIntegrationIT#testThatTransportClientCanConnect]: all set up test
1> [2017-05-03T04:05:04,252][INFO ][o.e.p.PluginsService ] [_client_] no modules loaded
1> [2017-05-03T04:05:04,252][INFO ][o.e.p.PluginsService ] [_client_] loaded plugin [org.elasticsearch.transport.MockTcpTransportPlugin]
1> [2017-05-03T04:05:04,252][INFO ][o.e.p.PluginsService ] [_client_] loaded plugin [org.elasticsearch.transport.Netty4Plugin]
1> [2017-05-03T04:06:04,284][INFO ][o.e.t.n.Netty4TransportMultiPortIntegrationIT] [Netty4TransportMultiPortIntegrationIT#testThatTransportClientCanConnect]: cleaning up after test
1> [2017-05-03T04:06:04,298][INFO ][o.e.t.n.Netty4TransportMultiPortIntegrationIT] [Netty4TransportMultiPortIntegrationIT#testThatTransportClientCanConnect]: cleaned up after test
1> [2017-05-03T04:06:04,298][INFO ][o.e.t.n.Netty4TransportMultiPortIntegrationIT] [testThatTransportClientCanConnect]: after test
2> REPRODUCE WITH: gradle :modules:transport-netty4:integTestRunner -Dtests.seed=B4AE4510293F228B -Dtests.class=org.elasticsearch.transport.netty4.Netty4TransportMultiPortIntegrationIT -Dtests.method="testThatTransportClientCanConnect" -Dtests.security.manager=true -Dtests.locale=ca-ES -Dtests.timezone=Asia/Irkutsk
ERROR 60.3s | Netty4TransportMultiPortIntegrationIT.testThatTransportClientCanConnect <<< FAILURES!
> Throwable #1: NoNodeAvailableException[None of the configured nodes are available: [{#transport#-1}{9K55aiuUS9-vLxxhs1AXDg}{127.0.0.1}{127.0.0.1:49841}]]
> at __randomizedtesting.SeedInfo.seed([B4AE4510293F228B:50041AA92CBCDAF1]:0)
> at org.elasticsearch.client.transport.TransportClientNodesService.ensureNodesAreAvailable(TransportClientNodesService.java:347)
> at org.elasticsearch.client.transport.TransportClientNodesService.execute(TransportClientNodesService.java:245)
> at org.elasticsearch.client.transport.TransportProxyClient.execute(TransportProxyClient.java:59)
> at org.elasticsearch.client.transport.TransportClient.doExecute(TransportClient.java:357)
> at org.elasticsearch.client.support.AbstractClient.execute(AbstractClient.java:405)
> at org.elasticsearch.client.support.AbstractClient$ClusterAdmin.execute(AbstractClient.java:727)
> at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:77)
> at org.elasticsearch.action.ActionRequestBuilder.execute(ActionRequestBuilder.java:51)
> at org.elasticsearch.action.ActionRequestBuilder.get(ActionRequestBuilder.java:59)
> at org.elasticsearch.transport.netty4.Netty4TransportMultiPortIntegrationIT.testThatTransportClientCanConnect(Netty4TransportMultiPortIntegrationIT.java:80)
> at java.lang.Thread.run(Thread.java:745)
IGNOR/A 0.00s | Netty4TransportMultiPortIntegrationIT.testThatInfosAreExposed
> Assumption #1: 'network' test group is disabled (@Network())
1> [2017-05-03T04:06:04,312][INFO ][o.e.n.Node ] [node_s0] stopping ...
1> [2017-05-03T04:06:04,319][INFO ][o.e.n.Node ] [node_s0] stopped
1> [2017-05-03T04:06:04,320][INFO ][o.e.n.Node ] [node_s0] closing ...
1> [2017-05-03T04:06:04,322][INFO ][o.e.n.Node ] [node_s0] closed
2> NOTE: leaving temporary files on disk at: /var/lib/jenkins/workspace/elastic+elasticsearch+master+multijob-unix-compatibility/os/debian/modules/transport-netty4/build/testrun/integTestRunner/J0/temp/org.elasticsearch.transport.netty4.Netty4TransportMultiPortIntegrationIT_B4AE4510293F228B-001
2> May 02, 2017 8:06:04 PM com.carrotsearch.randomizedtesting.ThreadLeakControl checkThreadLeaks
2> WARNING: Will linger awaiting termination of 2 leaked thread(s).
2> NOTE: test params are: codec=Asserting(Lucene70): {}, docValues:{}, maxPointsInLeafNode=266, maxMBSortInHeap=7.273224578077655, sim=RandomSimilarity(queryNorm=true): {}, locale=ca-ES, timezone=Asia/Irkutsk
2> NOTE: Linux 3.16.0-4-amd64 amd64/Oracle Corporation 1.8.0_121 (64-bit)/cpus=4,threads=1,free=281663728,total=520093696
2> NOTE: All tests run in this JVM: [Netty4PipeliningDisabledIT, Netty4TransportMultiPortIntegrationIT]
When I ran it locally with the same seed, I noticed that the client1 profile binds to port 49841, but in the failed run it binds to port 49842:
successful run:
1> [2017-05-03T04:38:21,503][INFO ][o.e.t.TransportService ] [node_s0] profile [client1]: publish_address {127.0.0.7:4321}, bound_addresses {127.0.0.1:49841}
failed run:
1> [2017-05-03T04:05:04,159][INFO ][o.e.t.TransportService ] [node_s0] profile [client1]: publish_address {127.0.0.7:4321}, bound_addresses {127.0.0.1:49842}
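It seems that if the randomly selected port is not available, the client1 profile binds to the next available port, while the transport client still targets the originally selected port (the exception reports 127.0.0.1:49841), so the test fails.
I was able to reproduce the issue locally by running the test with the same seed while another server was listening on port 49841.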
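The suspected mechanism can be illustrated outside Elasticsearch with plain JDK sockets. The following is only a minimal sketch, not the actual transport code: the class `PortRangeBindSketch` and the helper `bindFirstFree` are hypothetical names, and it assumes the profile is configured with a port range and takes the first port that can be bound. A "blocker" socket plays the role of the other server that occupied port 49841.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class PortRangeBindSketch {

    /**
     * Hypothetical helper: binds to the first free port in [from, to] on the
     * loopback address, moving on to the next port when one is already in use.
     */
    static ServerSocket bindFirstFree(int from, int to) throws IOException {
        IOException last = null;
        for (int port = from; port <= to; port++) {
            ServerSocket socket = new ServerSocket();
            try {
                socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
                return socket;
            } catch (IOException e) {
                // Port is taken (or otherwise unusable): release the socket and try the next one.
                socket.close();
                last = e;
            }
        }
        throw new IOException("no free port in range " + from + "-" + to, last);
    }

    public static void main(String[] args) throws IOException {
        // Occupy an ephemeral port; this stands in for the unrelated server.
        try (ServerSocket blocker = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            int requested = blocker.getLocalPort();
            int upper = Math.min(requested + 10, 65535);
            try (ServerSocket bound = bindFirstFree(requested, upper)) {
                // A client that assumes "requested" would connect to the wrong listener;
                // only the port reported by the bound socket is reliable.
                System.out.println("requested=" + requested + " bound=" + bound.getLocalPort());
            }
        }
    }
}
```

Running `main` prints the requested port next to the port that was actually bound. With the first port occupied, the two differ, which mirrors the 49841 versus 49842 mismatch in the logs above. A test that reads the bound port back from the node instead of assuming the requested one would not depend on that port being free.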