MockTransportService throws AssertionError: still open connections #34990

@DaveCTurner

This is a failure I have seen recently on the zen2 branch. It occurs quite rarely, and I have been unable to reproduce it on master. For instance, on commit b01d321 I ran this in a loop:

./gradlew :server:integTest -Dtests.class=org.elasticsearch.action.admin.indices.stats.IndicesStatsBlocksIT -Dtests.iters=200 -Dtests.failfast=true

On the 7th run of the loop, one of the tests failed with the following stack trace:

  2> REPRODUCE WITH: ./gradlew :server:integTest -Dtests.seed=B5E24AD11854BA9B -Dtests.class=org.elasticsearch.action.admin.indices.stats.IndicesStatsBlocksIT -Dtests.method="testIndicesStatsWithBlocks {seed=[B5E24AD11854BA9B:99DFE9CCD7C698F7]}" -Dtests.security.manager=true -Dtests.locale=ig-NG -Dtests.timezone=Africa/Freetown -Dcompiler.java=11 -Druntime.java=11
FAILURE 1.40s | IndicesStatsBlocksIT.testIndicesStatsWithBlocks {seed=[B5E24AD11854BA9B:99DFE9CCD7C698F7]} <<< FAILURES!
   > Throwable #1: java.lang.AssertionError: still open connections: {{127.0.0.1:37253}{nNBmVQAAQACCxD28_____w}{127.0.0.1}{127.0.0.1:37253}=[org.elasticsearch.test.transport.StubbableTransport$WrappedConnection@702fbe9b]}
   >    at __randomizedtesting.SeedInfo.seed([B5E24AD11854BA9B:99DFE9CCD7C698F7]:0)
   >    at org.elasticsearch.test.transport.MockTransportService.doClose(MockTransportService.java:625)
   >    at org.elasticsearch.common.component.AbstractLifecycleComponent.close(AbstractLifecycleComponent.java:100)
   >    at org.elasticsearch.core.internal.io.IOUtils.close(IOUtils.java:103)
   >    at org.elasticsearch.core.internal.io.IOUtils.close(IOUtils.java:85)
   >    at org.elasticsearch.node.Node.close(Node.java:862)
   >    at org.elasticsearch.test.InternalTestCluster$NodeAndClient.closeNode(InternalTestCluster.java:916)
   >    at org.elasticsearch.test.InternalTestCluster$NodeAndClient.close(InternalTestCluster.java:993)
   >    at org.elasticsearch.core.internal.io.IOUtils.closeWhileHandlingException(IOUtils.java:145)
   >    at org.elasticsearch.test.InternalTestCluster.close(InternalTestCluster.java:810)
   >    at org.elasticsearch.test.ESIntegTestCase.afterInternal(ESIntegTestCase.java:587)
   >    at org.elasticsearch.test.ESIntegTestCase.cleanUpCluster(ESIntegTestCase.java:2195)
   >    at jdk.internal.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
   >    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   >    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
   >    at java.base/java.lang.Thread.run(Thread.java:834)

This seems to occur when nodes are shutting down. In particular, when the master shuts down, the remaining nodes attempt to elect a new master, which first involves reconnecting to each node (this is not strictly necessary, see below). The open connection in question is one of these probe connections, since the remote node's name is just its transport address {127.0.0.1:37253} and not its real name.

// TODO if transportService is already connected to this address then skip the handshaking
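
To make the idea in that TODO concrete, here is a minimal sketch of skipping the probe when the address is already connected. It is an illustration only: `ProbeSketch`, `markConnected`, `probe` and `handshakeCount` are hypothetical names and are not part of the Elasticsearch transport or discovery code.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: not Elasticsearch code, all names are made up.
class ProbeSketch {
    // Addresses we already hold a full connection to.
    private final Set<String> connectedAddresses = ConcurrentHashMap.newKeySet();
    // Counts how many probe connections (with handshake) were opened.
    private int handshakes = 0;

    // Record that a full connection to this address exists.
    void markConnected(String address) {
        connectedAddresses.add(address);
    }

    // Returns true if a new probe connection and handshake were needed,
    // false if the probe was skipped because we are already connected.
    boolean probe(String address) {
        if (connectedAddresses.contains(address)) {
            return false; // already connected: no extra connection is opened
        }
        handshakes++; // stands in for opening a connection and handshaking
        return true;
    }

    int handshakeCount() {
        return handshakes;
    }
}
```

With something like this in place, a node shutting down would trigger fewer short-lived probe connections, which would narrow the window for the failure but would not by itself explain why a connection is left open.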

As far as I can tell, there is machinery in place to prevent this from happening, so I don't think it's specifically a Zen2 issue. In Zen2 we create a number of new connections when a node shuts down and Zen1 does not, which might be why this doesn't reproduce so easily on master.
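
For readers unfamiliar with the check that fails in the stack trace, the following is a rough sketch of its shape, assuming a tracker that records every opened connection per address and refuses to close while any remain. It is not the actual `MockTransportService` implementation; `OpenConnectionTracker` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a test transport that tracks open connections.
// None of these names come from Elasticsearch.
class OpenConnectionTracker {
    private final Map<String, List<Object>> openConnections = new ConcurrentHashMap<>();

    // Record a new connection to the given address (e.g. a discovery probe).
    Object open(String address) {
        Object connection = new Object();
        // compute() runs atomically per key, so the list is never mutated concurrently.
        openConnections.compute(address, (key, conns) -> {
            if (conns == null) {
                conns = new ArrayList<>();
            }
            conns.add(connection);
            return conns;
        });
        return connection;
    }

    // Forget a connection once it has been closed; drop the entry when it is empty.
    void release(String address, Object connection) {
        openConnections.computeIfPresent(address, (key, conns) -> {
            conns.remove(connection);
            return conns.isEmpty() ? null : conns;
        });
    }

    // Closing while any connection is still tracked is treated as a test failure.
    void close() {
        if (!openConnections.isEmpty()) {
            throw new AssertionError("still open connections: " + openConnections.keySet());
        }
    }
}
```

In this sketch, a probe connection that is opened while the node is shutting down and never released is enough to make `close()` throw, which matches the symptom above without saying anything about where the real race lies.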

@tbrooks8 my main question to you is whether you think this is a problem in the networking infrastructure or a problem with how we're using it in Zen2.
