Close TcpTransport on RST in some Spots to Prevent Leaking TIME_WAIT Sockets #26764
Conversation
…connectToChannels
I don't think we should make this change in mock TCP transport. It's not realistic that we retry connects like that.

@jasontedor fair point, but in production code, this kind of thing would just get retried further up the stack, right? => My reasoning was this:

If you're not binding the outgoing socket, is a port collision possible? I didn't think you got port collisions in the kernel without a …

I also don't think we are having a port collision here. This is the code that connects to a server, i.e. client code. @DaveCTurner that's why we don't bind there. What likely happens here is that we are exhausting our resources and don't have any ports left. There might be tons of connections in … That said, I don't think we should retry in such a way. I am more curious why we get into this state.
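The point above about unbound client sockets can be illustrated with plain `java.net` sockets (a minimal sketch, not Elasticsearch code): when the client does not bind explicitly, the kernel assigns an ephemeral local port during connect, so collisions are not the issue, but the ephemeral range itself can be exhausted.

```java
import java.net.ServerSocket;
import java.net.Socket;

public class EphemeralPortDemo {
    public static void main(String[] args) throws Exception {
        // Loopback server on an ephemeral port; stands in for the remote node.
        try (ServerSocket server = new ServerSocket(0)) {
            // No explicit bind on the client side: the kernel picks an
            // ephemeral local port during connect(), so two concurrent
            // clients cannot collide - but the ephemeral range can run out,
            // which matches the exhaustion theory discussed here.
            try (Socket client = new Socket("127.0.0.1", server.getLocalPort())) {
                System.out.println("local port assigned: " + (client.getLocalPort() > 0));
            }
        }
    }
}
```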
@s1monw tracked down where we leak a lot of … Activating linger with timeout …
MockChannel incomingChannel = null;
try {
    configureSocket(incomingSocket);
    incomingSocket.setSoLinger(true, 0);
I am hesitant to set a 0 timeout for SO_LINGER. The reason is that the abnormal termination of a socket that causes it to move into TIME_WAIT can be caused by an exception, i.e. during request parsing on the server; the server then closes the connection, and in this case we should really stay in TIME_WAIT. The other case where the server closes the connection instead of the client is when we shut down, and then we can set our socket to SO_LINGER=0, since it's the right thing to do in this situation. I actually think we should do this in a broader scope. There is for instance a method TcpTransport#closeChannels(List<Channel> channels, boolean blocking); I wonder if we can piggyback on this boolean and set SO_LINGER to (true, 0) if we close our transport. It might even be a better idea to add another boolean, boolean closingTransport; in this case we can do the right thing also in other transport impls?
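The SO_LINGER behaviour under discussion can be demonstrated with plain `java.net` sockets (a standalone sketch, not the Elasticsearch code): enabling linger with a zero timeout makes `close()` reset the connection (RST) instead of performing the normal FIN handshake, so the closing side skips TIME_WAIT.

```java
import java.net.ServerSocket;
import java.net.Socket;

public class LingerDemo {
    public static void main(String[] args) throws Exception {
        // Loopback server on an ephemeral port; stands in for the peer node.
        try (ServerSocket server = new ServerSocket(0)) {
            Socket client = new Socket("127.0.0.1", server.getLocalPort());
            Socket accepted = server.accept();
            // SO_LINGER with timeout 0: close() discards unsent data and
            // sends an RST, so this side never enters TIME_WAIT. getSoLinger()
            // returns the linger timeout when enabled (-1 when disabled).
            client.setSoLinger(true, 0);
            System.out.println("RST on close: " + (client.getSoLinger() == 0));
            client.close();
            accepted.close();
            System.out.println("closed: " + client.isClosed());
        }
    }
}
```

This is exactly the trade-off named above: an RST close is right for shutdown, but for failure-triggered closes the FIN/TIME_WAIT path is the safer default.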
It might even be a better idea to add another boolean boolean closingTransport in this case we can do the right thing also in other transport impls?
Makes sense to me to add the additional boolean. My reasoning would be that the blocking parameter currently seems to be more about the way we handle Netty's internal threading and not any low-level socket behavior/setting.
@s1monw done, added another parameter and only made it RST on org.elasticsearch.transport.TcpTransport#doStop and the actual close call of TcpTransport, leaving the failure handlers and such untouched.
This still (after removing the SO_LINGER setting I added) resolves the issue just fine; tests still don't leak any TIME_WAITs.
Hope this is what you had in mind :)
if (tcpReceiveBufferSize.getBytes() > 0) {
    socket.setReceiveBufferSize(tcpReceiveBufferSize.bytesAsInt());
}
socket.bind(address);
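The diff above configures the socket before `bind`, and that ordering is the point: as the PR summary notes, setting reuse-address after bind does not work on all systems, and large receive buffers are likewise only reliable when set before the socket is bound. A minimal standalone sketch of the ordering rule (plain `java.net`, not the Elasticsearch code):

```java
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class ReuseAddressDemo {
    public static void main(String[] args) throws Exception {
        // Create an unbound socket so options can be applied before bind().
        ServerSocket server = new ServerSocket();
        // SO_REUSEADDR must be configured before bind(); on some platforms
        // setting it on an already-bound socket silently has no effect,
        // which is the class of bug this PR fixes.
        server.setReuseAddress(true);
        server.setReceiveBufferSize(256 * 1024); // also set before bind
        server.bind(new InetSocketAddress("127.0.0.1", 0));
        System.out.println(server.getReuseAddress() && server.isBound());
        server.close();
    }
}
```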
makes sense ++
s1monw
left a comment
This LGTM. @tbrooks8 please also take a look if this makes sense to you as well. @original-brownbear can you update the description for this PR please? Thanks for working on this.
@s1monw description updated :)
Tim-Brooks
left a comment
LGTM
@tbrooks8 thanks, should I squash this?

You can if you want, but we often just use the GitHub "Squash and merge" button.

@tbrooks8 ah ok, doing that then. Thanks!

@original-brownbear I backported it to 6.x
This is a follow-up to elastic#26764. That commit set SO_LINGER to 0 in order to fix a scenario where we were running out of resources during CI. We are primarily interested in setting this to 0 when stopping the transport. Allowing TIME_WAIT is standard for other failure scenarios during normal operation. Unfortunately, that commit set SO_LINGER to 0 every time we close NodeChannels. NodeChannels can be closed in case of an exception or other failures (such as parsing a response). We want to disable linger only when actually shutting down.
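The narrowing described in the follow-up can be sketched as follows (hypothetical names loosely modeled on NodeChannels; the real Elasticsearch code differs): only the transport-shutdown path requests an RST, while exception-triggered closes keep the default FIN handshake and its TIME_WAIT state.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.List;

public class NodeChannelsSketch {
    private final List<Socket> channels;

    NodeChannelsSketch(List<Socket> channels) {
        this.channels = channels;
    }

    // Normal close, e.g. after a parsing failure: ordinary FIN handshake,
    // the socket may legitimately linger in TIME_WAIT.
    void close() throws IOException {
        doClose(false);
    }

    // Close while the whole transport stops: reset instead, so shut-down
    // sockets don't pile up in TIME_WAIT and exhaust local ports.
    void closeForShutdown() throws IOException {
        doClose(true);
    }

    private void doClose(boolean closingTransport) throws IOException {
        for (Socket channel : channels) {
            if (closingTransport) {
                channel.setSoLinger(true, 0); // close() now sends RST
            }
            channel.close();
        }
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) {
            Socket client = new Socket("127.0.0.1", server.getLocalPort());
            Socket accepted = server.accept();
            new NodeChannelsSketch(List.of(client)).closeForShutdown();
            accepted.close();
            System.out.println("closed: " + client.isClosed());
        }
    }
}
```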
I think I tracked down #26701 here:
The problem is that we're using quite a few connections/sockets here, so with some (relatively low) probability this line:
will throw this exception because of running out of ports for the connecting sockets locally.
Eventually, this exception is just silently ignored by only tracing it in org.elasticsearch.common.util.concurrent.AbstractRunnable#onFailure: https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/discovery/zen/UnicastZenPing.java#L520
=> which leads to pings silently not happening => counts are off.
I ran the test in a loop to reproduce this and upped the log level for that statement; missed pings like in the issue always coincided with the exception from the screenshot.
I fixed this by fixing two types of spots where we were leaking local ports/sockets (in the form of TIME_WAIT):

- Setting TCP_REUSE_ADDRESS after a bind call, which will not work on all systems.
- Leaking TIME_WAIT sockets when shutting down TcpTransport, by closing on a FIN instead of an RST after the other side went down for sure. This was leaking so many TIME_WAIT clients that we eventually ran out of local ports and saw the above Exception in tests.

Closes #26701