https?.Server.keepAliveTimeout introduced boundary condition that results in client-side socket hang-ups #20256

@ggoodman

Description

  • Version: v8.11.1
  • Platform: Darwin C02PQ4SHFVH8 17.5.0 Darwin Kernel Version 17.5.0: Mon Mar 5 22:24:32 PST 2018; root:xnu-4570.51.1~1/RELEASE_X86_64 x86_64
  • Subsystem: http

With the introduction of #2534, there now appears to be a window after the trailing edge of server.keepAliveTimeout during which the client and server have an inconsistent view of a keep-alive socket's state.

In this situation, the server has closed the client's socket, but the client has not yet noticed and attempts to reuse the persistent connection, resulting in a read ECONNRESET.

Below is a reproduction. Notice that the 5000ms interval coincides with the default value of keepAliveTimeout. The script may run for some time before producing the error, since I don't know how to time things more precisely to hit the window of inconsistent state.

const Http = require('http');

const server = Http.createServer((req, res) => {
    res.writeHead(204);
    res.end();
});

// Keep-alive agent so the client re-uses the same socket across requests.
const agent = new Http.Agent({
    keepAlive: true,
    maxSockets: Infinity,
});

server.listen(0, '127.0.0.1', () => {
    const address = server.address();

    let requestCount = 0;

    const requestOnce = () => {
        const n = requestCount++;
        const clientReq = Http.get(
            {
                agent,
                hostname: address.address,
                path: '/',
                port: address.port,
            },
            clientRes => {
                clientRes.on('end', () => console.log(`Ended request ${n}`));
                clientRes.on('error', err => {
                    throw err;
                });
                clientRes.resume();
            }
        );

        clientReq.on('error', err => {
            throw err;
        });

        console.log(`Starting request ${n}`);
    };

    // Fire a request every 5000ms -- the default keepAliveTimeout -- to
    // race the server's idle-socket teardown.
    setInterval(requestOnce, 5000);
});

On my machine, this produced the following output:

% node node8-socket-timeout.js
Starting request 0
Ended request 0
Starting request 1
Ended request 1
Starting request 2
/path/to/node8-socket-timeout.js:31
        clientReq.on('error', err => { throw err; });
                                       ^

Error: read ECONNRESET
    at _errnoException (util.js:1022:11)
    at TCP.onread (net.js:615:25)

If the above is indeed the intended behaviour of this new feature, I think the community would benefit from a warning in the http/https docs about this boundary condition. Maybe users can be told that one of two things should be done:

  1. Servers should now deliberately set keepAliveTimeout to 0 and accept the same memory inefficiencies as node <= 8; or
  2. Clients should implement retry logic specific to these boundary-condition socket hang-ups

In the 2nd case, I think there is quite a bit of room for doing it incorrectly, especially for requests to services or endpoints that are not idempotent. From this perspective, it would be pretty helpful to have guidance from the Node Core Team on how to detect this specific class of error and how to work around the new behaviour.

Metadata


Assignees

No one assigned

Labels

help wanted — Issues that need assistance from volunteers or PRs that need help to proceed.
http — Issues or PRs related to the http subsystem.
https — Issues or PRs related to the https subsystem.
