https?.Server.keepAliveTimeout introduced boundary condition that results in client-side socket hang-ups #20256

@ggoodman

Description

  • Version: v8.11.1
  • Platform: Darwin C02PQ4SHFVH8 17.5.0 Darwin Kernel Version 17.5.0: Mon Mar 5 22:24:32 PST 2018; root:xnu-4570.51.1~1/RELEASE_X86_64 x86_64
  • Subsystem: http

With the introduction of #2534, there now appears to be a window after the trailing edge of server.keepAliveTimeout during which the client and server have an inconsistent view of a keep-alive socket's state.

In this situation, the server has closed the client's socket, but the client has not yet noticed and attempts to reuse the persistent connection, resulting in a read ECONNRESET.

Below is a reproduction. Notice that the 5000ms interval coincides with the default value of keepAliveTimeout. The script may run for some time before producing the error, since I don't know how to time things more precisely to hit the window of inconsistent state.

const Http = require('http');

const server = Http.createServer((req, res) => {
    res.writeHead(204);
    res.end();
});

// Keep-alive agent so the client re-uses the same socket across requests.
const agent = new Http.Agent({
    keepAlive: true,
    maxSockets: Infinity,
});

server.listen(0, '127.0.0.1', () => {
    const address = server.address();

    let requestCount = 0;

    const requestOnce = () => {
        const n = requestCount++;
        const clientReq = Http.get(
            {
                agent,
                hostname: address.address,
                path: '/',
                port: address.port,
            },
            clientRes => {
                clientRes.on('end', () => console.log(`Ended request ${n}`));
                clientRes.on('error', err => {
                    throw err;
                });
                clientRes.resume();
            }
        );

        clientReq.on('error', err => {
            throw err;
        });

        console.log(`Starting request ${n}`);
    };

    // Fire a request every 5000ms -- the default keepAliveTimeout -- to
    // race the server's idle-socket teardown.
    setInterval(requestOnce, 5000);
});

On my machine, this produced the following output:

% node node8-socket-timeout.js
Starting request 0
Ended request 0
Starting request 1
Ended request 1
Starting request 2
/path/to/node8-socket-timeout.js:31
        clientReq.on('error', err => { throw err; });
                                       ^

Error: read ECONNRESET
    at _errnoException (util.js:1022:11)
    at TCP.onread (net.js:615:25)

If the above is indeed the intended behaviour of this new feature, I think the community would benefit from a warning in the http/https docs about this boundary condition. Maybe users can be told that one of two things should be done:

  1. Servers should now deliberately set keepAliveTimeout to 0 and accept the same memory inefficiencies as node <= 8; or
  2. Clients should implement retry logic specific to these boundary-condition socket hang-ups

In the 2nd case, I think there is quite a bit of room for doing it incorrectly, especially for requests to services or endpoints that are not idempotent. From this perspective, it would be pretty helpful to have guidance from the Node Core Team on how to detect this specific class of error and how to work around the new behaviour.

Metadata


Assignees

No one assigned

Labels

help wanted — Issues that need assistance from volunteers or PRs that need help to proceed.
http — Issues or PRs related to the http subsystem.
https — Issues or PRs related to the https subsystem.
