Member

@blink1073 blink1073 commented Nov 11, 2025

Replaces #1852

Please complete the following before merging:

  • Is the relevant DRIVERS ticket in the PR title?

data:
  failCommands: ["isMaster","hello"]
  closeConnection: true
  blockConnection: true
Member Author

@blink1073 blink1073 Nov 11, 2025

While we are discouraged from changing existing tests, these tests still pass without the backpressure changes, and this avoids adding a new runOnRequirement.

https://spruce.mongodb.com/version/6914e629b2e3e30007b7c384/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC

@BorisDog BorisDog requested review from papafe and removed request for papafe November 12, 2025 17:34
- **Backpressure-enabled** - The pool MUST add the error labels `SystemOverloadedError` and `RetryableError` to network
errors or network timeouts it encounters during the connection establishment or the `hello` message. These labels
are used by the
[server monitor](../server-discovery-and-monitoring/server-discovery-and-monitoring.md#error-handling-pseudocode) to
Member

Suggest [server monitor] -> [SDAM error handling]

"server monitor" is the title of another document.

Member Author

done


1. Two concurrent writes begin on application threads A and B.
2. The server restarts.
3. Thread A receives the first non-timeout network error, and the client marks the server Unknown, and clears the
Member

Does this section need to change? We aren't changing the behavior of network errors when executing a command.

Member Author

Reverted

@blink1073 blink1073 requested a review from a team as a code owner November 17, 2025 15:55
@blink1073 blink1073 requested review from JamesKovacs and removed request for a team November 17, 2025 15:55
Contributor

@baileympearson baileympearson left a comment

Discussed some test issues with Steve on Slack; here are some other comments I had. My POC in Node is passing locally with the changes Steve is planning to make; once I have updated tests I'll test my POC in CI and hopefully everything passes.

@blink1073
Member Author

Note: I haven't yet updated the prose test implementation in pymongo. I'm working on that next.

Contributor

@baileympearson baileympearson left a comment

of [Connections](#connection) (including available and in use) MUST NOT exceed **maxPoolSize**
- **Rate-limited:** A Pool MUST limit the number of [Connections](#connection) being
[established](#establishing-a-connection-internal-implementation) concurrently via the **maxConnecting**
[pool option](#connection-pool-options).
Contributor

There's some rough pseudocode in the "Establishing a Connection (Internal Implementation)" section that we might consider updating. Do you think it's worth updating it to include attaching error labels?
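
For reference, the **Rate-limited** requirement quoted in the hunk above (at most **maxConnecting** Connections being established at once) can be sketched with a bounded semaphore. This is a minimal illustration, not the spec's pseudocode and not any driver's implementation: `ConnectingLimiter`, `slot`, `run_concurrently`, and the `peak` counter are hypothetical names introduced here, and the `establish` callable stands in for whatever a driver does to open a socket and run the handshake.

```python
import threading
from contextlib import contextmanager


class ConnectingLimiter:
    """Caps how many connections may be in the establishment phase at once.

    Hypothetical helper for illustration; max_connecting mirrors the
    maxConnecting pool option named in the quoted spec text.
    """

    def __init__(self, max_connecting: int) -> None:
        if max_connecting < 1:
            raise ValueError("max_connecting must be at least 1")
        self._slots = threading.BoundedSemaphore(max_connecting)
        self._lock = threading.Lock()
        self.in_progress = 0  # establishments currently holding a slot
        self.peak = 0         # highest concurrency observed (for inspection only)

    @contextmanager
    def slot(self):
        # Block until one of the max_connecting slots is free.
        self._slots.acquire()
        with self._lock:
            self.in_progress += 1
            self.peak = max(self.peak, self.in_progress)
        try:
            yield
        finally:
            # Release the slot whether establishment succeeded or raised.
            with self._lock:
                self.in_progress -= 1
            self._slots.release()


def run_concurrently(limiter, establish, count):
    """Run `establish` on `count` threads, each one gated by the limiter."""

    def worker():
        with limiter.slot():
            establish()

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The slot is released in a `finally` block so that a failed establishment, which is exactly the case this PR is concerned with, does not permanently consume one of the **maxConnecting** slots.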

to avoid clearing the pool. The pool MUST NOT add the backpressure error labels during an authentication step
after the `hello` message. For errors that the driver can distinguish as never occurring due to server overload,
such as DNS lookup failures, TLS related errors, or errors encountered establishing a connection to a socks5 proxy,
the driver MUST NOT clear the connection pool and MUST NOT mark the server Unknown.
Member

Since we want to continue clearing the pool and marking the server unknown for definite non-overload errors, shouldn't this say "MUST" rather than "MUST NOT"?
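
To make the labeling rule in the quoted hunks concrete, here is a minimal sketch of the decision of when `SystemOverloadedError` and `RetryableError` are attached. The types `Phase`, `Kind`, and `EstablishmentError` and the function `add_backpressure_labels` are hypothetical names invented for this illustration. The sketch assumes that errors the driver can identify as unrelated to overload (DNS, TLS, socks5) receive no labels, which the quoted text implies but does not state outright. It deliberately does not model pool clearing or marking the server Unknown, since that wording is what the comment above questions.

```python
from enum import Enum, auto

SYSTEM_OVERLOADED = "SystemOverloadedError"
RETRYABLE = "RetryableError"


class Phase(Enum):
    CONNECT = auto()  # connection establishment
    HELLO = auto()    # the initial hello message
    AUTH = auto()     # authentication steps after hello


class Kind(Enum):
    NETWORK_ERROR = auto()
    NETWORK_TIMEOUT = auto()
    DNS_FAILURE = auto()
    TLS_ERROR = auto()
    SOCKS5_ERROR = auto()


class EstablishmentError(Exception):
    """Hypothetical error type carrying where and how establishment failed."""

    def __init__(self, kind: Kind, phase: Phase) -> None:
        super().__init__(f"{kind.name} during {phase.name}")
        self.kind = kind
        self.phase = phase
        self.labels = set()


def add_backpressure_labels(error: EstablishmentError) -> EstablishmentError:
    """Attach both labels only to network errors/timeouts before authentication."""
    overload_possible = error.kind in (Kind.NETWORK_ERROR, Kind.NETWORK_TIMEOUT)
    before_auth = error.phase in (Phase.CONNECT, Phase.HELLO)
    if overload_possible and before_auth:
        error.labels.update((SYSTEM_OVERLOADED, RETRYABLE))
    # Assumption: every other case (auth-phase failures, DNS/TLS/socks5 errors)
    # is left unlabeled; what the pool then does is outside this sketch.
    return error
```

For example, `add_backpressure_labels(EstablishmentError(Kind.NETWORK_TIMEOUT, Phase.HELLO))` returns an error carrying both labels, while the same timeout in `Phase.AUTH` comes back unlabeled.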

4 participants