@pawankartik-elastic
Contributor

Previously, `_resolve/cluster` waited for a response from a remote as part of the connection strategy. If the remote was unresponsive, the API waited until netty terminated the connection with a handshake exception. Because the threshold for terminating the connection is `10s`, the API took `10s` to determine that the remote was unresponsive. After an internal discussion, this has been replaced with a fail-fast strategy: a response is sent back to the user immediately rather than after the connection is terminated.
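
To illustrate the difference between the two strategies described above, here is a minimal, self-contained sketch. It is not the actual Elasticsearch change, which is not shown in this conversation. The class `FailFastSketch`, both method names, and the use of a `CompletableFuture` to stand in for a pending handshake are all hypothetical, chosen only to contrast "block until the connection attempt times out" with "report the current state immediately".

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the future stands in for a remote handshake that may never complete.
final class FailFastSketch {

    // Old behaviour (sketch): block until the handshake finishes or the timeout elapses.
    // With an unresponsive remote the caller pays the full timeout before getting an answer.
    static boolean connectedAfterWaiting(CompletableFuture<Void> handshake, Duration timeout) {
        try {
            handshake.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
            return true;
        } catch (Exception e) {
            // Covers a timeout, a failed handshake, and thread interruption.
            return false;
        }
    }

    // Fail-fast behaviour (sketch): never block, only inspect the state we already have.
    static boolean connectedFailFast(CompletableFuture<Void> handshake) {
        return handshake.isDone() && !handshake.isCompletedExceptionally();
    }

    public static void main(String[] args) {
        // A future that is never completed models an unresponsive remote.
        CompletableFuture<Void> unresponsive = new CompletableFuture<>();

        long start = System.nanoTime();
        boolean slow = connectedAfterWaiting(unresponsive, Duration.ofMillis(200));
        long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        start = System.nanoTime();
        boolean fast = connectedFailFast(unresponsive);
        long failFastMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        System.out.println("waiting:   connected=" + slow + " after " + waitedMs + " ms");
        System.out.println("fail fast: connected=" + fast + " after " + failFastMs + " ms");
    }
}
```

In this sketch the 200 ms timeout plays the role of the `10s` threshold mentioned above: the waiting variant can only report the remote as unreachable once that time has passed, while the fail-fast variant answers from the state it already has.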

@pawankartik-elastic added the >bug, auto-backport, Team:Search Foundations, :Search Foundations/Search, v8.16.0, v8.17.0, v8.18.0, and v9.0.0 labels on Jan 3, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@elasticsearchmachine
Collaborator

Hi @pawankartik-elastic, I've created a changelog YAML for you.

Contributor

@quux00 left a comment

LGTM. Nice work researching this to have it culminate in being a one-liner!

@pawankartik-elastic merged commit d2d0636 into elastic:main on Jan 3, 2025
16 checks passed
@pawankartik-elastic deleted the pkar/resolve-cluster-hang-fix branch on January 3, 2025 at 16:38
pawankartik-elastic added a commit to pawankartik-elastic/elasticsearch that referenced this pull request Jan 3, 2025
…astic#119516)

* fix: do not let `_resolve/cluster` hang if remote is unresponsive

Previously, `_resolve/cluster` would wait for a response from a remote
as part of the connection strategy. If the remote were to be
unresponsive, this API would wait until `netty` would terminate the
connection with a handshake exception. The threshold for terminating the
connection is `10s`. This means that the API would wait for `10s` before
determining that the remote is unresponsive. This strategy is now
replaced with a fail fast where a response is sent back to the user
immediately rather than waiting for a connection termination.

* Update docs/changelog/119516.yaml
@elasticsearchmachine
Collaborator

💚 Backport successful

Branches: 8.16, 8.17, 8.x

elasticsearchmachine pushed a commit that referenced this pull request Jan 3, 2025
…19516) (#119526)

elasticsearchmachine pushed a commit that referenced this pull request Jan 3, 2025
…19516) (#119528)

elasticsearchmachine pushed a commit that referenced this pull request Jan 3, 2025
…19516) (#119527)

sarog pushed a commit to portsbuild/elasticsearch that referenced this pull request Jan 22, 2025
…astic#119516) (elastic#119527)

3 participants