-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Send ban parent per outstanding child connection #65443
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Pinging @elastic/es-distributed (Team:Distributed) |
|
@DaveCTurner I looked into your proposal to remove a ban once all associated child connections are disconnected. However, it isn't watertight as we can remove a ban before an in-flight child task start. Any suggestions are welcomed. |
Hmm wouldn't that mean the child task is going to respond on a connection that was closed before the task even started? If so we shouldn't even start the task. |
|
I should have been more clear. We do not prevent a child task from starting in this scenario:
|
|
I see. Could we indicate in the ban message that there are already in-flight tasks on connections c1 and c2, so it doesn't get removed while c2 is still open? |
|
On reflection that sounds hard, we have no way to refer to a specific connection that the other end can understand, except of course sending the ban over every relevant connection. |
|
Another issue is proxy connections. A proxy connection consists of two connections: the source node to the proxy node and the proxy node to the target node. While the former will never change, the latter can change anytime. I think we can't use the connection info to remove ban markers. |
|
Right, that was what I meant earlier about having the proxy also proxy a disconnect event. We can't describe a specific set of connections to a remote cluster, but we can say "the connection on which this message was received". |
|
@DaveCTurner I implemented a prototype that does not require heartbeats for the ban markers. In the prototype, we send a ban for each outstanding child connection and remove bans when all associated connections of that ban disconnect. I've updated this PR to send bans per child connection and will open a follow-up to associate connections to bans. This PR is ready for review again. Can you please take another look? Thank you! |
|
I also included (but reverted from this PR) the commit 143620d that tracks bans for each channel and removes them when their associated channels are closed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think this still isn't quite watertight, because a Connection is a bundle of channels so messages to a single Connection are not fully ordered. I think it's possible that we could send a task T on one channel that takes a very long time to arrive, then on a different channel send the ban and receive a response, then clear the ban and receive a response, and only then does task T arrive at the target.
Edit: I see that I caused this confusion: I said "connection" in the earlier discussion. I meant TCP connection, i.e. channel.
|
@DaveCTurner Thanks for looking. That's a good point. I think we can make it tighter by sending a ban for every pair of Connection and TransportRequestOptions.Type, which is used to pick a TCP channel. WDYT? |
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks for confirming, this LGTM then. Neatly done, I like this change 😄
server/src/main/java/org/elasticsearch/transport/TransportService.java
Outdated
Show resolved
Hide resolved
|
@DaveCTurner Thanks so much for reviewing and suggesting this direction :) |
|
@elasticmachine update branch |
This commit sends a parent-task ban for each connection so that we can reply on channel disconnect instead of node leave events to remove bans.
This commit sends a parent-task ban for each connection so that we can reply on channel disconnect instead of node leave events to remove bans. Backport #65443
Like elastic#56620, this change relies on channel disconnect instead of node leave events to remove parent-task ban markers. Relates elastic#65443 Relates elastic#56620
If a `NodeDisconnectedException` happens when sending a ban for a task then today we log a message at `INFO` or `WARN` indicating that the ban failed, but we don't indicate why. The message also uses a default `toString()` for an inner class which is unhelpful. Ban failures during disconnections are benign and somewhat expected, and task cancellation respects disconnections anyway (elastic#65443). There's not much the user can do about these messages either, and they can be confusing and draw attention away from the real problem. With this commit we log the failure messages at `DEBUG` on disconnections, and include the exception details. We also include the exception message for other kinds of failures, and we fix up a few cases where a useless default `toString()` implementation was used in log messages. Slightly relates elastic#72968 in that these messages tend to obscure a connectivity issue.
If a `NodeDisconnectedException` happens when sending a ban for a task then today we log a message at `INFO` or `WARN` indicating that the ban failed, but we don't indicate why. The message also uses a default `toString()` for an inner class which is unhelpful. Ban failures during disconnections are benign and somewhat expected, and task cancellation respects disconnections anyway (#65443). There's not much the user can do about these messages either, and they can be confusing and draw attention away from the real problem. With this commit we log the failure messages at `DEBUG` on disconnections, and include the exception details. We also include the exception message for other kinds of failures, and we fix up a few cases where a useless default `toString()` implementation was used in log messages. Slightly relates #72968 in that these messages tend to obscure a connectivity issue.
If a `NodeDisconnectedException` happens when sending a ban for a task then today we log a message at `INFO` or `WARN` indicating that the ban failed, but we don't indicate why. The message also uses a default `toString()` for an inner class which is unhelpful. Ban failures during disconnections are benign and somewhat expected, and task cancellation respects disconnections anyway (#65443). There's not much the user can do about these messages either, and they can be confusing and draw attention away from the real problem. With this commit we log the failure messages at `DEBUG` on disconnections, and include the exception details. We also include the exception message for other kinds of failures, and we fix up a few cases where a useless default `toString()` implementation was used in log messages. Slightly relates #72968 in that these messages tend to obscure a connectivity issue.
Few things that prevent us from supporting cross-clusters tasks cancellation:
A child task gets canceled if a node with the parent task leaves the cluster. This doesn't hold for a cross-clusters task as the parent task does not belong to the remote cluster. This issue was resolved in Cancel task and descendants on channel disconnects #56620, where we cancel child tasks on disconnect instead.
Ban action isn't registered on proxy nodes
Banned parent markers are removed when a node with the parent task leaves the cluster. This issue is similar to the first issue.
This commit tries to address the third issue by periodically (every 5 minutes) sending heartbeats for banned parent markers and removing markers that haven't received heartbeats for some time (30 minutes).This commit tries to address the third issue by sending a ban for each outstanding child connection and removing it when all associated connections disconnect.