Skip to content

Conversation

@fcofdez
Copy link
Contributor

@fcofdez fcofdez commented Apr 23, 2021

This commit adds support for cancellation on TransportMasterNodeAction.
This is useful to provide cancellation for requests such as TransportGetMappingsAction

@fcofdez fcofdez added >enhancement :Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. v8.0.0 Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.14.0 labels Apr 23, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but I think we should also check for cancellation while we're trying to route the request to the master in the first place. Specifically at the top of doStart(), and IMO we should also adjust the predicate passed into the ClusterStateObserver so that it runs doStart() on the first state update after cancellation.

@fcofdez
Copy link
Contributor Author

fcofdez commented Apr 23, 2021

Thanks for the review!, those are good suggestions. I added the checks to cover these cases too.

private void executeMasterOperation(Task task, Request request, ClusterState state,
ActionListener<Response> listener) throws Exception {
if (task instanceof CancellableTask && ((CancellableTask) task).isCancelled()) {
throw new CancellationException("Task was cancelled");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we exercise this in the tests any more? At least I removed it and the tests still passed several hundred iterations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦 I've added a new test for this scenario in c1da23f

final ClusterState.Builder newStateBuilder = ClusterState.builder(stateWithBlock);

// Either unblock the cluster state or just do an unrelated cluster state change that will check
// if the task has been cancelled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

@fcofdez fcofdez requested a review from DaveCTurner April 23, 2021 15:30
Copy link
Contributor

@DaveCTurner DaveCTurner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@fcofdez
Copy link
Contributor Author

fcofdez commented Apr 26, 2021

@elasticmachine update branch

@fcofdez fcofdez merged commit b4f1851 into elastic:master Apr 26, 2021
fcofdez added a commit to fcofdez/elasticsearch that referenced this pull request Apr 26, 2021
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request May 11, 2022
In elastic#72157 we made it so that cancelling the task that a
`TransportMasterNodeAction` was executing causes the action to fail with
a `java.util.concurrent.CancellationException`. This exception is
treated as a `500 Internal Server Error` which results in `WARN`-level
logs, but cancelling a task is normal and expected behaviour and should
not be logged (see elastic#73524).

This commit fixes this by using a `TaskCancelledException` instead,
since this exception maps to a `400 Bad Request` and generates no logs.
elasticsearchmachine pushed a commit that referenced this pull request May 11, 2022
In #72157 we made it so that cancelling the task that a
`TransportMasterNodeAction` was executing causes the action to fail with
a `java.util.concurrent.CancellationException`. This exception is
treated as a `500 Internal Server Error` which results in `WARN`-level
logs, but cancelling a task is normal and expected behaviour and should
not be logged (see #73524).

This commit fixes this by using a `TaskCancelledException` instead,
since this exception maps to a `400 Bad Request` and generates no logs.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request May 11, 2022
In elastic#72157 we made it so that cancelling the task that a
`TransportMasterNodeAction` was executing causes the action to fail with
a `java.util.concurrent.CancellationException`. This exception is
treated as a `500 Internal Server Error` which results in `WARN`-level
logs, but cancelling a task is normal and expected behaviour and should
not be logged (see elastic#73524).

This commit fixes this by using a `TaskCancelledException` instead,
since this exception maps to a `400 Bad Request` and generates no logs.
elasticsearchmachine pushed a commit that referenced this pull request May 11, 2022
In #72157 we made it so that cancelling the task that a
`TransportMasterNodeAction` was executing causes the action to fail with
a `java.util.concurrent.CancellationException`. This exception is
treated as a `500 Internal Server Error` which results in `WARN`-level
logs, but cancelling a task is normal and expected behaviour and should
not be logged (see #73524).

This commit fixes this by using a `TaskCancelledException` instead,
since this exception maps to a `400 Bad Request` and generates no logs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.14.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants