Skip to content

Conversation

@original-brownbear
Copy link
Contributor

We must not remove the snapshot from the initializing set
in the timeout getter. This was a plain oversight/mistake
and went unnoticed. It can lead to the removal of a valid
snapshot clone from the cluster state in rare circumstances
(e.g. when a node concurrently joins the cluster or a routing
change happens as it did in the linked test failure).

Closes #64115

Marking as non-issue as this was not released in any version.

We must not remove the snapshot from the initializing set
in the `timeout` getter. This was a plain oversight/mistake
and went unnoticed. It can lead to the removal of a valid
snapshot clone from the cluster state in rare circumstances
(e.g. when a node concurrently joins the cluster or a routing
change happens as it did in the linked test failure).

Closes elastic#64115
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Oct 25, 2020
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Oct 25, 2020
It's confusing and slightly error prone (see elastic#64116) to handle the timeouts
via overrides but the priority via a field. This simplifies the code to to avoid future
issues and save over 100 LOC.
@original-brownbear
Copy link
Contributor Author

This is the short version of the fix, for 7.11 and master I'd go for #64117 in a follow-up to prevent this mistake going forward.

Copy link
Contributor

@fcofdez fcofdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, good catch!

@original-brownbear
Copy link
Contributor Author

Thanks Francisco!

@original-brownbear original-brownbear merged commit ebed2d2 into elastic:master Oct 26, 2020
@original-brownbear original-brownbear deleted the 64115 branch October 26, 2020 12:47
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Oct 26, 2020
We must not remove the snapshot from the initializing set
in the `timeout` getter. This was a plain oversight/mistake
and went unnoticed. It can lead to the removal of a valid
snapshot clone from the cluster state in rare circumstances
(e.g. when a node concurrently joins the cluster or a routing
change happens as it did in the linked test failure).

Closes elastic#64115
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Oct 26, 2020
We must not remove the snapshot from the initializing set
in the `timeout` getter. This was a plain oversight/mistake
and went unnoticed. It can lead to the removal of a valid
snapshot clone from the cluster state in rare circumstances
(e.g. when a node concurrently joins the cluster or a routing
change happens as it did in the linked test failure).

Closes elastic#64115
original-brownbear added a commit that referenced this pull request Oct 26, 2020
We must not remove the snapshot from the initializing set
in the `timeout` getter. This was a plain oversight/mistake
and went unnoticed. It can lead to the removal of a valid
snapshot clone from the cluster state in rare circumstances
(e.g. when a node concurrently joins the cluster or a routing
change happens as it did in the linked test failure).

Closes #64115
original-brownbear added a commit that referenced this pull request Oct 26, 2020
We must not remove the snapshot from the initializing set
in the `timeout` getter. This was a plain oversight/mistake
and went unnoticed. It can lead to the removal of a valid
snapshot clone from the cluster state in rare circumstances
(e.g. when a node concurrently joins the cluster or a routing
change happens as it did in the linked test failure).

Closes #64115
original-brownbear added a commit that referenced this pull request Oct 28, 2020
It's confusing and slightly error prone (see #64116) to handle the timeouts
via overrides but the priority via a field. This simplifies the code to to avoid future
issues and save over 100 LOC.

Also this fixes a bug in `TransportVotingConfigExclusionsAction` where trying to instantiate a time value with a negative time could throw and unexpected exception and as a result leak a listener.
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Oct 28, 2020
It's confusing and slightly error prone (see elastic#64116) to handle the timeouts
via overrides but the priority via a field. This simplifies the code to to avoid future
issues and save over 100 LOC.

Also this fixes a bug in `TransportVotingConfigExclusionsAction` where trying to instantiate a time value with a negative time could throw and unexpected exception and as a result leak a listener.
original-brownbear added a commit that referenced this pull request Oct 29, 2020
It's confusing and slightly error prone (see #64116) to handle the timeouts
via overrides but the priority via a field. This simplifies the code to to avoid future
issues and save over 100 LOC.

Also this fixes a bug in `TransportVotingConfigExclusionsAction` where trying to instantiate a time value with a negative time could throw and unexpected exception and as a result leak a listener.
@original-brownbear original-brownbear restored the 64115 branch December 6, 2020 18:59
@jakelandis jakelandis removed the v8.0.0 label Jul 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >non-issue Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. v7.10.0 v7.11.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CI] Test Failure in CloneSnapshotIT.testBackToBackClonesForIndexNotInCluster

4 participants