Fix support for infinite `?master_timeout` #107050

DaveCTurner · 2024-04-03T12:16:40Z

Specifying `?master_timeout=-1` on an API which performs a cluster state update means that the cluster state update task will never time out while waiting in the pending tasks queue. However, this parameter is also reused in a few places where a timeout of `-1` means something else, typically to time out immediately. This commit fixes those places so that `?master_timeout=-1` consistently means to wait forever.
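To illustrate the failure mode described above, here is a minimal sketch using only JDK classes. It is not the Elasticsearch code: the class `InfiniteTimeoutSketch` and all of its methods are hypothetical names made up for this example. It relies on the documented behaviour of `CountDownLatch.await(long, TimeUnit)`, which does not wait at all when the timeout is zero or negative, so forwarding `-1` unchanged amounts to timing out immediately.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Elasticsearch.
final class InfiniteTimeoutSketch {

    /** Forwards the timeout unchanged: a value of -1 makes the JDK return without waiting. */
    static boolean naiveAwait(CountDownLatch latch, long timeoutMillis) {
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Treats any negative timeout as "wait forever", otherwise waits for the given time. */
    static boolean awaitOrForever(CountDownLatch latch, long timeoutMillis) {
        try {
            if (timeoutMillis < 0) {
                latch.await(); // no timeout at all
                return true;
            }
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns a latch that a background thread releases after the given delay. */
    static CountDownLatch latchCompletedAfter(long delayMillis) {
        CountDownLatch latch = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            latch.countDown();
        });
        worker.setDaemon(true);
        worker.start();
        return latch;
    }

    public static void main(String[] args) {
        System.out.println("naive: " + naiveAwait(latchCompletedAfter(200), -1));
        System.out.println("fixed: " + awaitOrForever(latchCompletedAfter(200), -1));
    }
}
```

In this sketch the naive call gives up before the background work completes, while the second variant blocks until the latch is released, which is the "wait forever" meaning that `?master_timeout=-1` is supposed to have.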

github-actions · 2024-04-03T12:16:54Z

Documentation preview:

✨ Changed pages

elasticsearchmachine · 2024-04-03T12:17:04Z

Pinging @elastic/es-distributed (Team:Distributed)

elasticsearchmachine · 2024-04-03T12:17:05Z

Hi @DaveCTurner, I've created a changelog YAML for you.

volodk85 · 2024-04-09T22:40:28Z

.../org/elasticsearch/action/admin/cluster/snapshots/status/TransportSnapshotsStatusAction.java

                TransportNodesSnapshotsStatus.TYPE,
                new TransportNodesSnapshotsStatus.Request(nodesIds.toArray(Strings.EMPTY_ARRAY)).snapshots(snapshots)
-                    .timeout(request.masterNodeTimeout()),
+                    .timeout(request.masterNodeTimeout().millis() < 0 ? null : request.masterNodeTimeout()),


Shouldn't we compare with -1 explicitly here and everywhere below, since in theory a user could specify any negative number? Otherwise the docs should say that any negative number works as an infinite timeout (which sounds confusing).

We only accept -1 when parsing a time value:

elasticsearch/libs/core/src/main/java/org/elasticsearch/core/TimeValue.java

Lines 375 to 384 in 0699c93

    if (value < -1) {
        // -1 is magic, but reject any other negative values
        throw new IllegalArgumentException(
            "failed to parse setting ["
                + settingName
                + "] with value ["
                + initialInput
                + "] as a time value: negative durations are not supported"
        );
    }

However, the convention elsewhere in the code seems to be to compare `.millis() < 0`. I think we could make this neater for sure, but that's a job for another day. The docs definitely don't need to mention any lenience in this area.
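For readers following the thread, the two checks discussed here can be sketched together. This is a simplified, hypothetical stand-in rather than the real `TimeValue` class: `MasterTimeoutSketch`, `parseMillis` and `toNodeRequestTimeout` are names invented for this example, the parser only accepts a plain number of milliseconds (the real parser also handles units), and `java.time.Duration` is used in place of Elasticsearch's own time type.

```java
import java.time.Duration;

// Hypothetical illustration only; this is not the Elasticsearch TimeValue API.
final class MasterTimeoutSketch {

    /**
     * Parses a plain millisecond count. As in the check quoted above, -1 is
     * accepted as the "infinite" sentinel and any other negative value is rejected.
     */
    static long parseMillis(String settingName, String initialInput) {
        long value = Long.parseLong(initialInput.trim());
        if (value < -1) {
            throw new IllegalArgumentException(
                "failed to parse setting ["
                    + settingName
                    + "] with value ["
                    + initialInput
                    + "] as a time value: negative durations are not supported"
            );
        }
        return value;
    }

    /**
     * Mirrors the "millis() < 0 ? null : timeout" convention from the diff:
     * a negative master timeout becomes null, meaning "no timeout".
     */
    static Duration toNodeRequestTimeout(long masterTimeoutMillis) {
        return masterTimeoutMillis < 0 ? null : Duration.ofMillis(masterTimeoutMillis);
    }

    public static void main(String[] args) {
        long infinite = parseMillis("master_timeout", "-1");
        System.out.println(toNodeRequestTimeout(infinite));
        System.out.println(toNodeRequestTimeout(parseMillis("master_timeout", "30000")));
    }
}
```

Because the parsing step already rejects everything below `-1`, the looser `< 0` comparison downstream only ever sees `-1` for values that came through parsing, which is why the two conventions do not conflict in practice.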

volodk85

LGTM

This test doesn't fail anymore; I've run it 1000 times locally. The test was introduced in #107050, and I believe it was fixed in #107675. Unfortunately, the test was muted before #107675 was merged, so I can't confirm that the PR actually fixed the test on CI.

DaveCTurner added the >bug, :Distributed Coordination/Cluster Coordination, and v8.14.0 labels Apr 3, 2024

DaveCTurner requested review from henningandersen and volodk85 April 3, 2024 12:16

elasticsearchmachine added the Team:Distributed (Obsolete) label Apr 3, 2024

Update docs/changelog/107050.yaml

c3d5387

DaveCTurner added 3 commits April 3, 2024 15:30

Merge branch 'main' into 2024/04/03/master-node-timeout

a4ea7a8

Reword

ca81553

Merge branch 'main' into 2024/04/03/master-node-timeout

545243e

volodk85 reviewed Apr 9, 2024

DaveCTurner requested a review from volodk85 April 10, 2024 05:58

volodk85 approved these changes Apr 10, 2024

DaveCTurner merged commit ccbb5ba into elastic:main Apr 10, 2024

DaveCTurner deleted the 2024/04/03/master-node-timeout branch April 10, 2024 17:32

idegtiarenko mentioned this pull request Apr 12, 2024

[CI] SnapshotStatusApisIT testInfiniteTimeout failing #107405

Closed

arteam mentioned this pull request May 2, 2024

Unmute SnapshotStatusApisIT#testInfiniteTimeout #108178

Merged

DaveCTurner restored the 2024/04/03/master-node-timeout branch June 17, 2024 06:17

DaveCTurner deleted the 2024/04/03/master-node-timeout branch July 30, 2024 07:06
