Periodically retry persistent task allocation

Currently the persistent tasks framework attempts to allocate unallocated persistent tasks in the following situations:

* Persistent tasks are changed
* A node joins or leaves the cluster
* The routing table is changed
* Custom metadata in the cluster state is changed
* A new master node is elected

When an ML node fails we need to reallocate ML jobs according to their memory requirements, taking into account the memory requirements of other open ML jobs.  The "A node joins or leaves the cluster" trigger causes an attempt to do this, but the master node doing the reallocation may not have the up-to-date memory usage statistics needed to make sensible reallocation decisions.  One way to solve this problem is to defer the allocation: return `null` from the call to `getAssignment()` that the failure immediately causes, and instead trigger an asynchronous request to gather the necessary information.  The problem comes when the asynchronous request returns with that information: at this point we need to retry allocation of the persistent tasks whose allocation was deferred, but none of the triggers listed above necessarily fires.
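The deferral pattern described above can be sketched roughly as follows. This is a standalone illustration using only the JDK, not the real persistent tasks API: the class `DeferredAssigner`, the `statsFetcher` supplier, and the use of a node id `String` as the return value are all hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical sketch of deferring an assignment until memory stats are available. */
class DeferredAssigner {
    // Free memory per node id, as last reported (hypothetical data model).
    private final Map<String, Long> freeMemoryByNode = new ConcurrentHashMap<>();
    private volatile boolean statsFresh = false;
    // Stands in for the asynchronous request that gathers memory statistics.
    private final Supplier<CompletableFuture<Map<String, Long>>> statsFetcher;

    DeferredAssigner(Supplier<CompletableFuture<Map<String, Long>>> statsFetcher) {
        this.statsFetcher = statsFetcher;
    }

    /** Returns the chosen node id, or null to defer the decision. */
    String getAssignment(long requiredBytes) {
        if (!statsFresh) {
            // Kick off the refresh and defer. A fuller version would avoid
            // issuing a second request while one is already in flight.
            statsFetcher.get().thenAccept(stats -> {
                freeMemoryByNode.clear();
                freeMemoryByNode.putAll(stats);
                statsFresh = true;
                // Nothing here re-runs allocation for the deferred task:
                // this is the gap that a periodic recheck would close.
            });
            return null;
        }
        // Pick the node with the most free memory that can fit the job.
        return freeMemoryByNode.entrySet().stream()
            .filter(e -> e.getValue() >= requiredBytes)
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElse(null);
    }
}
```

In this sketch the first call always defers; a later call succeeds only if something prompts it after the statistics have arrived.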

We discussed this in the distributed area weekly meeting and decided that the simplest and safest way to achieve this would be to have persistent tasks recheck allocations periodically, say every 30 seconds.

An alternative would be to add an endpoint that lets clients request a recheck of unallocated persistent tasks, but a client that called the endpoint too often could cause an excessive amount of rechecking.

The proposed change is therefore:

* Add a new dynamic cluster setting `cluster.persistent_tasks.allocation.recheck_interval` (default 30s) to control how frequently to recheck allocation of unallocated persistent tasks.
* Change `PersistentTasksClusterService` so that the loop currently in `shouldReassignPersistentTasks()` is factored out into a new method that is also run by the timer callback. If that method returns `true`, the timer callback triggers a cluster state update that calls `PersistentTasksClusterService.reassignTasks()` to change the state.
* `PersistentTasksClusterService.shouldReassignPersistentTasks()` will reset the timer so that if some cluster state update triggers an allocation check then the timer doesn't do another one shortly afterwards.
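As a rough illustration of the timer behaviour proposed in the list above, here is a minimal JDK-only sketch. It does not use the real `PersistentTasksClusterService` or the cluster settings infrastructure: the class `PeriodicRechecker`, its method names, and the two callbacks (`needsReassignment` for the factored-out check, `submitReassignment` for the cluster state update) are assumptions made for the example.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a resettable periodic allocation recheck. */
class PeriodicRechecker implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    private final BooleanSupplier needsReassignment; // the factored-out check
    private final Runnable submitReassignment;       // stands in for the cluster state update
    private volatile long intervalMillis;            // e.g. 30_000 for the proposed default
    private ScheduledFuture<?> pending;

    PeriodicRechecker(long intervalMillis, BooleanSupplier needsReassignment,
                      Runnable submitReassignment) {
        this.intervalMillis = intervalMillis;
        this.needsReassignment = needsReassignment;
        this.submitReassignment = submitReassignment;
    }

    /** Cancels any pending recheck and schedules a new one a full interval from now. */
    synchronized void resetTimer() {
        if (pending != null) {
            pending.cancel(false);
        }
        if (!scheduler.isShutdown()) {
            pending = scheduler.schedule(this::runCheck, intervalMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Models a dynamic setting update: apply the new interval and restart the countdown. */
    void setInterval(long millis) {
        this.intervalMillis = millis;
        resetTimer();
    }

    /** Check path for cluster state changes: also resets the timer to avoid a duplicate check. */
    boolean checkOnClusterStateChange() {
        resetTimer();
        return needsReassignment.getAsBoolean();
    }

    /** Timer callback: run the check, request reassignment if needed, then reschedule. */
    void runCheck() {
        try {
            if (needsReassignment.getAsBoolean()) {
                submitReassignment.run();
            }
        } finally {
            resetTimer();
        }
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Using a one-shot schedule that is re-armed after every check, rather than a fixed-rate task, is what makes the reset cheap: a check triggered by a cluster state change simply pushes the next timed check a full interval into the future.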

Periodically retry persistent task allocation #35792

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Periodically retry persistent task allocation #35792

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions