Skip to content

MasterService does not complete all tasks on shutdown #94930

@DaveCTurner

Description

@DaveCTurner

Today when the MasterService shuts down, it fails waiting tasks but does not necessarily fail the ongoing batch of tasks. For instance, we just drop the batch on the floor here:

if (lifecycle.started() == false) {
logger.debug("processing [{}]: ignoring, master service not started", summary);
listener.onResponse(null);
return;
}

and we swallow rejections here:

assert publicationMayFail() || (exception instanceof EsRejectedExecutionException esre && esre.isExecutorShutdown())
: exception;
clusterStateUpdateStatsTracker.onPublicationFailure(
threadPool.rawRelativeTimeInMillis(),
clusterStatePublicationEvent,
0L
);
handleException(summary, publicationStartTime, newClusterState, exception);

This behaviour has existed for a long time (i.e. it was not introduced by recent changes in the area such as #92021 and #94325) but I still think we should improve it. Note however that it does not work simply to fail the ongoing tasks on rejection: today with acked tasks we call (at most) one of onAllNodesAcked(), onAckFailure(), onAckTimeout(), or ClusterStateTaskListener#onFailure(), and implementations rely on this fact, but we may experience a rejection exception after acking has completed. I think that means we have to delay the acking until the end of the publication, because the alternative would be to suppress onFailure() calls for acked tasks which seems like a confusing API choice that will lead to bugs.

Metadata

Metadata

Assignees

No one assigned

    Labels

    :Distributed Coordination/Cluster CoordinationCluster formation and cluster state publication, including cluster membership and fault detection.>bugTeam:Distributed (Obsolete)Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions