Make ILM aware of node shutdown #73690
Conversation
This commit makes ILM aware of different parts of the node shutdown lifecycle. It consists of two main parts: reacting to shutdown state during execution, and signaling the status of a shutdown from ILM.

Reacting to shutdown state

ILM now considers nodes that are going to be shut down when deciding which node to assign for the shrink action. It uses the `NodeShutdownAllocationDecider` within the `SetSingleNodeAllocateStep` so that shards are not assigned to a node that will be removed. If an index is already past this step and waiting for allocation, this commit adds an `isCompletable` method to the `ClusterStateWaitUntilThresholdStep` so that an allocation that cannot happen can be rewound and retried on another (non-shutdown) node.

Signaling shutdown status

This commit introduces the `PluginShutdownService`, which deals with `ShutdownAwarePlugin` classes. This class is used to signal shutdowns to plugins, and also to gather the status of a shutdown from those plugins. ILM implements `ShutdownAwarePlugin` to signal when an index is in a step that is unsafe to interrupt, such as the actual shrink step, so that shutdown will wait until the allocation rules have been removed by ILM.

This commit also hooks up the get shutdown API response so that it considers the statuses of its parts (see `SingleNodeShutdownMetadata.Status#combine`) when creating a response.

Relates to elastic#70338
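To illustrate the node-selection change, here is a minimal sketch (not the actual `SetSingleNodeAllocateStep` code; the `withoutShuttingDownNodes` helper and `shutdownsByNodeId` map are hypothetical stand-ins for what the step derives from cluster state, and the imports assume the usual Elasticsearch package locations):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.elasticsearch.cluster.metadata.SingleNodeShutdownMetadata;
import org.elasticsearch.cluster.node.DiscoveryNode;

class ShrinkNodeFilter {
    // Drop nodes registered for a REMOVE-type shutdown from the candidate set
    // before the step picks the single node to collocate shards on for shrink.
    static List<DiscoveryNode> withoutShuttingDownNodes(List<DiscoveryNode> validNodes,
            Map<String, SingleNodeShutdownMetadata> shutdownsByNodeId) {
        return validNodes.stream()
            .filter(node -> {
                SingleNodeShutdownMetadata shutdown = shutdownsByNodeId.get(node.getId());
                return shutdown == null || shutdown.getType() != SingleNodeShutdownMetadata.Type.REMOVE;
            })
            .collect(Collectors.toList());
    }
}
```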
Pinging @elastic/es-core-features (Team:Core/Features)
Pinging @elastic/es-core-infra (Team:Core/Infra)
@elasticmachine run elasticsearch-ci/packaging-tests-windows-sample |
andreidan left a comment
Thanks Lee.
This generally looks great, I had one question.
```java
.map(nmm -> nmm.get(idShardsShouldBeOn))
.map(snsm -> snsm.getType() == SingleNodeShutdownMetadata.Type.REMOVE)
```
Maybe a personal preference, so feel free to ignore, but I find it very difficult to read `nmm`, `snsm`, and `c` below in `IndexLifecycleService`.
Okay, I've renamed these to hopefully be better.
```java
if (nodeBeingRemoved) {
    completable = false;
    return new Result(false, new SingleMessageFieldInfo("node with id [" + idShardsShouldBeOn +
        "] is currently marked as shutting down for removal"));
}
```
Would it make sense to move this check below, and only execute it if we did NOT already relocate all the necessary shards to the target node here?
If we're ready to execute shrink, should the shutdown wait for the shrink action/task to finish? `IndexLifecycleService` already signals that shutdown should not be executed if we're in this step (in `readyToShutdown`).
As opposed to allowing the shutdown to continue and then re-doing the shard allocation to another node? (This way we could avoid DTS issues or, if in the same zone, generally moving GBs of data around the cluster, but maybe it's impractical from the node shutdown infrastructure perspective?)
What do you think?
Hmm... I would say they are both around the same, so maybe it's better to do the check afterwards. This would mean that we'd prevent shutdown for slightly longer, but avoid extra relocation if the allocation were already complete.
I moved this check down below and added a test for the differing behavior.
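The resulting order of checks looks roughly like this (a sketch of the agreed behavior, not the literal committed code; `allShardsActiveOnTarget` is a hypothetical name for what the step derives from the routing table):

```java
// If allocation already finished, proceed to shrink rather than rewinding and
// relocating GBs of shards just because the target node is marked for removal.
if (allShardsActiveOnTarget) {
    return new Result(true, null);
}
// Otherwise, a node marked for REMOVE-type shutdown can never satisfy the
// wait condition, so mark the step as not completable to allow a rewind.
if (nodeBeingRemoved) {
    completable = false;
    return new Result(false, new SingleMessageFieldInfo("node with id [" + idShardsShouldBeOn +
        "] is currently marked as shutting down for removal"));
}
```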
```java
int statusOrd = -1;
for (Status status : statuses) {
    // Max the status up to, but not including, "complete"
    if (status != COMPLETE) {
        statusOrd = Math.max(status.ordinal(), statusOrd);
    }
}
if (statusOrd == -1) {
    // Either all the statuses were complete, or there were no statuses given
    return COMPLETE;
} else {
    return Status.values()[statusOrd];
}
```
I really want to be a functional programming dork and tell you to use reduce here, but I think this is actually clearer than what you'd have to do to make reduce work.
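For comparison, the reduce-based version would look something like this (a sketch only; it assumes it lives inside the `Status` enum so that `COMPLETE` is in scope, and the loop above is what actually shipped):

```java
// Equivalent to the loop above: the maximum non-COMPLETE status by ordinal,
// falling back to COMPLETE when every status is COMPLETE (or none were given).
public static Status combine(Collection<Status> statuses) {
    return statuses.stream()
        .filter(status -> status != COMPLETE)
        .reduce((a, b) -> a.ordinal() >= b.ordinal() ? a : b)
        .orElse(COMPLETE);
}
```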
gwbrown left a comment
Left the tiniest nitpick, otherwise LGTM now that you've addressed Andrei's comments.
x-pack/plugin/ilm/src/test/java/org/elasticsearch/xpack/ilm/IndexLifecycleServiceTests.java
andreidan left a comment
LGTM, thanks Lee
```java
/**
 * Check with registered plugins whether the shutdown is safe for the given node id and type
 */
public boolean readyToShutdown(String nodeId, SingleNodeShutdownMetadata.Type shutdownType) {
```
I wonder if it's worth having this method take an extra `ClusterState` parameter, and pass that as an extra argument to each of the `plugin.safeToShutdown` calls it makes. It would make it easier for plugins that don't currently have their own `ClusterService` reference to implement the interface.
We discussed adding these today, but since there's no current user, we're going to keep the cluster state out of the interface for now, and revisit it when ML (or a different plugin) has an implementation that needs it.
OK cool. I am going to work on the ML PR soon, so can add the arguments to that if you don't have a fundamental objection.
> so can add the arguments to that if you don't have a fundamental objection.
If possible, I think I'd prefer to keep them out of the interface. Especially for the `signalShutdown` method: if the cluster state is added, it's essentially no different from a regular `ClusterStateListener` call. I think it's okay to add it to `readyToShutdown` because that is a one-off call (not called on every cluster state change).
Would that work for you?
Since both methods are implemented by the same class, it doesn't really help to just add the current cluster state to one of them. I can instead add a reference to the `ClusterService` to the class that implements the interface, and then that can be used in both methods.
That would be my preference then, as long as that isn't too distasteful of a solution for you.
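A minimal sketch of that compromise (a hypothetical implementation, not code from this PR; the class name is made up and the imports assume the usual Elasticsearch package locations):

```java
import java.util.Collection;

import org.elasticsearch.cluster.ClusterState;
import org.elasticsearch.cluster.metadata.SingleNodeShutdownMetadata;
import org.elasticsearch.cluster.service.ClusterService;
import org.elasticsearch.plugins.ShutdownAwarePlugin;

// Hypothetical plugin that keeps its own ClusterService reference, as agreed
// above, instead of receiving ClusterState through the interface methods.
class ExampleShutdownAwarePlugin implements ShutdownAwarePlugin {
    private final ClusterService clusterService;

    ExampleShutdownAwarePlugin(ClusterService clusterService) {
        this.clusterService = clusterService;
    }

    @Override
    public boolean safeToShutdown(String nodeId, SingleNodeShutdownMetadata.Type shutdownType) {
        // One-off call (only when shutdown status is requested), so fetching
        // the current state here on demand is cheap enough.
        ClusterState state = clusterService.state();
        // ... inspect `state` for work still pinned to `nodeId` ...
        return true;
    }

    @Override
    public void signalShutdown(Collection<String> shutdownNodeIds) {
        // Called on every node for each cluster state, so return quickly.
    }
}
```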
```java
Set<String> shutdownNodes = shutdownNodes(state);
for (ShutdownAwarePlugin plugin : plugins) {
    try {
        plugin.signalShutdown(shutdownNodes);
```
Similarly, it would be nice if `state` was passed as an extra argument here, so that plugins that don't currently have their own reference to `ClusterService` can look at the current cluster state.
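For context, the excerpt above is cut off at the `try`; presumably (an assumption, since the rest isn't shown in this diff hunk) each plugin call is isolated so one failing plugin can't prevent the others from being signaled, roughly:

```java
// Assumed continuation of the loop above (not shown in the excerpt):
for (ShutdownAwarePlugin plugin : plugins) {
    try {
        plugin.signalShutdown(shutdownNodes);
    } catch (Exception e) {
        // Swallow and log so a misbehaving plugin doesn't break the rest.
        logger.warn("error signaling shutdown to plugin", e);
    }
}
```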
```java
/**
 * Whether the plugin is considered safe to shut down. This method is called when the status of
 * a shutdown is retrieved via the API, and it is only called on the master node.
 */
boolean safeToShutdown(String nodeId, SingleNodeShutdownMetadata.Type shutdownType);
```
Please consider adding an extra `ClusterState` parameter to this method. I think almost every implementation of this will involve looking for something in the cluster state.
```java
/**
 * A trigger to notify the plugin that a shutdown for the nodes has been triggered. This method
 * will be called on every node for each cluster state, so it should return quickly.
 */
void signalShutdown(Collection<String> shutdownNodeIds);
```
Please consider adding an extra `ClusterState` parameter to this method. I think almost every implementation of this will involve looking for something in the cluster state.