Conversation

@gwbrown
Contributor

@gwbrown gwbrown commented Jun 8, 2021

This PR modifies the Node Shutdown Status API to include information about the number of shards left to migrate off of the node in question, as well as checking if shard migration is stalled.

"Stalled" is defined as no shards currently relocating off the node, and at least one shard on the node which cannot move per current allocation deciders.

Relates #70338

@gwbrown gwbrown added >non-issue v8.0.0 :Core/Infra/Node Lifecycle Node startup, bootstrapping, and shutdown v7.14.0 labels Jun 8, 2021
@gwbrown
Contributor Author

gwbrown commented Jun 10, 2021

@dakrone I'm going to request your review on this because the production code and unit tests are reviewable. I'm still intending to add integration tests, but I expect those to be pretty straightforward, and all the behavior should already be covered by the unit tests.

@gwbrown gwbrown requested a review from dakrone June 10, 2021 00:29
Member

@dakrone dakrone left a comment


I left a comment on this; I think we need to handle INITIALIZING shards as part of the count as well. What do you think?


public ShutdownShardMigrationStatus() {
this.status = SingleNodeShutdownMetadata.Status.COMPLETE;
public ShutdownShardMigrationStatus(SingleNodeShutdownMetadata.Status status, long shardsRemaining) {
Member

I think at some point we should just pull SingleNodeShutdownMetadata.Status into its own top-level class named something like ShutdownComponentStatus, since it's being used in quite a few places (not in this PR, though)

) {
// Only REMOVE-type shutdowns will try to move shards, so RESTART-type shutdowns should immediately complete
if (SingleNodeShutdownMetadata.Type.RESTART.equals(shutdownType)) {
return new ShutdownShardMigrationStatus(SingleNodeShutdownMetadata.Status.COMPLETE, 0);
Member

Might as well include a "reason" string here; it'll be helpful for users who might not be aware that restarts don't migrate shards.

Contributor Author

Good idea, thanks!

Comment on lines 169 to 171
int currentShardsOnNode = currentState.getRoutingNodes().node(nodeId).numberOfShardsWithState(ShardRoutingState.STARTED);
int currentlyRelocatingShards = currentState.getRoutingNodes().node(nodeId).numberOfShardsWithState(ShardRoutingState.RELOCATING);
int totalRemainingShards = currentlyRelocatingShards + currentShardsOnNode;
Member

I think this is missing dealing with INITIALIZING shards, which may be assigned but are part of the total? Should those be considered as shards currently on the node that need to be migrated?

Contributor Author

Very good question. Since the allocation decider should prevent any new shards from being allocated to the node, the only INITIALIZING shards should be ones which were already assigned to the node before the shutdown was registered, but there still might be some.

In what circumstances would shards stay INITIALIZING for a long time? I'm not sure if snapshot restoration shows up as that, or another status - do you know off the top of your head?

INITIALIZING shards can't be relocated per my understanding, so we'd have to wait until they move to STARTED... and then would they relocate? Or would we have to trigger a reroute again? I think I'm going to have to ask the distributed team about some of this.

Member

In what circumstances would shards stay INITIALIZING for a long time? I'm not sure if snapshot restoration shows up as that, or another status - do you know off the top of your head?

Initial recovery during startup, as well as opening a closed index, moves shards to INITIALIZING I think, and if a user has set the check_on_startup setting it can take quite a while. I don't know about snapshot restoration, unfortunately.

INITIALIZING shards can't be relocated per my understanding, so we'd have to wait until they move to STARTED... and then would they relocate? Or would we have to trigger a reroute again?

I believe they cannot be relocated during INITIALIZING state (it would make sense you can't relocate a shard recovering from local disk), and I think they will be subject to a relocation/reroute once they've reached the STARTED state, but don't hold me to that :)

Contributor Author

Henning told me elsewhere that (among other things):

If we have cases of shards getting "stuck" in INITIALIZING, I would call it a bug, but we have seen cases of slowness (which is possibly also a bug).

So we shouldn't get cases of shards being stuck in INITIALIZING indefinitely, but there may be some where they are very slow to move out of that state. So this now counts initializing shards towards the shards remaining count, and if there are only initializing shards left, reports IN_PROGRESS with a reason explaining that there are only INITIALIZING shards left. That should help surface unexpected conditions more quickly. I'm not sure we can get much more detail there easily, though.
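
The revised counting described above can be sketched as follows. This is a minimal illustration only; the class, field names, and reason strings are stand-ins, not the actual Elasticsearch implementation:

```java
// Illustrative sketch: INITIALIZING shards now count toward the remaining
// total, and a dedicated reason is reported when they are all that's left.
// All names here are hypothetical stand-ins for the real routing-table types.
class RemainingShards {
    final int started;
    final int relocating;
    final int initializing;

    RemainingShards(int started, int relocating, int initializing) {
        this.started = started;
        this.relocating = relocating;
        this.initializing = initializing;
    }

    /** Total shards still to migrate off the node, including INITIALIZING ones. */
    int total() {
        return started + relocating + initializing;
    }

    /** True when the only shards left on the node are still INITIALIZING. */
    boolean onlyInitializingLeft() {
        return initializing > 0 && started == 0 && relocating == 0;
    }

    String statusReason() {
        if (total() == 0) {
            return "COMPLETE";
        }
        if (onlyInitializingLeft()) {
            return "IN_PROGRESS: all remaining shards are INITIALIZING";
        }
        return "IN_PROGRESS";
    }
}
```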

@gwbrown gwbrown marked this pull request as ready for review June 15, 2021 23:15
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Jun 15, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@gwbrown gwbrown requested a review from dakrone June 15, 2021 23:15
Member

@dakrone dakrone left a comment


LGTM, I left two really minor comments, but otherwise looks good!

snapshotsInfoService.snapshotShardSizes(),
System.nanoTime()
);
allocation.debugDecision(true);
Member

As far as I can see, we don't use the actual message for the decision anywhere. Since we don't, I think we can remove this line and avoid the debug decision (which allows allocation deciders to short-circuit and be a bit more efficient).

Either that, or maybe we can log the actual decision (which will include a plethora of details) at TRACE level if an unmovable shard is found? That would be nice for debugging, I think.

Contributor Author

I'm not exactly sure why, but calling allocationService.explainShardAllocation() with a RoutingAllocation that doesn't have debugDecision set to true trips this assertion. I think that might just be because the only other place that calls this method is the Allocation Explain API, so we might be able to change it but I didn't want to do so without consulting the distributed team. Which could be a follow-up PR?

In the meantime, I'll change this to at least log the decision at TRACE.

Member

@dakrone dakrone Jun 16, 2021

also, you can use:

allocation.setDebugMode(DebugMode.EXCLUDE_YES_DECISIONS);

in place of allocation.debugDecision(true) to return only the "NO" decisions, which will also cut down on the amount of logging (while still being in debug mode)
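
The idea behind excluding YES decisions can be sketched like this. The types below are hypothetical stand-ins, not the actual RoutingAllocation/DebugMode machinery; the sketch only shows why filtering out YES results keeps the debug output small:

```java
// Illustrative sketch of the EXCLUDE_YES_DECISIONS idea: when collecting
// per-decider explanations, keep only the non-YES ones so debug output
// stays focused on what actually blocked the shard.
// Decision, DeciderResult, and DebugFilter are stand-in names.
import java.util.List;
import java.util.stream.Collectors;

enum Decision { YES, NO, THROTTLE }

class DeciderResult {
    final String decider;
    final Decision decision;

    DeciderResult(String decider, Decision decision) {
        this.decider = decider;
        this.decision = decision;
    }
}

class DebugFilter {
    // Drop YES results; NO and THROTTLE explanations are the useful ones.
    static List<DeciderResult> excludeYes(List<DeciderResult> results) {
        return results.stream()
                .filter(r -> r.decision != Decision.YES)
                .collect(Collectors.toList());
    }
}
```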

.filter(pair -> pair.v2().getMoveDecision().getAllocationDecision().equals(AllocationDecision.THROTTLED) == false)
// These shards will move as soon as possible
.filter(pair -> pair.v2().getMoveDecision().getAllocationDecision().equals(AllocationDecision.YES) == false)
.findFirst();
Member

If you don't want to use the allocation decision (from my comment above, because you don't use the v2() anywhere but for filtering), this could just be:

.map(Tuple::v1)
.findFirst();

And then it only needs to be an Optional<ShardRouting> above
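
The suggested simplification can be sketched as below. Pair here is a hand-rolled stand-in for the Tuple class, and the string decisions stand in for AllocationDecision values; only the stream shape matters:

```java
// Hedged sketch of the suggestion above: map to v1 before findFirst, so the
// result is an Optional of the shard itself rather than the whole pair.
// Pair and the string decision values are illustrative stand-ins.
import java.util.List;
import java.util.Optional;

class Pair<A, B> {
    final A v1;
    final B v2;

    Pair(A v1, B v2) { this.v1 = v1; this.v2 = v2; }

    A v1() { return v1; }
    B v2() { return v2; }
}

class FirstUnmovable {
    static Optional<String> firstBlockedShard(List<Pair<String, String>> shardDecisions) {
        return shardDecisions.stream()
                // Throttled shards will move once throttling allows it
                .filter(p -> p.v2().equals("THROTTLED") == false)
                // These shards will move as soon as possible
                .filter(p -> p.v2().equals("YES") == false)
                // Only the shard itself is needed past this point
                .map(Pair::v1)
                .findFirst();
    }
}
```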

Contributor Author

Good call, thanks!

@gwbrown gwbrown merged commit 335266c into elastic:master Jun 17, 2021
gwbrown added a commit to gwbrown/elasticsearch that referenced this pull request Jun 17, 2021
gwbrown added a commit that referenced this pull request Jun 17, 2021