Skip to content

Conversation

@gwbrown
Copy link
Contributor

@gwbrown gwbrown commented Mar 31, 2021

This commit hooks up the Node Shutdown API to the Node Shutdown cluster
metadata, so using the API will result in the appropriate writes to the
cluster state.

Relates #70338

This commit hooks up the Node Shutdown API to the Node Shutdown cluster
metadata, so using the API will result in the appropriate writes to the
cluster state.
@gwbrown gwbrown requested a review from dakrone April 1, 2021 22:22
@gwbrown gwbrown marked this pull request as ready for review April 1, 2021 22:22
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Apr 1, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I left three comments, but nothing major!

* @return All {@link SingleNodeShutdownMetadata}s that currently exist.
*/
public List<SingleNodeShutdownMetadata> getAllNodeMetadata() {
return new ArrayList<>(nodes.values());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not return the Map here rather than making a list from the values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that at one point, but there's not much additional value in exposing the map I think. If we need to do so, it's easy enough to change.

I don't care strongly about this one way or the other, should I change it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer the Map because I think it'll be a bit easier when testing since we can do get-by-id for nodes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I'll remove the getNodeMetadata(nodeId) method above then and replace it with map access as well.

Comment on lines 59 to 62
// Verify that the requested node actually exists in the cluster
if (state.nodes().nodeExists(request.getNodeId()) == false) {
throw new IllegalArgumentException("there is no node with id [" + request.getNodeId() + "] in this cluster");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should have this constraint, for example, if you need to add a shutdown for a node that is having connectivity issues, we should still allow marking it as shut down so that data can be migrated, rather than playing whack-a-mole waiting for it to be part of the cluster so that it can be marked as shutting down.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I wasn't thinking about that when I wrote this. It would be nice to have some kind of validation, so that if you completely fumble the node ID it doesn't just silently accept that.

Looking around at the code though, it doesn't look like there's much (any?) validation of node IDs - a lot of tests just use arbitrary strings, so maybe there's nothing we can really do here (besides maybe a length limit?).

Comment on lines 74 to 77
// Verify that there's not already a shutdown metadata for this node
if (Objects.nonNull(currentShutdownMetadata.getNodeMetadata(request.getNodeId()))) {
throw new IllegalArgumentException("node [" + request.getNodeId() + "] is already shutting down");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we allow a change in type or reason for a shutdown? I think it'd be nice to be able to change a restart type to removal if we decide that we wanted to remove the node after all.

For now though, we can keep this and maybe think about revisiting it in the future, what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we've discussed being able to change the type, at least in one direction. That's a pretty easy follow-up though, so I think we should move forward with this and add that functionality when we need it.

I opened #71305 to keep track.

@gwbrown gwbrown merged commit f9d4991 into elastic:master Apr 6, 2021
gwbrown added a commit to gwbrown/elasticsearch that referenced this pull request Apr 6, 2021
This commit hooks up the Node Shutdown API to the Node Shutdown cluster
metadata, so using the API will result in the appropriate writes to the
cluster state.
gwbrown added a commit that referenced this pull request Apr 12, 2021
This commit hooks up the Node Shutdown API to the Node Shutdown cluster
metadata, so using the API will result in the appropriate writes to the
cluster state.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Core/Infra/Node Lifecycle Node startup, bootstrapping, and shutdown >non-issue Team:Core/Infra Meta label for core/infra team v7.13.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants