-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Add has_voting_exclusions flag to cluster health output #38568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I asked for a few docs changes and noted the BWC issue that we face here.
I also wonder, is it easy to only include this field in the JSON output if true, so that in the common case we wouldn't see any change?
| unassignedShards = in.readVInt(); | ||
| numberOfNodes = in.readVInt(); | ||
| numberOfDataNodes = in.readVInt(); | ||
| if (in.getVersion().onOrAfter(Version.V_7_0_0)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the time this lands, master will be V_8_0_0, and you will have to do the BWC dance to backport it to 7.x and 7.0. I also think this counts as a feature and therefore it shouldn't be backported to 7.0?
The same comment applies to other places that look like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I have done the following things:
- Repurposed the PR to 8.0 and 7.1
- Disabled bwc tests for PR that goes to 8.0
- Added TODO to change V_7_0_0 once Add 7.1 version constant to 7.x branch #38513 is merged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now because #38513 is merged, I've changed the version to V_7_1_0
| <7> Number of shards | ||
| <8> Number of replicas | ||
| <9> Index status | ||
| <5> Check if there are voting exclusions in the cluster state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could voting exclusions here be a link to the section entitled Voting configuration exclusions API, and could you add a
NOTE: Clusters should have no voting configuration exclusions in normal operation.
to those API docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| -------------------------------------------------- | ||
| <1> Number of nodes in the cluster | ||
| <2> Number of data nodes in the cluster | ||
| <3> Whether cluster has voting exclusions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could voting exclusions here be a link to the section entitled Voting configuration exclusions API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
@DaveCTurner it's ready for the second pass. |
DaveCTurner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On reflection I think we shouldn't do this, sorry. Although it might sometimes be very useful to see this flag in cluster health, I think we could probably say the same about many other settings. I think it's enough to expose this in the support diagnostics tooling as planned elsewhere.
This PR adds "has_voting_exclusions" flag to cluster health output.
Today voting exclusions can be already obtained by inspecting cluster_state.metadata.coordination_metadata. However, we want to make them visible, because incorrectly configured voting exclusions can render cluster non-functional. Having voting exclusions temporary is ok, however, once desired nodes are stopped, we expect users to remove them from the cluster state.
Cluster health output is the first thing users are looking at and sharing with others when trying to troubleshoot, so we decided to include this information in cluster health.