Skip to content

Conversation

@DaveCTurner
Copy link
Contributor

Today GET _nodes/stats/fs includes {least,most}_usage_estimate
fields for some nodes. These fields have rather strange semantics. They
are only reported on the elected master and on nodes that have been the
elected master since they were last restarted; when a node stops being
the elected master these stats remain in place but we stop updating them
so they may become arbitrarily stale.

This means that these statistics are pretty meaningless and impossible
to use correctly. Even if they were kept up to date they're never
reported for data-only nodes anyway, despite the fact that data nodes
are the ones where we care most about disk usage. The information needed
to compute the path with the least/most available space is already
provided in the rest the stats output, so we can treat the inclusion of
these stats as a bug and fix it by simply removing them in this commit.
Since these stats were always optional and mostly omitted (for opaque
reasons) this is not considered a breaking change.

Today `GET _nodes/stats/fs` includes `{least,most}_usage_estimate`
fields for some nodes. These fields have rather strange semantics. They
are only reported on the elected master and on nodes that have been the
elected master since they were last restarted; when a node stops being
the elected master these stats remain in place but we stop updating them
so they may become arbitrarily stale.

This means that these statistics are pretty meaningless and impossible
to use correctly. Even if they were kept up to date they're never
reported for data-only nodes anyway, despite the fact that data nodes
are the ones where we care most about disk usage. The information needed
to compute the path with the least/most available space is already
provided in the rest the stats output, so we can treat the inclusion of
these stats as a bug and fix it by simply removing them in this commit.
Since these stats were always optional and mostly omitted (for opaque
reasons) this is not considered a breaking change.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Stats)

@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Jul 17, 2020
return current + other;
}

private double addDouble(double current, double other) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated unused method

return cache.getOrRefresh();
}

private static FsInfo stats(FsProbe probe, FsInfo initialValue, Logger logger, @Nullable ClusterInfo clusterInfo) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated but logger was always the static field so no need to pass it in.

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@DaveCTurner DaveCTurner merged commit 7bb748d into elastic:master Jul 20, 2020
@DaveCTurner DaveCTurner deleted the 2020-07-17-remove-path-estimates-from-fs-stats branch July 20, 2020 13:48
@DaveCTurner
Copy link
Contributor Author

Thanks Yannick

DaveCTurner added a commit that referenced this pull request Jul 20, 2020
DaveCTurner added a commit that referenced this pull request Jul 20, 2020
Today `GET _nodes/stats/fs` includes `{least,most}_usage_estimate`
fields for some nodes. These fields have rather strange semantics. They
are only reported on the elected master and on nodes that have been the
elected master since they were last restarted; when a node stops being
the elected master these stats remain in place but we stop updating them
so they may become arbitrarily stale.

This means that these statistics are pretty meaningless and impossible
to use correctly. Even if they were kept up to date they're never
reported for data-only nodes anyway, despite the fact that data nodes
are the ones where we care most about disk usage. The information needed
to compute the path with the least/most available space is already
provided in the rest the stats output, so we can treat the inclusion of
these stats as a bug and fix it by simply removing them in this commit.
Since these stats were always optional and mostly omitted (for opaque
reasons) this is not considered a breaking change.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Jul 20, 2020
Adjusts the version logic used for BWC and reinstates the BWC tests
DaveCTurner added a commit that referenced this pull request Jul 20, 2020
Adjusts the version logic used for BWC and reinstates the BWC tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Data Management/Stats Statistics tracking and retrieval APIs Team:Data Management Meta label for data/management team v7.10.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants