-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Report timing stats as part of the Job stats response #42709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report timing stats as part of the Job stats response #42709
Conversation
|
Pinging @elastic/ml-core |
|
run elasticsearch-ci/bwc |
acf1415 to
ca04d84
Compare
|
I think you need to add a method Additionally, the new field names should go in Finally, are you still planning to add another field for some sort of exponential moving average in addition to the average over all time? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be version-aware so it works in a mixed version cluster:
if (out.getVersion().onOrAfter(V_7_3_0)) {
out.writeOptionalWriteable(timingStats);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be version-aware so it works in a mixed version cluster:
if (in.getVersion().onOrAfter(V_7_3_0)) {
timingStats = in.readOptionalWriteable(TimingStats::new);
} else {
timingStats = null;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@droberts195 is right. Additionally, making it onOrAfter(V_7_3_0) immediately could break the bwc tests as the actual v7.3.0 code has not been backported to yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, since this is the first BWC issue JobStats serialization, BWC should be added for job stats.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@droberts195: Done.
@benwtrent: Any hints on how to add BWC for job stats?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@przemekwitek you should be able to add them to the BWC YAML tests. These are in xpack.qa.rolling-upgrade.resources. Probably cases added in 30_ml_jobs_crud.yml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I discussed BWC tests with @dimitris-athanasiou and he suggested that it will be easier if I add them (in a separate PR) after merging in this PR into master and backporting into 7.x. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it will be much easier to add the BWC tests in a follow-up PR. Otherwise you'd have to add them on the basis that 7.x doesn't contain this functionality, then adjust both master and 7.x once 7.x does contain this functionality. So I agree with the idea of adding them in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@droberts195 is right. Additionally, making it onOrAfter(V_7_3_0) immediately could break the bwc tests as the actual v7.3.0 code has not been backported to yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bucketCount is not optional. If it wrote a null there would be errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please make 0.1 a private constant so that it is self-documenting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
I've also removed the "p" parameter as it is not used in any meaningful way at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bucketCount is not optional
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#nit
| TimingStats timingStats = null; | |
| TimingStats timingStats = randomBoolean() ? null : TimingStatsTests.createTestInstance("foo"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (refactored the whole method this way).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we disallow negatives?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| this.timingStats = timingStats; | |
| this.timingStats = new TimingStats(timingStats); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you elaborate on why do you think this change is needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@przemekwitek TimingStats has a method that allows its internal state to mutate. Not doing a copy would allow the parameter timingStats#updateStats to be called AFTER it is passed to this method causing the Builder#timingStats to be updated as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that. It's just this setter in builder is only called once and the temporary object is not saved anywhere else:
paramsBuilder.setTimingStats(parseSearchHit(hit, TimingStats.PARSER, errorHandler));
But ok, I applied your suggestion to prevent some random refactoring of breaking this assumption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, since this is the first BWC issue JobStats serialization, BWC should be added for job stats.
Done.
Done.
Yes, but I'd rather do it in a separate PR as I wanted to check in and verify the base work first. |
|
run elasticsearch-ci/2 |
3ab9022 to
dc385ac
Compare
droberts195
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Initial version of timing stats reporting.
TimingStats object returned as part of the GetJobStats response contains 3 numbers: min, max and avg of bucket processing times.
Additionally, it contains bucket count. This bucket count may differ from the bucket count stored in DataCounts.
Related issue: #29857