-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-16830. Add public IOStatistics API. #2323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-16830. Add public IOStatistics API. #2323
Conversation
Change-Id: Ic0cdd0c3707066fff8b21f9cca6e7265a547267c
|
I see the problem. This PR only includes the common changes, but as that moved some fs.impl methods to being public in a new package, the uses in the s3 code won't build. Briefly I will reinstate the delete class from fs.impl so everything builds again |
This is to allow hadoop-aws to compile without needing to move to the moved interfaces; it will also allow anything external which used those methods (unlikely) to keep working. Also: throw synchronized at methods to get findbugs to STFU. Tempting just to turn it off on the basis that it is overkill, but it could maybe be correct. Making all of MeanStatistics synchronized is irritating, as it will hurt performance on what should be a lightweight class. But it is needed to ensure samples and sum are consistently set. Change-Id: I4c3e2726e1e97d705c55dbb43a507ea4d0e81e28
Change-Id: I5f64704a82a196fd8ff66cbde10d4970722e1fd7
|
findbugs The synchronized was lifted from TreeMap writeObject. When I turn it off I'll be reasonably confident that findbugs will start complaining about unsynchronized accesses to field. Let's see. If it blows up then I'll just disable the check here checkstyles are on the uses of the now deprecated methods; In the full s3a patch these are all migrated; in this PR I'm restricting all changes to hadoop-common, so will not fixe them in this patch |
Change-Id: I3e9bb83bc32eddc5c7d84c1df7be77cea3e60f5d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For DurationTrackers in IOStatisticsStore() if we add a tracker in a try block, what happens to it in case of failure should be looked at to avoid inaccurate values for the trackers.
| } | ||
|
|
||
| @Override | ||
| public long incrementCounter(final String key, final long value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should check if value>0 before incrementing to avoid negative values for counters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will make a no-op. And test this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed now
I was thinking about failure reporting myself
At the same time, try-with-resources is nice. What to do? For each set of duration stats, we add counter/mean/min/max of failures Issue: how best to record a failure, given we can't get at the try-with-resources classes in catch or finally? I'd initially thought we could set it in the catch(), but it'd be out of scope.
Fancy lambda-expression wrapper thing? Doable. Then in that code we'd put the code of option #2 in Fancy curried-function-Haskell-elitism option Duration tracker takes a function and returns a new one you'd get a function back which you could then apply at leisure. Maybe worth doing both. I could also look at adding into the S3A Invoke code, as every iteration of a retried operation we'd want the statistic updated. That is where curried functions come out to play, with something like invoker.invoke(durationTracker.track("listings", () -> s3a.list()). apply At the same time: this gets complex fast. Could we make the design of this a followup? |
|
Update: you will get the ability to wrap a Callable, CallableRaisingIOE or functionRaisingIOE with a duration tracking, which then returns a new instance of the same submit(unboundedThreadPool,
trackDurationOfCallable(
listingContext.getDurationTrackerFactory(),
OBJECT_LIST_REQUEST,
() -> listObjects(request)));When there's a failure we update the failure count, move timings into the failures section. Attempts are counted for both, so success == attempts - failures. |
- failures are registered and reported differently - helper in statistics.impl to wrap functions and callable with handling for this Move BufferedIOStatistics streams from impl to fs.statistics, so its declared OK for others to use counter increment to ignore negative values; returns existing value. successful increment returns the latest value, as documented. Tests for the counter and gauge behaviour Change-Id: Ic851240ebe94033bf93e0a11abefb7adda534191
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have a doubt regarding getter for Min, Max, and Mean values for DurationTrackers.
for e.g ioStatisticsStore.getMeanStatistic(<STAT_NAME_FOR_DURATIONTRACKER>).mean();
should return a double value for the duration tracker but am met with a java.lang.NullPointerException: unknown statistic <STAT_NAME_FOR_DURATIONTRACKER>
error. This is solved when I add ".mean" at the end of the DurationTracker stat name. The same with Max(".max" has to be added) and Min(".min" has to be added) to fetch their values.
Was thinking if we could add it to the getter rather than having to append with stat name.
| return 0; | ||
| } | ||
| if (value < 0) { | ||
| LOG.debug("Ignoring negative incrment value {} for counter {}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo in debug statement (incrment).
I'm a bit reluctant to, as then people would ask about failures next. What could be handy would be something in the support class to get all stats for a duration (or null), with some struct to contain them all. e.g
|
on hasNext/Next, close() is called, so as to cleanup any open state/handles etc. Avoids having to tell developers downstream to look for an iterator being closeable, and call it if so. + minor improvements in iterator test suite + fix typo found by Mehakmeet Change-Id: Ibd103941278b8c0bc6c93a09ea469be4c60a58b1
|
In IOStatisticsBinding class we have methods for tracking duration but, I am not able to wrap it around a normal function. |
+ add class which can be used to store duration results, primarily for testing. Made public for others to play with too Change-Id: I0a7f274f5a2a180e1800102be177b308050725c0
|
|
Yetus: javac is all about the deprecation changes fixed in #2324 checkstyles from new tests & code will fix those |
* move the methods in fs.impl.FutureIOSupport which turn out to be needed when working with the async code to the public utils.functional package * duration tracking invocation in IOStatisticsBinding tuning based on the ABFS coding experience. Change-Id: Ide9467fa67a8a5d3d8daec94a7a39ff87f106a47
Fix checkstyle in RemoteIOException unwrapping of exception looks for Error and rethrows Change-Id: I673b1e487cae2ec3204ce6ba1103cdae14cfe48c
|
javac failures on newly deprecated static fs.impl.FutureIOSupport methods I'm retaining to avoiding breaking things; checkstyle on 4 javadoc lines which are too wide because they reference other classes |
* the trackDuration(DurationTrackerFactory,...) methods support null factory and switch to a stub tracker * there's an overloaded logIOStatisticsAtDebug() method to use the IOStatisticsLogging own logger Change-Id: Ie786dc7aa8920a3fc8b22b55916f4b811810dab3
This comment has been minimized.
This comment has been minimized.
|
build failure was in native IO. full PR in #2324 didn't show this. I'd rebase this PR except it would complicate everything else too much |
* PairedDurationTrackerFactory allows listing code to update both FS and instance level stats. * test assertions to become verifyStatistic* assertStatistic* to make clear these are statistic assertions; also helps code completion * stream_read_unbuffered stream key Change-Id: I25272fffeb3e66b5cec90ae6c6af74808c139b26
|
Also, should use java.io.UncheckedIOException for the wrapper class for an IOE. That's new in Java8, but as its public API there, what we should adopt. This is good, even if its a bit more work |
|
This PR won't merge right now due to the move to shaded Preconditons. The PR #2324 has been rebased; I'll be rebuilding this PR from that one as a trunk + one big fat patch |
|
Closing as it's impossible to maintain the interface and impl branches. The combined -common and -aws changes make for a bigger pr, but each patch has gone in separately, It will be possible to merge the changes into Hadoop as two separate patches |
Hadoop-common side of the patch