- 
                Notifications
    You must be signed in to change notification settings 
- Fork 3.4k
HBASE-27483 Expose table and region storefiles accessed days and size to the metrics #4891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| 💔 -1 overall 
 
 
 This message was automatically generated. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an interesting change, very nice.
can you fix the spotbug error ?
| public static long ONE_DAY_MS = 24 * 3600 * 1000; | ||
| public static final String STOREFILES_ACCESSED_DAYS_THRESHOLDS = | ||
| "hbase.region.storefiles.accessed.days.thresholds"; | ||
| public static final int[] STOREFILES_ACCESSED_DAYS_THRESHOLDS_DEFAULT = { 7, 30, 90 }; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add few line of javadoc or some line in README.md how to use this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more question, if we are using cache / bucket cache heavily, will this metric retain at the time the store file flushes or opened for cache? It’s fine as is without caring the cache , just want to learn more here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, in this situation, the metrics will retain at the time the store file flushes or opened for cache, although the cached data of store file is accessed at a new time. I'm not sure whether it is critical issue.
| @Override | ||
| public void run() { | ||
| Map<TableName, MetricsTableValues> localMetricsTableMap = new HashMap<>(); | ||
|  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] is this added by spotless?
| metricsHelper.assertGaugeGt(prefix + "storeFilesAccessed7DaysSize", 0, agg); | ||
| metricsHelper.assertGaugeGt(prefix + "storeFilesAccessed30DaysSize", 0, agg); | ||
| metricsHelper.assertGaugeGt(prefix + "storeFilesAccessed90DaysSize", 0, agg); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] is there any way to verify non-zero metric for storeFilesAccessed%sDaysSize in test, e.g. if we can mock the access time ?
| @taklwu sir. Thanks for your review. Please wait for my new commit. | 
| @taklwu sir, I'm sorry to have kept you waiting so long. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
| @taklwu Could you take a look? Thanks. | 
| String ROW_READS_ONLY_ON_MEMSTORE_DESC = "Row reads happening completely out of memstore"; | ||
| String MIXED_ROW_READS = "mixedRowReadsCount"; | ||
| String MIXED_ROW_READS_ON_STORE_DESC = "Row reads happening out of files and memstore on store"; | ||
| String STOREFILES_ACCESSED_DAYS_AND_SIZE = "storeFilesAccessed%sDaysSize"; | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better name it XXX_TEMPLATE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, sir
|  | ||
| Map<String, Pair<Long, Long>> sfAccessTimeAndSizeMap = | ||
| store.getStoreFilesAccessTimeAndSize(); | ||
| Long now = EnvironmentEdgeManager.currentTime(); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use long instead of Long as later we will do calculation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use
longinstead ofLongas later we will do calculation?
OK, sir.
        
          
                hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java
          
            Show resolved
            Hide resolved
        
      a59d412    to
    c21b0ab      
    Compare
  
    | 🎊 +1 overall 
 
 This message was automatically generated. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
| 💔 -1 overall 
 
 This message was automatically generated. | 
| The failed UTs seems to have nothing to do with my modification. | 
| @Apache9 sir. Could you take a look? Thanks. | 
| Please fix the conflicts. And how do you plan to reduce the pressure on NN when fetching the accessTime? Or is this not a big problem? Thanks. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
| 
 @Apache9  sir. I have a new commit. Could you take a look? Thanks. | 
| 💔 -1 overall 
 
 This message was automatically generated. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
| mt.perStoreMixedReadCount.put(tempKey, mixedReadCount); | ||
| Map<String, Pair<Long, Long>> sfAccessTimeAndSizeMap = | ||
| store.getStoreFilesAccessTimeAndSize(); | ||
| Long now = EnvironmentEdgeManager.currentTime(); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use long instead of Long?
| Long size = pair.getSecond(); | ||
| for (int threshold : storeFilesAccessedDaysThresholds) { | ||
| Long sumSize = mt.storeFilesAccessedDaysAndSize.get(threshold); | ||
| if ((now - accessTime) >= threshold * ONE_DAY_MS) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So here if the accessTime is very old, we will increase the sumSize for all the thresholds? Is this by design? Seems a bit strange...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Apache9 sir.
Yes, this is by design. For instance, if an HFile has not been accessed for 90 days, it is certain that it has also not been accessed for 30 and 7 days.
And what is your suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, so the metric here is 'the number if store files which are not accessed by at least xx days'? Then I think we;d better have a better name for it? Looking at 'storeFilesAccessedDays', I can not get the meaning unless you explain it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sir.
| 🎊 +1 overall 
 
 This message was automatically generated. | 
| 💔 -1 overall 
 
 This message was automatically generated. | 
| 🎊 +1 overall 
 
 This message was automatically generated. | 
| Any updates here? @2005hithlj Thanks. | 
| 
 @Apache9 Sorry sir, I've been a bit busy at work lately. I don't have extra bandwidth to do anything else at the moment. | 
| OK, then we can move it out of 3.0.0 first. Can come back later when you have time. | 
https://issues.apache.org/jira/browse/HBASE-27483