HBASE-28468: Integrate the data-tiering logic into cache evictions. #5829
Conversation
try {
  freeInProgress = true;
  // Fetch the file names from the backing map.
The main logic relies on the file names: fetch the file names from the block cache keys, then pass the list to a new DataTieringManager API that detects which of these files are cold files and returns the list of cold files.
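The detection step described above could look roughly like the following self-contained sketch. The class and method names (`ColdFileDetector`, `getColdFiles`, `hotDataAgeMillis`) are illustrative stand-ins, not the actual DataTieringManager API:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of cold-file detection by file age.
public class ColdFileDetector {

  /**
   * Given each file's max cell timestamp and the configured hot-data age,
   * return the subset of files whose newest data is older than the cutoff.
   */
  public static Set<String> getColdFiles(Map<String, Long> maxTimestampByFile,
      long hotDataAgeMillis, long nowMillis) {
    Set<String> cold = new HashSet<>();
    for (Map.Entry<String, Long> e : maxTimestampByFile.entrySet()) {
      // A file is "cold" when even its newest cell is older than the hot-data age.
      if (nowMillis - e.getValue() > hotDataAgeMillis) {
        cold.add(e.getKey());
      }
    }
    return cold;
  }
}
```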
We can use the BlocksByFile instead of BackingMap if it seems more efficient.
💔 -1 overall
This message was automatically generated.
Overall, I like this approach better than the one previously proposed in PR #5826. The only con is the overhead of figuring out the "cold" files within the eviction process. As discussed internally, we could add an extra global config for whether the data tiering manager should be considered.
An additional ask: please add a UT for this modified behaviour in freeSpace(). We should test that eviction correctly picks blocks from cold files but leaves the hot ones.
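A simplified, self-contained illustration of what such a unit test could assert. The real test would drive BucketCache.freeSpace(); here the cache and eviction are hypothetical stand-ins (block keys encode their file name before a `#`):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Illustrative simulation of cold-first eviction, not the real BucketCache test.
public class ColdEvictionSim {

  /** Block key -> block size; a key's file name is the part before '#'. */
  public static Map<String, Integer> evictColdBlocks(
      Map<String, Integer> backingMap, Set<String> coldFiles) {
    Map<String, Integer> remaining = new LinkedHashMap<>(backingMap);
    // Drop every block whose owning file is in the cold set.
    remaining.keySet().removeIf(key -> coldFiles.contains(key.split("#")[0]));
    return remaining;
  }
}
```

The test would then check that cold-file blocks are gone while hot-file blocks remain.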
if (!maxTimestamp.isPresent()) {
  // We could throw from here, but we are already in the critical code-path
  // of freeing space. Hence, we can ignore that file for now.
  // Or do we want to include it?
That's fine for me. Can we add a debug/trace log?
ack!
}
try {
  freeInProgress = true;
nit: please avoid unnecessary new lines.
// List<BlockCacheKey> coldBlocks = new ArrayList<>();
nit: please remove commented code.
new BucketEntryGroup(bytesToFreeWithExtra, blockSize, getPartitionSize(memoryFactor));

long bytesFreed = 0;
nit: remove new line.
for (Map.Entry<BlockCacheKey, BucketEntry> bucketEntryWithKey : backingMap.entrySet()) {

if (coldFiles.containsKey(bucketEntryWithKey.getKey().getHfileName())) {
  //coldBlocks.add(bucketEntryWithKey.getKey());
nit: remove commented code.
ack!
if (coldFiles.containsKey(bucketEntryWithKey.getKey().getHfileName())) {
  //coldBlocks.add(bucketEntryWithKey.getKey());
  bytesFreed += bucketEntryWithKey.getValue().getLength();
  evictBlock(bucketEntryWithKey.getKey());
We are in an eviction process here, so this call would miss the proper stats counting. Also, we should care about ref counts, so better to use evictBucketEntryIfNoRpcReferenced for this eviction.
Also, here we are evicting all blocks for files considered cold. Shouldn't we do it only until we reach the bytesToFreeWithoutExtra mark?
You mean we need to evict these cold-file blocks even before the calculations for bytesToFreeWithoutExtra? That would require one additional traversal of backingMap. I will incorporate this change and we can take a look to see if it looks ok.
No, that's not what I meant. You already have bytesToFreeWithExtra calculated in line #983 and you are computing bytesFreed. Once bytesFreed reaches the bytesToFreeWithExtra value, do you still want to evict the cold files, or should we just interrupt the loop?
if (coldFiles.containsKey(bucketEntryWithKey.getKey().getHfileName())) {
  //coldBlocks.add(bucketEntryWithKey.getKey());
  bytesFreed += bucketEntryWithKey.getValue().getLength();
We should only account for this once the eviction was indeed successful.
ack!
if (coldFiles != null
    && coldFiles.containsKey(bucketEntryWithKey.getKey().getHfileName())) {
  int freedBlockSize = bucketEntryWithKey.getValue().getLength();
  evictBlockIfNoRpcReferenced(bucketEntryWithKey.getKey());
evictBlockIfNoRpcReferenced returns a boolean indicating whether the eviction was successful. We should check that boolean in order to decide whether to increment bytesFreed.
ack!
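Taken together, the review points suggest a loop of roughly the following shape: only count bytes when the eviction actually succeeded (the entry is not RPC-referenced), and stop once enough space has been freed. This is a self-contained sketch with illustrative stand-ins, not the real BucketCache internals:

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Sketch of cold-block eviction that accounts bytes only on success
// and interrupts the loop once the target is reached.
public class ColdEvictionLoop {

  /** Block key -> size; a key's file name is the part before '#'. */
  public static long freeColdBlocks(Map<String, Integer> backingMap,
      Set<String> coldFiles, Set<String> rpcReferenced, long bytesToFree) {
    long bytesFreed = 0;
    Iterator<Map.Entry<String, Integer>> it = backingMap.entrySet().iterator();
    while (it.hasNext() && bytesFreed < bytesToFree) {
      Map.Entry<String, Integer> e = it.next();
      String fileName = e.getKey().split("#")[0];
      // Skip hot files and blocks still referenced by in-flight RPCs.
      if (coldFiles.contains(fileName) && !rpcReferenced.contains(e.getKey())) {
        // Eviction succeeded: only now account for the freed bytes.
        bytesFreed += e.getValue();
        it.remove();
      }
    }
    return bytesFreed;
  }
}
```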
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/bucket/BucketCache.java
// Verify that the bucket cache contains 4 blocks.
assertEquals(3, bucketCache.getBackingMap().keySet().size());
nit: comment mismatches the code.
ack!
while (!(bucketCache.getBackingMap().containsKey(key))) {
  Thread.sleep(100);
nit: use Waiter.waitFor
ack!
while (!(bucketCache.getBackingMap().containsKey(newKey))) {
  Thread.sleep(100);
}
nit: use Waiter.waitFor
ack!
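For context, HBase's Waiter.waitFor (in org.apache.hadoop.hbase) encapsulates exactly this polling pattern behind a timeout and a predicate. A self-contained sketch of the pattern (the generic `PollWait.waitFor` below is an illustrative stand-in, not the HBase API):

```java
import java.util.function.BooleanSupplier;

// Minimal polling helper mirroring the wait-with-timeout idea.
public class PollWait {

  /** Poll the condition until it holds or the timeout elapses; return whether it held. */
  public static boolean waitFor(long timeoutMillis, long intervalMillis,
      BooleanSupplier condition) {
    long deadline = System.currentTimeMillis() + timeoutMillis;
    while (System.currentTimeMillis() < deadline) {
      if (condition.getAsBoolean()) {
        return true;
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        break;
      }
    }
    // One last check after the deadline (or interruption).
    return condition.getAsBoolean();
  }
}
```

In the actual test, the hand-rolled sleep loop would be replaced by a Waiter.waitFor call whose predicate checks `bucketCache.getBackingMap().containsKey(key)`.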
🎊 +1 overall
This message was automatically generated.
retest
Adjust the unit test to incorporate the delay in the prefetch due to metadata access. Change-Id: I2fc25d564be0aa454921cafa5ce40d9df8135211
…5829) Signed-off-by: Wellington Chevreuil <[email protected]>
As part of cache block eviction when the cache is full, the data-tiering logic is integrated into the freeSpace code path to identify cold data files and evict the blocks associated with those files.
The list of files is traversed to identify cold files based on the hot-data-age configuration and the max timestamp associated with each file.
The blocks associated with those cold files are evicted first, and then the existing LFU logic is executed to evict further blocks.
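The two-phase flow the commit message describes can be sketched as follows. This is a hedged, self-contained illustration: `chooseEvictions` and the insertion-order fallback are stand-ins for the real BucketCache priority (LFU) eviction, not the actual implementation:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the two-phase eviction order: cold-file blocks first,
// then the existing priority-based eviction for any remaining need.
public class TwoPhaseFreeSpace {

  /** Returns the block keys chosen for eviction, cold-file blocks first. */
  public static List<String> chooseEvictions(LinkedHashMap<String, Integer> blocks,
      Set<String> coldFiles, long bytesToFree) {
    List<String> evicted = new ArrayList<>();
    long freed = 0;
    // Phase 1: blocks belonging to cold files.
    for (Map.Entry<String, Integer> e : blocks.entrySet()) {
      if (freed >= bytesToFree) break;
      if (coldFiles.contains(e.getKey().split("#")[0])) {
        evicted.add(e.getKey());
        freed += e.getValue();
      }
    }
    // Phase 2: remaining blocks in insertion order (stand-in for LFU priority).
    for (Map.Entry<String, Integer> e : blocks.entrySet()) {
      if (freed >= bytesToFree) break;
      if (!evicted.contains(e.getKey())) {
        evicted.add(e.getKey());
        freed += e.getValue();
      }
    }
    return evicted;
  }
}
```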