-
Notifications
You must be signed in to change notification settings - Fork 4
[upstream] added ability for compressed pointer to use full 32 bits for addressing in single tier mode and use 31 bits for addressing in multi-tier mode #41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 6 files reviewed, 1 unresolved discussion (waiting on @guptask)
cachelib/allocator/memory/CompressedPtr.h line 197 at r1 (raw file):
TierId tid; for (tid = 0; tid < allocators_.size(); tid++) { if (allocators_[tid]->isMemoryInAllocator(static_cast<const void*>(uncompressed)))
does it compile? on upstream, allocators_ is not a vector, but just a single pointer, right?
I think you could probably just hardcode isMultiTiered to false in this function for now.
3499c83 to
645c333
Compare
|
Previously, igchor (Igor Chorążewicz) wrote…
It compiles now. I renamed SingleTierPtrCompressor to PtrCompressor. We can revert to SingleTierPtrCompressor name later with the other changes. This is the least intrusive option in my opinion for merging into meta/main. I took out the AllocatorContainer based PtrCompressor for now. |
645c333 to
2dbf125
Compare
2dbf125 to
1f92aaf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 6 files at r1, 3 of 4 files at r2, all commit messages.
Reviewable status: 6 of 7 files reviewed, 3 unresolved discussions (waiting on @guptask)
cachelib/allocator/memory/CompressedPtr.h line 197 at r1 (raw file):
Previously, guptask (Sounak Gupta) wrote…
It compiles now. I renamed SingleTierPtrCompressor to PtrCompressor. We can revert to SingleTierPtrCompressor name later with the other changes. This is the least intrusive option in my opinion for merging into meta/main. I took out the AllocatorContainer based PtrCompressor for now.
Ok, makes sense. I think you can still include the implementation which operates on vector of allocators. Just please makes sure it is not used anywhere (perhaps just add some simple test for it) and it has different name than PtrCompressor, e.g. MultiTierPtrCompressor or something like this. In the previous version, both classes were named the same and that the issue for me.
cachelib/allocator/memory/CompressedPtr.h line 41 at r2 (raw file):
// the 32nd bit only since its value cannot exceed kMaxTiers (2). This leaves // the remaining (32 - (kNumSlabBits - 6) - 1 bit for tier id) = 15 bits for // the slab index. Hence we can index 128 GiB of memory in slabs per tier and
128 GiB of memory per tier in mutlti-tier configuration or 256 GiB in single-tier
cachelib/allocator/memory/CompressedPtr.h line 118 at r2 (raw file):
// Number of bits for the slab index. This will be the top 16 bits of the // compressed ptr. static constexpr unsigned int kNumSlabIdxBits =
This change is a bit misleading I think. This value will represent number of bits used for slab indexing only for mutli-tier, for single tier it will be this value + 1, right? Maybe just replace this constant with a constexpr function?
static constexpr unsigned inte kNumSlabIdxBits(bool isMultiTiered) { return kNumTierIdxOffset - kNumAllocIdxBits + (!isMultiTiered); }
Then you wan't need to to add 1 in other places if isMultitTiered is false.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 7 files reviewed, 3 unresolved discussions (waiting on @guptask)
cachelib/allocator/memory/CompressedPtr.h line 197 at r1 (raw file):
Previously, igchor (Igor Chorążewicz) wrote…
Ok, makes sense. I think you can still include the implementation which operates on vector of allocators. Just please makes sure it is not used anywhere (perhaps just add some simple test for it) and it has different name than
PtrCompressor, e.g.MultiTierPtrCompressoror something like this. In the previous version, both classes were named the same and that the issue for me.
Or if you don;t include that implementation, please add some comment or some info in the commit message that it needs to be implemented later.
1f92aaf to
51d1590
Compare
|
Previously, igchor (Igor Chorążewicz) wrote…
done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 7 files reviewed, 2 unresolved discussions (waiting on @guptask and @igchor)
cachelib/allocator/memory/CompressedPtr.h line 41 at r2 (raw file):
Previously, igchor (Igor Chorążewicz) wrote…
128 GiB of memory per tier in mutlti-tier configuration or 256 GiB in single-tier
Done.
cachelib/allocator/memory/CompressedPtr.h line 118 at r2 (raw file):
Previously, igchor (Igor Chorążewicz) wrote…
This change is a bit misleading I think. This value will represent number of bits used for slab indexing only for mutli-tier, for single tier it will be this value + 1, right? Maybe just replace this constant with a constexpr function?
static constexpr unsigned inte kNumSlabIdxBits(bool isMultiTiered) { return kNumTierIdxOffset - kNumAllocIdxBits + (!isMultiTiered); }Then you wan't need to to add
1in other places if isMultitTiered isfalse.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 4 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @guptask)
51d1590 to
ad1a81f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 8 files reviewed, all discussions resolved (waiting on @guptask)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good - I just have one comment on the tests. I also was able to compile and pass all tests.
Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @guptask)
cachelib/allocator/memory/tests/MemoryAllocatorTest.cpp line 410 at r4 (raw file):
} ASSERT_EQ(nullptr, m.unCompress(m.compress(nullptr, false), false));
I would recommend adding the following block after this line. This way we can at least say the code is tested. In fact, I tried adding this and it passed.
Code snippet:
// and also test with multi-tier
for (const auto& pool : poolAllocs) {
const auto& allocs = pool.second;
for (const auto* alloc : allocs) {
CompressedPtr ptr = m.compress(alloc, true);
ASSERT_FALSE(ptr.isNull());
ASSERT_EQ(alloc, m.unCompress(ptr, true));
}
}
ASSERT_EQ(nullptr, m.unCompress(m.compress(nullptr, true), false));| // the slab using a 32 bit representation. | ||
| // | ||
| template <typename PtrType, typename AllocatorContainer> | ||
| class MultiTierPtrCompressor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this forward declaration?
| }; | ||
|
|
||
| template <typename PtrType, typename AllocatorContainer> | ||
| class MultiTierPtrCompressor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where this class template is used?
| static PtrType compress(uint32_t slabIdx, | ||
| uint32_t allocIdx, | ||
| bool isMultiTiered, | ||
| TierId tid) noexcept { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where we are storing tid? I believe we need a unit test to check compress function
|
Previously, vinser52 (Sergei Vinogradov) wrote…
you're right. The forward declaration is needed because of the following in CompressedPtr class: template <typename PtrType, typename AllocatorContainer> But for upstream PR, I took it out. I'm adding it back. |
|
Previously, vinser52 (Sergei Vinogradov) wrote…
It won't be used in the upstream PR. This is a foundational PR meant to add the framework. The MultiTierPtrCompressor will be used later. This is what Igor and I discussed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @guptask)
cachelib/allocator/memory/CompressedPtr.h line 190 at r4 (raw file):
Previously, guptask (Sounak Gupta) wrote…
It won't be used in the upstream PR. This is a foundational PR meant to add the framework. The MultiTierPtrCompressor will be used later. This is what Igor and I discussed.
If a class template is not used anywhere the compiler does not even compile it. It means you might even have a compilation error inside such a class template and the compiler will not detect it. Maybe better to upstream this template later when it is actually used? Or (if we want to upstream it in this PR) introduce a unit test for it so that it is compiled by the compiler.
|
Previously, byrnedj (Daniel Byrne) wrote…
Added |
|
Previously, vinser52 (Sergei Vinogradov) wrote…
The CompressedPtr is backward compatible. The tid part is set to default 0 in CompressedPtr constructor. Also the following use case |
|
fix added. Thanks for posting the code stub Daniel. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 8 files reviewed, 5 unresolved discussions (waiting on @byrnedj, @guptask, and @igchor)
cachelib/allocator/memory/tests/MemoryAllocatorTest.cpp line 418 at r5 (raw file):
MultiTierPtrCompressor ptr = m.compress(alloc, false); ASSERT_FALSE(ptr.isNull()); ASSERT_EQ(alloc, m.unCompress(ptr, false));
Can we add some Unit tests for multitier?
|
Previously, vinser52 (Sergei Vinogradov) wrote…
Done. |
|
Previously, vinser52 (Sergei Vinogradov) wrote…
I took out MultiTierPtrCompressor for this PR. The difference between MultiTierPtrCompressor and PtrCompressor is that the 32nd bit can be reserved for tier id. In case of PtrCompressor we don't allow this 32nd bit to be set whereas in case of MultiTierPtrCompressor this 32nd bit is reserved for tier id. That's why the allocators.size() check is made to ensure whether we're dealing with single or multi-tier. In case of single tier, even if you use MultiTierPtrCompressor, it won't allow you to reserve the 32nd bit for tier id. This was done to create a common structure that'll allow all 32 bits in MultiTierPtrCompressor to be used for addressing when config is single tiered. |
|
Previously, vinser52 (Sergei Vinogradov) wrote…
Taking out MultiTierPtrCompressor and I added a tier test based on Daniel's suggestion above. |
|
I took out MultiTierPtrCompressor for this PR. |
a6eb1e9 to
997faaa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 8 files reviewed, 5 unresolved discussions (waiting on @byrnedj and @igchor)
cachelib/allocator/memory/CompressedPtr.h line 130 at r4 (raw file):
Previously, guptask (Sounak Gupta) wrote…
I took out MultiTierPtrCompressor for this PR. The difference between MultiTierPtrCompressor and PtrCompressor is that the 32nd bit can be reserved for tier id. In case of PtrCompressor we don't allow this 32nd bit to be set whereas in case of MultiTierPtrCompressor this 32nd bit is reserved for tier id. That's why the allocators.size() check is made to ensure whether we're dealing with single or multi-tier. In case of single tier, even if you use MultiTierPtrCompressor, it won't allow you to reserve the 32nd bit for tier id. This was done to create a common structure that'll allow all 32 bits in MultiTierPtrCompressor to be used for addressing when config is single tiered.
Why you added tid input parameter but it is not used anywhere in this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 8 files reviewed, 6 unresolved discussions (waiting on @byrnedj, @guptask, and @igchor)
cachelib/allocator/memory/CompressedPtr.h line 32 at r6 (raw file):
// This CompressedPtr makes decompression fast by staying away from division and // modulo arithmetic and doing those during the compression time. We most often // decompress a CompressedPtr than compress a pointer while creating one. This
3 extra whitespace should be removed.
|
Previously, vinser52 (Sergei Vinogradov) wrote…
This was meant to introduce the API change with backward compatibility. The final definition would look like what's the complete change in develop branch: // Compress the given slabIdx and allocIdx into a 64-bit compressed Do you think this API change introduction is better with rest of tid details (rather than a placeholder in this PR) ? |
|
Previously, vinser52 (Sergei Vinogradov) wrote…
Done. |
997faaa to
bf8be62
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 8 files reviewed, 5 unresolved discussions (waiting on @byrnedj and @igchor)
cachelib/allocator/memory/CompressedPtr.h line 130 at r4 (raw file):
Do you think this API change introduction is better with rest of tid details (rather than a placeholder in this PR) ?
Yes, I do think that full implementation like in develop branch is what we need in this PR. The current placeholder is not completed and I think it is error-prone because in the future we might forget to complete the implementation. And I see no barrier to putting complete implementation in this PR, am I wrong?
|
Previously, vinser52 (Sergei Vinogradov) wrote…
In https://github.com/intel/CacheLib/blob/develop/cachelib/allocator/memory/CompressedPtr.h There are two options the way I see it - Option 1. compressed pointer goes entirely with the multi-tier changes. That way the unit tests, that are in place in develop, can verify the compressed ptr part. The tid changes would be done everywhere; so compressed pointer code won't need any further change. Option 2. The way we're doing it now. Create an initial PR that partially introduces API related changes in Compressed Pointer while keeping backward compatibility in main branch. Then the multi-tier PR can include the develop branch compressed ptr commit. Since the expectation is that API changes would already be part of main branch by that point, only the additional changes in Compressed Ptr would need review. |
bf8be62 to
f1b532e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 8 files reviewed, 6 unresolved discussions (waiting on @byrnedj, @guptask, and @igchor)
cachelib/allocator/memory/CompressedPtr.h line 133 at r8 (raw file):
return (slabIdx << kNumAllocIdxBits) + allocIdx; } return (static_cast<uint64_t>(tid) << kNumTierIdxOffset) +
now the logic looks good, but another question should it really be static_cast<uint64_t>? Should it be a 32-bit value?
…ng in single tier mode and use 31 bits for addressing in multi-tier mode
f1b532e to
35acaea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 8 files reviewed, 7 unresolved discussions (waiting on @byrnedj, @guptask, and @igchor)
cachelib/allocator/memory/CompressedPtr.h line 157 at r8 (raw file):
void setTierId(TierId tid) noexcept { ptr_ += static_cast<uint64_t>(tid) << kNumTierIdxOffset;
As we discussed, it also should be static_cast<32>
|
Previously, vinser52 (Sergei Vinogradov) wrote…
changed to 32 bits. |
|
Previously, vinser52 (Sergei Vinogradov) wrote…
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 8 files reviewed, 2 unresolved discussions (waiting on @byrnedj and @igchor)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 6 of 8 files reviewed, 2 unresolved discussions (waiting on @byrnedj and @igchor)
Disabled test suite allocator-test-AllocatorTypeTest to skip sporadically failing tests.
Disabled test suite allocator-test-AllocatorTypeTest to skip sporadically failing tests.
This change is