Per tier pool stats #70

byrnedj · 2023-03-17T12:11:03Z

This PR adds per tier statistics to the pool stats. We also introduce a new statistic called writebacks which counts the number of successfully moved items to the next tier.

Cachebench now outputs per tier hit ratios and per tier item counts so we can get a better idea of how much data is being served from each tier.

I have verified that there is no impact to performance with this PR.

This change is

vinser52 · 2023-03-17T14:52:02Z

cachelib/allocator/Cache.cpp

      statPrefix + "cache.size.configured",
      memStats.configuredRamCacheSize + memStats.nvmCacheSize);

+  //TODO: per-tier


I do not quite understand what this TODO means. std::accumulate does aggregation of per-tier stat, right?
Do you mean that in addition to accumulated stats we also should have per-tier counters?

vinser52 · 2023-03-17T16:21:25Z

cachelib/allocator/CacheAllocator-inl.h

+  PoolStats ret;
+  ret.isCompactCache = isCompactCache;
+  ret.poolName = allocator_[tid]->getPoolName(poolId);
+  ret.poolSize = pool.getPoolSize();


Do we want the size of a particular tier or the total pool size?

vinser52 · 2023-03-17T16:23:31Z

cachelib/allocator/CacheAllocator-inl.h

+        uint64_t fragmentationSize = 0;
+        for (TierId tid = 0; tid < getNumTiers(); tid++) {
+            fragmentationSize += (*stats_.fragmentationSize)[tid][pid][cid].get();
+        }


Minor: Would it be better to use std::accumulate as above in Cache.cpp?

vinser52 · 2023-03-20T11:39:28Z

Can we add some tests for per-tier stats?

igchor · 2023-03-20T18:53:24Z

In general looks good. Can you paste here some example output with those stats?

guptask · 2023-03-20T19:32:58Z

cachelib/allocator/Cache.cpp line 239 at r1 (raw file):

  //TODO: per-tier
  const auto stats = getGlobalCacheStats();

Does this change mean currently these starts are only aggregated for tier 0 ?

guptask · 2023-03-20T19:43:15Z

cachelib/allocator/CacheAllocator-inl.h line 2799 at r1 (raw file):

      MMContainerStat mmContainerStats;
      for (TierId tid = 0; tid < getNumTiers(); tid++) {
        allocAttempts += (*stats_.allocAttempts)[tid][poolId][cid].get();

Just wondering is it possible to implement a + operator override for these stats ? I'm thinking about the scenario where we add more stats - it would make the code cleaner since that operator would be within the class.

guptask · 2023-03-20T19:44:44Z

cachelib/allocator/CacheAllocator-inl.h line 2799 at r1 (raw file):

Previously, guptask (Sounak Gupta) wrote…

Just wondering is it possible to implement a + operator override for these stats ? I'm thinking about the scenario where we add more stats - it would make the code cleaner since that operator would be within the class.

Or maybe std::accumulate() ?

guptask · 2023-03-20T19:47:29Z

cachelib/allocator/CacheStats.h line 454 at r1 (raw file):

  // size of itemRemoved_ hash set in nvm
  uint64_t numNvmItemRemovedSetSize{0};

Minor: Instead of having vectors for all per-tier stats, I think it would be cleaner to encapsulate those as one class and a vector object for that class inside global cache stats.

guptask

Reviewed 8 of 8 files at r1, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @byrnedj and @igchor)

byrnedj

Reviewable status: 7 of 12 files reviewed, 5 unresolved discussions (waiting on @guptask, @igchor, and @vinser52)

cachelib/allocator/Cache.cpp line 238 at r1 (raw file):

Previously, vinser52 (Sergei Vinogradov) wrote…

I do not quite understand what this TODO means. std::accumulate does aggregation of per-tier stat, right?
Do you mean that in addition to accumulated stats we also should have per-tier counters?

Per tier counters for serialization yes. I will make the todo more clear.

cachelib/allocator/Cache.cpp line 239 at r1 (raw file):

Previously, guptask (Sounak Gupta) wrote…

Does this change mean currently these starts are only aggregated for tier 0 ?

Currently, only tier 0 stats are serialized. This change aggregates among the tiers and reports the totals since we have not implemented per-tier serialization.

cachelib/allocator/CacheAllocator-inl.h line 2799 at r1 (raw file):

Previously, guptask (Sounak Gupta) wrote…

Or maybe std::accumulate() ?

Stats can be easily added (if you see I added numWritebacks), the per-tier pool stats are all atomic or thread-local counters. I could encapsulate them but I think that would add additional overhead - without much additional benefit.

I could replace the for loop over each tier with std::accumulate - but its not entirely clean as the stats are per tier, per pool, per cid.

cachelib/allocator/CacheAllocator-inl.h line 2884 at r1 (raw file):

Previously, vinser52 (Sergei Vinogradov) wrote…

Do we want the size of a particular tier or the total pool size?

We want the total pool size for this particular pool on this particular tier (pid and tid are arguments to the function).

cachelib/allocator/CacheAllocator-inl.h line 3657 at r1 (raw file):

Previously, vinser52 (Sergei Vinogradov) wrote…

Minor: Would it be better to use std::accumulate as above in Cache.cpp?

I think in this case it is more clear to use the tier id index. In Cache.cpp those stats were 1d vectors and easy to use accumulate. The pool stat here is per pool, per class, per tier.

cachelib/allocator/CacheStats.h line 454 at r1 (raw file):

Previously, guptask (Sounak Gupta) wrote…

Minor: Instead of have vectors for all per-tier stats, I think it would be cleaner to encapsulate those as one class and a vector object for that class inside global cache stats.

Yes, it would make it explicit that those stats are per-tier in the global cache stats object. However, it would make the accumulation (in CacheStats.cpp) tricky as these stats are aggregated over all pools and classes. It doesn't appear to be as straightforward as I hoped.

byrnedj · 2023-03-23T00:20:33Z

Can we add some tests for per-tier stats?

I have added in my latest commit.

byrnedj · 2023-03-23T00:22:59Z

In general looks good. Can you paste here some example output with those stats?

vinser52

Reviewable status: 7 of 12 files reviewed, 4 unresolved discussions (waiting on @byrnedj, @guptask, and @igchor)

cachelib/allocator/Cache.cpp line 239 at r1 (raw file):

Previously, byrnedj (Daniel Byrne) wrote…

Currently, only tier 0 stats are serialized. This change aggregates among the tiers and reports the totals since we have not implemented per-tier serialization.

Why did you say that currently only tier 0 stats are serialized?
My understanding is that before this PR, the stats object contained global stats from all tiers. After your patch stats are captured on a per-tier basis and therefore here we need to call std::accumulate to get global stats again.

cachelib/allocator/CacheAllocator-inl.h line 2884 at r1 (raw file):

Previously, byrnedj (Daniel Byrne) wrote…

We want the total pool size for this particular pool on this particular tier (pid and tid are arguments to the function).

So in that particular line, we are getting the size of the pool in a particular tier? E.g. if the total pool size is 100 MB and 2 tiers are configured with equal ratios the ret.poolSize is 50MB in that case, am I correct?

cachelib/allocator/CacheAllocator-inl.h line 3657 at r1 (raw file):

Previously, byrnedj (Daniel Byrne) wrote…

I think in this case it is more clear to use the tier id index. In Cache.cpp those stats were 1d vectors and easy to use accumulate. The pool stat here is per pool, per class, per tier.

right, but we are accumulating only a single dimension. Anyway, as I said it is a minor comment and both ways result in the same.

byrnedj

Reviewable status: 7 of 12 files reviewed, 3 unresolved discussions (waiting on @guptask and @igchor)

cachelib/allocator/Cache.cpp line 239 at r1 (raw file):

Previously, vinser52 (Sergei Vinogradov) wrote…

Why did you say that currently only tier 0 stats are serialized?
My understanding is that before this PR, the stats object contained global stats from all tiers. After your patch stats are captured on a per-tier basis and therefore here we need to call std::accumulate to get global stats again.

Sorry I confused the global stats with the pool stats.

byrnedj

Reviewable status: 7 of 12 files reviewed, 3 unresolved discussions (waiting on @guptask, @igchor, and @vinser52)

cachelib/allocator/CacheAllocator-inl.h line 2884 at r1 (raw file):

Previously, vinser52 (Sergei Vinogradov) wrote…

So in that particular line, we are getting the size of the pool in a particular tier? E.g. if the total pool size is 100 MB and 2 tiers are configured with equal ratios the ret.poolSize is 50MB in that case, am I correct?

That is correct.

vinser52

Reviewable status: 7 of 12 files reviewed, 2 unresolved discussions (waiting on @guptask and @igchor)

cachelib/allocator/Cache.cpp line 239 at r1 (raw file):

Previously, byrnedj (Daniel Byrne) wrote…

Sorry I confused the global stats with the pool stats.

Yeah, I just wanted to make sure that my understanding is correct and I do not miss something.

igchor

LGTM

byrnedj requested review from guptask, igchor and vinser52 March 17, 2023 12:11

vinser52 reviewed Mar 17, 2023

View reviewed changes

guptask reviewed Mar 20, 2023

View reviewed changes

byrnedj force-pushed the PerTierPoolStats branch from 9e7d0af to 2923383 Compare March 23, 2023 00:16

initial revision of per tier stats

2923383

byrnedj commented Mar 23, 2023

View reviewed changes

vinser52 reviewed Mar 23, 2023

View reviewed changes

byrnedj commented Mar 23, 2023

View reviewed changes

vinser52 reviewed Mar 23, 2023

View reviewed changes

igchor reviewed Mar 23, 2023

View reviewed changes

byrnedj merged commit 773d548 into intel:develop Mar 23, 2023

vinser52 pushed a commit that referenced this pull request Jul 17, 2023

Per tier pool stats (#70)

96182ce

byrnedj added a commit that referenced this pull request Jul 23, 2023

Per tier pool stats (#70)

add2e5f

vinser52 pushed a commit that referenced this pull request Feb 29, 2024

Per tier pool stats (#70)

19742ff

vinser52 pushed a commit that referenced this pull request Mar 1, 2024

Per tier pool stats (#70)

bfe54ad

Per tier pool stats #70

Per tier pool stats #70

Uh oh!

Conversation

byrnedj commented Mar 17, 2023 • edited by rdower Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vinser52 Mar 17, 2023

Choose a reason for hiding this comment

Uh oh!

vinser52 Mar 17, 2023

Choose a reason for hiding this comment

Uh oh!

vinser52 Mar 17, 2023

Choose a reason for hiding this comment

Uh oh!

vinser52 commented Mar 20, 2023

Uh oh!

igchor commented Mar 20, 2023

Uh oh!

guptask commented Mar 20, 2023

Uh oh!

guptask commented Mar 20, 2023

Uh oh!

guptask commented Mar 20, 2023

Uh oh!

guptask commented Mar 20, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

guptask left a comment

Choose a reason for hiding this comment

Uh oh!

byrnedj left a comment

Choose a reason for hiding this comment

Uh oh!

byrnedj commented Mar 23, 2023

Uh oh!

byrnedj commented Mar 23, 2023

Uh oh!

vinser52 left a comment

Choose a reason for hiding this comment

Uh oh!

byrnedj left a comment

Choose a reason for hiding this comment

Uh oh!

byrnedj left a comment

Choose a reason for hiding this comment

Uh oh!

vinser52 left a comment

Choose a reason for hiding this comment

Uh oh!

igchor left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

byrnedj commented Mar 17, 2023 •

edited by rdower

Loading

guptask commented Mar 20, 2023 •

edited

Loading