Skip to content

Conversation

kazutakahirata
Copy link
Contributor

This patch creates a helper function named handleAllocSite to handle
the allocation site. It makes readMemProf a little bit shorter.

I'm planning to move the code to handle call sites in a subsequent
patch. Doing so in this patch would make this patch a lot longer
because we need to move other things like CallSiteEntry and
CallSiteEntryHash.

This patch creates a helper function named handleAllocSite to handle
the allocation site.  It makes readMemProf a little bit shorter.

I'm planning to move the code to handle call sites in a subsequent
patch.  Doing so in this patch would make this patch a lot longer
because we need to move other things like CallSiteEntry and
CallSiteEntryHash.
@llvmbot
Copy link
Member

llvmbot commented Jul 19, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Kazu Hirata (kazutakahirata)

Changes

This patch creates a helper function named handleAllocSite to handle
the allocation site. It makes readMemProf a little bit shorter.

I'm planning to move the code to handle call sites in a subsequent
patch. Doing so in this patch would make this patch a lot longer
because we need to move other things like CallSiteEntry and
CallSiteEntryHash.


Full diff: https://github.com/llvm/llvm-project/pull/149663.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/MemProfUse.cpp (+70-60)
diff --git a/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp b/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
index e5b357fc1bfbf..2a8416d02ffba 100644
--- a/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
@@ -361,6 +361,74 @@ static void addVPMetadata(Module &M, Instruction &I,
   }
 }
 
+static void
+handleAllocSite(Instruction &I, CallBase *CI,
+                ArrayRef<uint64_t> InlinedCallStack, LLVMContext &Ctx,
+                OptimizationRemarkEmitter &ORE, uint64_t MaxColdSize,
+                const std::set<const AllocationInfo *> &AllocInfoSet,
+                std::map<std::pair<uint64_t, unsigned>, AllocMatchInfo>
+                    &FullStackIdToAllocMatchInfo) {
+  // We may match this instruction's location list to multiple MIB
+  // contexts. Add them to a Trie specialized for trimming the contexts to
+  // the minimal needed to disambiguate contexts with unique behavior.
+  CallStackTrie AllocTrie(&ORE, MaxColdSize);
+  uint64_t TotalSize = 0;
+  uint64_t TotalColdSize = 0;
+  for (auto *AllocInfo : AllocInfoSet) {
+    // Check the full inlined call stack against this one.
+    // If we found and thus matched all frames on the call, include
+    // this MIB.
+    if (stackFrameIncludesInlinedCallStack(AllocInfo->CallStack,
+                                           InlinedCallStack)) {
+      NumOfMemProfMatchedAllocContexts++;
+      uint64_t FullStackId = 0;
+      if (ClPrintMemProfMatchInfo || recordContextSizeInfoForAnalysis())
+        FullStackId = computeFullStackId(AllocInfo->CallStack);
+      auto AllocType = addCallStack(AllocTrie, AllocInfo, FullStackId);
+      TotalSize += AllocInfo->Info.getTotalSize();
+      if (AllocType == AllocationType::Cold)
+        TotalColdSize += AllocInfo->Info.getTotalSize();
+      // Record information about the allocation if match info printing
+      // was requested.
+      if (ClPrintMemProfMatchInfo) {
+        assert(FullStackId != 0);
+        FullStackIdToAllocMatchInfo[std::make_pair(FullStackId,
+                                                   InlinedCallStack.size())] = {
+            AllocInfo->Info.getTotalSize(), AllocType};
+      }
+    }
+  }
+  // If the threshold for the percent of cold bytes is less than 100%,
+  // and not all bytes are cold, see if we should still hint this
+  // allocation as cold without context sensitivity.
+  if (TotalColdSize < TotalSize && MinMatchedColdBytePercent < 100 &&
+      TotalColdSize * 100 >= MinMatchedColdBytePercent * TotalSize) {
+    AllocTrie.addSingleAllocTypeAttribute(CI, AllocationType::Cold, "dominant");
+    return;
+  }
+
+  // We might not have matched any to the full inlined call stack.
+  // But if we did, create and attach metadata, or a function attribute if
+  // all contexts have identical profiled behavior.
+  if (!AllocTrie.empty()) {
+    NumOfMemProfMatchedAllocs++;
+    // MemprofMDAttached will be false if a function attribute was
+    // attached.
+    bool MemprofMDAttached = AllocTrie.buildAndAttachMIBMetadata(CI);
+    assert(MemprofMDAttached == I.hasMetadata(LLVMContext::MD_memprof));
+    if (MemprofMDAttached) {
+      // Add callsite metadata for the instruction's location list so that
+      // it simpler later on to identify which part of the MIB contexts
+      // are from this particular instruction (including during inlining,
+      // when the callsite metadata will be updated appropriately).
+      // FIXME: can this be changed to strip out the matching stack
+      // context ids from the MIB contexts and not add any callsite
+      // metadata here to save space?
+      addCallsiteMetadata(I, InlinedCallStack, Ctx);
+    }
+  }
+}
+
 static void readMemprof(Module &M, Function &F,
                         IndexedInstrProfReader *MemProfReader,
                         const TargetLibraryInfo &TLI,
@@ -554,66 +622,8 @@ static void readMemprof(Module &M, Function &F,
       if (AllocInfoIter != LocHashToAllocInfo.end() &&
           // Only consider allocations which support hinting.
           isAllocationWithHotColdVariant(CI->getCalledFunction(), TLI)) {
-        // We may match this instruction's location list to multiple MIB
-        // contexts. Add them to a Trie specialized for trimming the contexts to
-        // the minimal needed to disambiguate contexts with unique behavior.
-        CallStackTrie AllocTrie(&ORE, MaxColdSize);
-        uint64_t TotalSize = 0;
-        uint64_t TotalColdSize = 0;
-        for (auto *AllocInfo : AllocInfoIter->second) {
-          // Check the full inlined call stack against this one.
-          // If we found and thus matched all frames on the call, include
-          // this MIB.
-          if (stackFrameIncludesInlinedCallStack(AllocInfo->CallStack,
-                                                 InlinedCallStack)) {
-            NumOfMemProfMatchedAllocContexts++;
-            uint64_t FullStackId = 0;
-            if (ClPrintMemProfMatchInfo || recordContextSizeInfoForAnalysis())
-              FullStackId = computeFullStackId(AllocInfo->CallStack);
-            auto AllocType = addCallStack(AllocTrie, AllocInfo, FullStackId);
-            TotalSize += AllocInfo->Info.getTotalSize();
-            if (AllocType == AllocationType::Cold)
-              TotalColdSize += AllocInfo->Info.getTotalSize();
-            // Record information about the allocation if match info printing
-            // was requested.
-            if (ClPrintMemProfMatchInfo) {
-              assert(FullStackId != 0);
-              FullStackIdToAllocMatchInfo[std::make_pair(
-                  FullStackId, InlinedCallStack.size())] = {
-                  AllocInfo->Info.getTotalSize(), AllocType};
-            }
-          }
-        }
-        // If the threshold for the percent of cold bytes is less than 100%,
-        // and not all bytes are cold, see if we should still hint this
-        // allocation as cold without context sensitivity.
-        if (TotalColdSize < TotalSize && MinMatchedColdBytePercent < 100 &&
-            TotalColdSize * 100 >= MinMatchedColdBytePercent * TotalSize) {
-          AllocTrie.addSingleAllocTypeAttribute(CI, AllocationType::Cold,
-                                                "dominant");
-          continue;
-        }
-
-        // We might not have matched any to the full inlined call stack.
-        // But if we did, create and attach metadata, or a function attribute if
-        // all contexts have identical profiled behavior.
-        if (!AllocTrie.empty()) {
-          NumOfMemProfMatchedAllocs++;
-          // MemprofMDAttached will be false if a function attribute was
-          // attached.
-          bool MemprofMDAttached = AllocTrie.buildAndAttachMIBMetadata(CI);
-          assert(MemprofMDAttached == I.hasMetadata(LLVMContext::MD_memprof));
-          if (MemprofMDAttached) {
-            // Add callsite metadata for the instruction's location list so that
-            // it simpler later on to identify which part of the MIB contexts
-            // are from this particular instruction (including during inlining,
-            // when the callsite metadata will be updated appropriately).
-            // FIXME: can this be changed to strip out the matching stack
-            // context ids from the MIB contexts and not add any callsite
-            // metadata here to save space?
-            addCallsiteMetadata(I, InlinedCallStack, Ctx);
-          }
-        }
+        handleAllocSite(I, CI, InlinedCallStack, Ctx, ORE, MaxColdSize,
+                        AllocInfoIter->second, FullStackIdToAllocMatchInfo);
         continue;
       }
 

Copy link

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@kazutakahirata kazutakahirata merged commit 04f2114 into llvm:main Jul 20, 2025
11 checks passed
@kazutakahirata kazutakahirata deleted the cleanup_20250719_memprof_refactor branch July 20, 2025 15:28
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 20, 2025

LLVM Buildbot has detected a new failure on builder clang-s390x-linux-lnt running on systemz-1 while building llvm at step 7 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/136/builds/4654

Here is the relevant piece of the build log for the reference
Step 7 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'libFuzzer-s390x-default-Linux :: fuzzer-timeout.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
/home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/lib/fuzzer  /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/test/fuzzer/TimeoutTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest # RUN: at line 1
+ /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/lib/fuzzer /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/test/fuzzer/TimeoutTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest
/home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/./bin/clang    -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/lib/fuzzer  /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/test/fuzzer/TimeoutEmptyTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutEmptyTest # RUN: at line 2
+ /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/./bin/clang -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta --driver-mode=g++ -O2 -gline-tables-only -fsanitize=address,fuzzer -I/home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/lib/fuzzer /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/test/fuzzer/TimeoutEmptyTest.cpp -o /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutEmptyTest
not  /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 2>&1 | FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=TimeoutTest # RUN: at line 3
+ not /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1
+ FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=TimeoutTest
not  /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/test/fuzzer/hi.txt 2>&1 | FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=SingleInputTimeoutTest # RUN: at line 12
+ FileCheck /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/test/fuzzer/fuzzer-timeout.test --check-prefix=SingleInputTimeoutTest
+ not /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/llvm/compiler-rt/test/fuzzer/hi.txt
/home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 -timeout_exitcode=0 # RUN: at line 16
+ /home/uweigand/sandbox/buildbot/clang-s390x-linux-lnt/stage1/runtimes/runtimes-bins/compiler-rt/test/fuzzer/S390XDefaultLinuxConfig/Output/fuzzer-timeout.test.tmp-TimeoutTest -timeout=1 -timeout_exitcode=0
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 3784540391
INFO: Loaded 1 modules   (13 inline 8-bit counters): 13 [0x2aa0f4d0e80, 0x2aa0f4d0e8d), 
INFO: Loaded 1 PC tables (13 PCs): 13 [0x2aa0f4d0e90,0x2aa0f4d0f60), 
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2	INITED cov: 2 ft: 2 corp: 1/1b exec/s: 0 rss: 31Mb
#208	NEW    cov: 3 ft: 3 corp: 2/2b lim: 6 exec/s: 0 rss: 31Mb L: 1/1 MS: 1 ChangeByte-
#220	NEW    cov: 4 ft: 4 corp: 3/8b lim: 6 exec/s: 0 rss: 32Mb L: 6/6 MS: 2 CopyPart-InsertRepeatedBytes-
#287	REDUCE cov: 4 ft: 4 corp: 3/6b lim: 6 exec/s: 0 rss: 32Mb L: 4/4 MS: 2 CopyPart-EraseBytes-
#306	REDUCE cov: 4 ft: 4 corp: 3/5b lim: 6 exec/s: 0 rss: 32Mb L: 3/3 MS: 4 ChangeByte-ShuffleBytes-InsertByte-EraseBytes-
#317	REDUCE cov: 4 ft: 4 corp: 3/4b lim: 6 exec/s: 0 rss: 32Mb L: 2/2 MS: 1 EraseBytes-
#7159	REDUCE cov: 5 ft: 5 corp: 4/6b lim: 68 exec/s: 0 rss: 32Mb L: 2/2 MS: 2 ChangeBinInt-ChangeByte-
#7276	REDUCE cov: 6 ft: 6 corp: 5/9b lim: 68 exec/s: 0 rss: 32Mb L: 3/3 MS: 2 CrossOver-InsertByte-
ALARM: working on the last Unit for 1 seconds
       and the timeout value is 1 (use -timeout=N to change)
MS: 2 InsertByte-EraseBytes-; base unit: 57f5a44fdd9cf767b127bf151b0b86e0e73505fe
0x48,0x69,0x21,
Hi!
artifact_prefix='./'; Test unit written to ./timeout-c0a0ad26a634840c67a210fefdda76577b03a111
Base64: SGkh
==2911608== ERROR: libFuzzer: timeout after 1 seconds
AddressSanitizer:DEADLYSIGNAL
=================================================================
AddressSanitizer:DEADLYSIGNAL
=================================================================
AddressSanitizer: CHECK failed: asan_report.cpp:227 "((current_error_.kind)) == ((kErrorKindInvalid))" (0x1, 0x0) (tid=2911608)
    <empty stack>

MS: 2 InsertByte-EraseBytes-; base unit: 57f5a44fdd9cf767b127bf151b0b86e0e73505fe
0x48,0x69,0x21,
Hi!
artifact_prefix='./'; Test unit written to ./crash-c0a0ad26a634840c67a210fefdda76577b03a111
...

mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025
This patch creates a helper function named handleAllocSite to handle
the allocation site.  It makes readMemProf a little bit shorter.

I'm planning to move the code to handle call sites in a subsequent
patch.  Doing so in this patch would make this patch a lot longer
because we need to move other things like CallSiteEntry and
CallSiteEntryHash.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants