Skip to content

Conversation

@aaupov
Copy link
Contributor

@aaupov aaupov commented Jul 24, 2025

getFallthroughsInTrace requires CFG for functions not covered by BAT,
even in BAT/fdata mode. BAT-covered functions go through special
handling in fdata (BAT->getFallthroughsInTrace) and YAML
(DataAggregator::writeBATYAML) modes.

Since all modes (BAT/no-BAT, YAML/fdata) now need disassembly/CFG
construction:

  • drop special BAT/fdata handling that omitted disassembly/CFG in
    RewriteInstance::run, enabling CFG for all non-BAT functions,
  • switch getFallthroughsInTrace to check if a function has CFG,
  • which allows emitting profile for non-simple functions in all modes.

Previously, traces in non-simple functions were reported as invalid/
mismatching disassembled function contents. This change reduces the
number of such invalid traces and increases the number of profiled
functions. These functions may participate in function reordering via
call graph profile.

Test Plan: updated unclaimed-jt-entries.s

aaupov and others added 4 commits July 24, 2025 11:16
Created using spr 1.3.4
Created using spr 1.3.4
Created using spr 1.3.4
@aaupov aaupov marked this pull request as ready for review July 24, 2025 20:58
@llvmbot llvmbot added the BOLT label Jul 24, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 24, 2025

@llvm/pr-subscribers-bolt

Author: Amir Ayupov (aaupov)

Changes

getFallthroughsInTrace for non-BAT functions requires CFG even for
fdata output only. Since all modes (BAT/non-BAT, YAML/fdata) need CFG:

  • drop special BAT/fdata handling in RewriteInstance::run,
  • in getFallthroughsInTrace, directly check if a function has CFG
    constructed => allow non-simple functions.

Test Plan: updated unclaimed-jt-entries.s


Full diff: https://github.com/llvm/llvm-project/pull/150488.diff

3 Files Affected:

  • (modified) bolt/lib/Profile/DataAggregator.cpp (+2-3)
  • (modified) bolt/lib/Rewrite/RewriteInstance.cpp (-15)
  • (modified) bolt/test/X86/unclaimed-jt-entries.s (+16-2)
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index e236cbab8fc87..f56f5963305c5 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -906,11 +906,10 @@ DataAggregator::getFallthroughsInTrace(BinaryFunction &BF, const Trace &Trace,
   if (BF.isPseudo())
     return Branches;
 
-  if (!BF.isSimple())
+  // Can only record traces in CFG state
+  if (!BF.hasCFG())
     return std::nullopt;
 
-  assert(BF.hasCFG() && "can only record traces in CFG state");
-
   const BinaryBasicBlock *FromBB = BF.getBasicBlockContainingOffset(From);
   const BinaryBasicBlock *ToBB = BF.getBasicBlockContainingOffset(To);
 
diff --git a/bolt/lib/Rewrite/RewriteInstance.cpp b/bolt/lib/Rewrite/RewriteInstance.cpp
index 7fbc933cade5b..7fac315ca9b27 100644
--- a/bolt/lib/Rewrite/RewriteInstance.cpp
+++ b/bolt/lib/Rewrite/RewriteInstance.cpp
@@ -714,21 +714,6 @@ Error RewriteInstance::run() {
 
   preprocessProfileData();
 
-  // Skip disassembling if we have a translation table and we are running an
-  // aggregation job.
-  if (opts::AggregateOnly && BAT->enabledFor(InputFile)) {
-    // YAML profile in BAT mode requires CFG for .bolt.org.text functions
-    if (!opts::SaveProfile.empty() ||
-        opts::ProfileFormat == opts::ProfileFormatKind::PF_YAML) {
-      selectFunctionsToProcess();
-      disassembleFunctions();
-      processMetadataPreCFG();
-      buildFunctionsCFG();
-    }
-    processProfileData();
-    return Error::success();
-  }
-
   selectFunctionsToProcess();
 
   readDebugInfo();
diff --git a/bolt/test/X86/unclaimed-jt-entries.s b/bolt/test/X86/unclaimed-jt-entries.s
index 1102e4ae413e2..31b72c47125ae 100644
--- a/bolt/test/X86/unclaimed-jt-entries.s
+++ b/bolt/test/X86/unclaimed-jt-entries.s
@@ -18,6 +18,18 @@
 
 # RUN: llvm-mc -filetype=obj -triple x86_64-unknown-unknown %s -o %t.o
 # RUN: %clang %cflags -no-pie %t.o -o %t.exe -Wl,-q
+
+## Check that non-simple function profile is emitted in perf2bolt mode
+# RUN: link_fdata %s %t.exe %t.pa PREAGG
+# RUN: llvm-strip -N L5 -N L5_ret %t.exe
+# RUN: perf2bolt %t.exe -p %t.pa --pa -o %t.fdata -strict=0 -print-profile \
+# RUN:   -print-only=main | FileCheck %s --check-prefix=CHECK-P2B
+# CHECK-P2B: PERF2BOLT: traces mismatching disassembled function contents: 0
+# CHECK-P2B: Binary Function "main"
+# CHECK-P2B: IsSimple : 0
+# RUN: FileCheck %s --input-file %t.fdata --check-prefix=CHECK-FDATA
+# CHECK-FDATA: 1 main 0 1 main 7 0 1
+
 # RUN: llvm-bolt %t.exe -v=1 -o %t.out 2>&1 | FileCheck %s
 
 # CHECK: BOLT-WARNING: unclaimed data to code reference (possibly an unrecognized jump table entry) to .Ltmp[[#]] in main
@@ -33,8 +45,10 @@
   .size main, .Lend-main
 main:
   jmp *L4-24(,%rdi,8)
-.L5:
+# PREAGG: T #main# #L5# #L5_ret# 1
+L5:
   movl $4, %eax
+L5_ret:
   ret
 .L9:
   movl $2, %eax
@@ -58,7 +72,7 @@ L4:
   .quad .L3
   .quad .L6
   .quad .L3
-  .quad .L5
+  .quad L5
   .quad .L3
   .quad .L3
   .quad .L3

lucas-rami and others added 2 commits July 25, 2025 04:38
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
@aaupov aaupov changed the base branch from users/aaupov/spr/main.bolt-require-cfg-in-bat-mode to main July 25, 2025 11:38
@aaupov aaupov merged commit a850912 into main Jul 25, 2025
14 of 15 checks passed
@aaupov aaupov deleted the users/aaupov/spr/bolt-require-cfg-in-bat-mode branch July 25, 2025 11:54
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025
`getFallthroughsInTrace` requires CFG for functions not covered by BAT,
even in BAT/fdata mode. BAT-covered functions go through special
handling in fdata (`BAT->getFallthroughsInTrace`) and YAML
(`DataAggregator::writeBATYAML`) modes.

Since all modes (BAT/no-BAT, YAML/fdata) now need disassembly/CFG
construction:
- drop special BAT/fdata handling that omitted disassembly/CFG in
  `RewriteInstance::run`, enabling *CFG for all non-BAT functions*,
- switch `getFallthroughsInTrace` to check if a function has CFG,
- which *allows emitting profile for non-simple functions* in all modes.

Previously, traces in non-simple functions were reported as invalid/
mismatching disassembled function contents. This change reduces the
number of such invalid traces and increases the number of profiled
functions. These functions may participate in function reordering via
call graph profile.

Test Plan: updated unclaimed-jt-entries.s
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants