-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[LLVM][PassBuilder] Extend the function signature of callback for optimizer pipeline extension point #100953
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
|
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-backend-amdgpu Author: Shilei Tian (shiltian) ChangesThese callbacks can be invoked in multiple places when building an optimization In this patch, an extra argument is added to indicate its (Thin)LTO stage such Full diff: https://github.com/llvm/llvm-project/pull/100953.diff 7 Files Affected:
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index e765bbf637a66..64f0020a170aa 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -643,7 +643,7 @@ static void addKCFIPass(const Triple &TargetTriple, const LangOptions &LangOpts,
// Ensure we lower KCFI operand bundles with -O0.
PB.registerOptimizerLastEPCallback(
- [&](ModulePassManager &MPM, OptimizationLevel Level) {
+ [&](ModulePassManager &MPM, OptimizationLevel Level, ThinOrFullLTOPhase) {
if (Level == OptimizationLevel::O0 &&
LangOpts.Sanitize.has(SanitizerKind::KCFI))
MPM.addPass(createModuleToFunctionPassAdaptor(KCFIPass()));
@@ -662,8 +662,8 @@ static void addKCFIPass(const Triple &TargetTriple, const LangOptions &LangOpts,
static void addSanitizers(const Triple &TargetTriple,
const CodeGenOptions &CodeGenOpts,
const LangOptions &LangOpts, PassBuilder &PB) {
- auto SanitizersCallback = [&](ModulePassManager &MPM,
- OptimizationLevel Level) {
+ auto SanitizersCallback = [&](ModulePassManager &MPM, OptimizationLevel Level,
+ ThinOrFullLTOPhase) {
if (CodeGenOpts.hasSanitizeCoverage()) {
auto SancovOpts = getSancovOptsFromCGOpts(CodeGenOpts);
MPM.addPass(SanitizerCoveragePass(
@@ -749,7 +749,7 @@ static void addSanitizers(const Triple &TargetTriple,
PB.registerOptimizerEarlyEPCallback(
[SanitizersCallback](ModulePassManager &MPM, OptimizationLevel Level) {
ModulePassManager NewMPM;
- SanitizersCallback(NewMPM, Level);
+ SanitizersCallback(NewMPM, Level, ThinOrFullLTOPhase::None);
if (!NewMPM.isEmpty()) {
// Sanitizers can abandon<GlobalsAA>.
NewMPM.addPass(RequireAnalysisPass<GlobalsAA, llvm::Module>());
@@ -1018,11 +1018,12 @@ void EmitAssemblyHelper::RunOptimizationPipeline(
// TODO: Consider passing the MemoryProfileOutput to the pass builder via
// the PGOOptions, and set this up there.
if (!CodeGenOpts.MemoryProfileOutput.empty()) {
- PB.registerOptimizerLastEPCallback(
- [](ModulePassManager &MPM, OptimizationLevel Level) {
- MPM.addPass(createModuleToFunctionPassAdaptor(MemProfilerPass()));
- MPM.addPass(ModuleMemProfilerPass());
- });
+ PB.registerOptimizerLastEPCallback([](ModulePassManager &MPM,
+ OptimizationLevel Level,
+ ThinOrFullLTOPhase) {
+ MPM.addPass(createModuleToFunctionPassAdaptor(MemProfilerPass()));
+ MPM.addPass(ModuleMemProfilerPass());
+ });
}
if (CodeGenOpts.FatLTO) {
diff --git a/llvm/include/llvm/Passes/PassBuilder.h b/llvm/include/llvm/Passes/PassBuilder.h
index 474a19531ff5d..4c2763404ff05 100644
--- a/llvm/include/llvm/Passes/PassBuilder.h
+++ b/llvm/include/llvm/Passes/PassBuilder.h
@@ -497,7 +497,8 @@ class PassBuilder {
/// This extension point allows adding optimizations at the very end of the
/// function optimization pipeline.
void registerOptimizerLastEPCallback(
- const std::function<void(ModulePassManager &, OptimizationLevel)> &C) {
+ const std::function<void(ModulePassManager &, OptimizationLevel,
+ ThinOrFullLTOPhase)> &C) {
OptimizerLastEPCallbacks.push_back(C);
}
@@ -630,7 +631,8 @@ class PassBuilder {
void invokeOptimizerEarlyEPCallbacks(ModulePassManager &MPM,
OptimizationLevel Level);
void invokeOptimizerLastEPCallbacks(ModulePassManager &MPM,
- OptimizationLevel Level);
+ OptimizationLevel Level,
+ ThinOrFullLTOPhase Phase);
void invokeFullLinkTimeOptimizationEarlyEPCallbacks(ModulePassManager &MPM,
OptimizationLevel Level);
void invokeFullLinkTimeOptimizationLastEPCallbacks(ModulePassManager &MPM,
@@ -755,7 +757,9 @@ class PassBuilder {
// Module callbacks
SmallVector<std::function<void(ModulePassManager &, OptimizationLevel)>, 2>
OptimizerEarlyEPCallbacks;
- SmallVector<std::function<void(ModulePassManager &, OptimizationLevel)>, 2>
+ SmallVector<std::function<void(ModulePassManager &, OptimizationLevel,
+ ThinOrFullLTOPhase)>,
+ 2>
OptimizerLastEPCallbacks;
SmallVector<std::function<void(ModulePassManager &, OptimizationLevel)>, 2>
FullLinkTimeOptimizationEarlyEPCallbacks;
diff --git a/llvm/lib/Passes/PassBuilderPipelines.cpp b/llvm/lib/Passes/PassBuilderPipelines.cpp
index 757b20dcd6693..5827a485879d3 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -365,9 +365,10 @@ void PassBuilder::invokeOptimizerEarlyEPCallbacks(ModulePassManager &MPM,
C(MPM, Level);
}
void PassBuilder::invokeOptimizerLastEPCallbacks(ModulePassManager &MPM,
- OptimizationLevel Level) {
+ OptimizationLevel Level,
+ ThinOrFullLTOPhase Phase) {
for (auto &C : OptimizerLastEPCallbacks)
- C(MPM, Level);
+ C(MPM, Level, Phase);
}
void PassBuilder::invokeFullLinkTimeOptimizationEarlyEPCallbacks(
ModulePassManager &MPM, OptimizationLevel Level) {
@@ -1524,7 +1525,7 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
MPM.addPass(createModuleToFunctionPassAdaptor(std::move(OptimizePM),
PTO.EagerlyInvalidateAnalyses));
- invokeOptimizerLastEPCallbacks(MPM, Level);
+ invokeOptimizerLastEPCallbacks(MPM, Level, LTOPhase);
// Split out cold code. Splitting is done late to avoid hiding context from
// other optimizations and inadvertently regressing performance. The tradeoff
@@ -1671,7 +1672,8 @@ PassBuilder::buildThinLTOPreLinkDefaultPipeline(OptimizationLevel Level) {
// optimization is going to be done in PostLink stage, but clang can't add
// callbacks there in case of in-process ThinLTO called by linker.
invokeOptimizerEarlyEPCallbacks(MPM, Level);
- invokeOptimizerLastEPCallbacks(MPM, Level);
+ invokeOptimizerLastEPCallbacks(MPM, Level,
+ ThinOrFullLTOPhase::ThinLTOPreLink);
// Emit annotation remarks.
addAnnotationRemarksPass(MPM);
@@ -2159,7 +2161,7 @@ ModulePassManager PassBuilder::buildO0DefaultPipeline(OptimizationLevel Level,
CoroPM.addPass(GlobalDCEPass());
MPM.addPass(CoroConditionalWrapper(std::move(CoroPM)));
- invokeOptimizerLastEPCallbacks(MPM, Level);
+ invokeOptimizerLastEPCallbacks(MPM, Level, ThinOrFullLTOPhase::None);
if (LTOPreLink)
addRequiredLTOPreLinkPasses(MPM);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index 46cc5f349555a..50aef36724f70 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -287,8 +287,13 @@ class AMDGPUAttributorPass : public PassInfoMixin<AMDGPUAttributorPass> {
private:
TargetMachine &TM;
+ /// Asserts whether we can assume whole program visibility.
+ bool HasWholeProgramVisibility = false;
+
public:
- AMDGPUAttributorPass(TargetMachine &TM) : TM(TM){};
+ AMDGPUAttributorPass(TargetMachine &TM,
+ bool HasWholeProgramVisibility = false)
+ : TM(TM), HasWholeProgramVisibility(HasWholeProgramVisibility) {};
PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
};
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
index 51968063e8919..ab98da31b050f 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAttributor.cpp
@@ -1023,7 +1023,8 @@ static void addPreloadKernArgHint(Function &F, TargetMachine &TM) {
}
}
-static bool runImpl(Module &M, AnalysisGetter &AG, TargetMachine &TM) {
+static bool runImpl(Module &M, AnalysisGetter &AG, TargetMachine &TM,
+ bool HasWholeProgramVisibility) {
SetVector<Function *> Functions;
for (Function &F : M) {
if (!F.isIntrinsic())
@@ -1041,6 +1042,7 @@ static bool runImpl(Module &M, AnalysisGetter &AG, TargetMachine &TM) {
&AAUnderlyingObjects::ID, &AAIndirectCallInfo::ID});
AttributorConfig AC(CGUpdater);
+ AC.IsClosedWorldModule = HasWholeProgramVisibility;
AC.Allowed = &Allowed;
AC.IsModulePass = true;
AC.DefaultInitializeLiveInternals = false;
@@ -1086,7 +1088,7 @@ class AMDGPUAttributorLegacy : public ModulePass {
bool runOnModule(Module &M) override {
AnalysisGetter AG(this);
- return runImpl(M, AG, *TM);
+ return runImpl(M, AG, *TM, /*HasWholeProgramVisibility=*/false);
}
void getAnalysisUsage(AnalysisUsage &AU) const override {
@@ -1107,8 +1109,9 @@ PreservedAnalyses llvm::AMDGPUAttributorPass::run(Module &M,
AnalysisGetter AG(FAM);
// TODO: Probably preserves CFG
- return runImpl(M, AG, TM) ? PreservedAnalyses::none()
- : PreservedAnalyses::all();
+ return runImpl(M, AG, TM, HasWholeProgramVisibility)
+ ? PreservedAnalyses::none()
+ : PreservedAnalyses::all();
}
char AMDGPUAttributorLegacy::ID = 0;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index c8fb68d1c0b0c..50cc2d871d4ec 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -735,12 +735,15 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
});
// FIXME: Why is AMDGPUAttributor not in CGSCC?
- PB.registerOptimizerLastEPCallback(
- [this](ModulePassManager &MPM, OptimizationLevel Level) {
- if (Level != OptimizationLevel::O0) {
- MPM.addPass(AMDGPUAttributorPass(*this));
- }
- });
+ PB.registerOptimizerLastEPCallback([this](ModulePassManager &MPM,
+ OptimizationLevel Level,
+ ThinOrFullLTOPhase Phase) {
+ if (Level != OptimizationLevel::O0) {
+ MPM.addPass(AMDGPUAttributorPass(
+ *this, Phase == ThinOrFullLTOPhase::FullLTOPostLink ||
+ Phase == ThinOrFullLTOPhase::ThinLTOPostLink));
+ }
+ });
PB.registerFullLinkTimeOptimizationLastEPCallback(
[this](ModulePassManager &PM, OptimizationLevel Level) {
diff --git a/llvm/tools/opt/NewPMDriver.cpp b/llvm/tools/opt/NewPMDriver.cpp
index 374698083763b..522a8c06d83c0 100644
--- a/llvm/tools/opt/NewPMDriver.cpp
+++ b/llvm/tools/opt/NewPMDriver.cpp
@@ -310,7 +310,7 @@ static void registerEPCallbacks(PassBuilder &PB) {
});
if (tryParsePipelineText<ModulePassManager>(PB, OptimizerLastEPPipeline))
PB.registerOptimizerLastEPCallback(
- [&PB](ModulePassManager &PM, OptimizationLevel) {
+ [&PB](ModulePassManager &PM, OptimizationLevel, ThinOrFullLTOPhase) {
ExitOnError Err("Unable to parse OptimizerLastEP pipeline: ");
Err(PB.parsePassPipeline(PM, OptimizerLastEPPipeline));
});
|
|
I thought I tried this myself. Does it really work, as in, is the mode set properly for full and thin lto? I think I had problems for one of them. |
nikic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine to me in general. The patch stack seems to be messed up though, or at least this seems to contain some unrelated AMDGPU changes.
@jdoerfert Possibly the issue you saw is that this callback just isn't used by the post-link full LTO pipeline at all? It uses invokeFullLinkTimeOptimizationLastEPCallbacks instead. Maybe with this change it would make sense to remove that one and merge it into the main OptimizerLastEPCallback with the appropriate LTOPhase?
The other thing I wonder about is whether this argument should be added to other callbacks as well for consistency.
It has some AMD changes because I'd like to demonstrate how the changes will be used.
Yeah, I was wondering that as well. I'm happy to do the changes but not sure if that's necessary. |
76c0d45 to
f4f6b93
Compare
9c94910 to
ed46483
Compare
2bbf195 to
9b12959
Compare
ed46483 to
9980c1f
Compare
ec07983 to
76e5c51
Compare
9980c1f to
6e26b39
Compare
76e5c51 to
aa74047
Compare
6e26b39 to
8df9dec
Compare
aa74047 to
52df8e6
Compare
8df9dec to
913f6a7
Compare
|
ping if it is preferred to split the AMDGPU related changes to another PR, I can do that. |
52df8e6 to
5da7378
Compare
913f6a7 to
842d222
Compare
5da7378 to
5d7487b
Compare
842d222 to
225eca0
Compare
5d7487b to
ea3f6bb
Compare
225eca0 to
699c144
Compare
699c144 to
602fcdf
Compare
|
ping +1 |
|
This PR will not be needed if we decide to move |
…imizer pipeline extension point These callbacks can be invoked in multiple places when building an optimization pipeline, both in compile time and link time. However, there is no indicator on what pipeline it is currently building. In this patch, an extra argument is added to indicate its (Thin)LTO stage such that the callback can check it if needed. There is no test expected from this, and the benefit of this change will be demonstrated in #66488.
602fcdf to
0110da5
Compare
|
Split it, and maybe add the Phase to the rest as well. One commit to just make it consistent and pass the information along.
The problem is that the OptimizerLastEPCallback is not called in FullLTO post link, IIRC. So this will alone not solve our issue. However, we still should tell the callbacks what phase we are in, consistently. |
|
This PR is no longer needed. We decided to move the |

These callbacks can be invoked in multiple places when building an optimization
pipeline, both in compile time and link time. However, there is no indicator on
what pipeline it is currently building.
In this patch, an extra argument is added to indicate its (Thin)LTO stage such
that the callback can check it if needed. There is no test expected from this.