-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[SimplifyCFG] Eliminate dead edges of switches according to the domain of conditions #165748
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
efe45cf to
e47c26e
Compare
|
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-hexagon Author: Yingwei Zheng (dtcxzyw) ChangesIn simplifycfg/cvp/sccp, we eliminate dead edges of switches according to the knownbits/range info of conditions. However, these approximations may not meet the real-world needs when the domain of condition values is sparse. For example, if the condition can only be either -3 or 3, we cannot prove that the condition never evaluates to 1 (knownbits: ???????1, range: [-3, 4)). Note: In https://discourse.llvm.org/t/missed-optimization-due-to-overflow-check/88700 I proposed a new value lattice kind to represent such values. But I find it hard to apply because the transition becomes much complicated. Compile-time impact looks neutral: https://llvm-compile-time-tracker.com/compare.php?from=32d6b2139a6c8f79e074e8c6cfe0cc9e79c4c0c8&to=e47c26e3f1bf9eb062684dda4fafce58438e994b&stat=instructions:u Full diff: https://github.com/llvm/llvm-project/pull/165748.diff 8 Files Affected:
diff --git a/llvm/include/llvm/Analysis/ValueTracking.h b/llvm/include/llvm/Analysis/ValueTracking.h
index af218ba564081..093309cb8bbee 100644
--- a/llvm/include/llvm/Analysis/ValueTracking.h
+++ b/llvm/include/llvm/Analysis/ValueTracking.h
@@ -1024,6 +1024,16 @@ findValuesAffectedByCondition(Value *Cond, bool IsAssume,
LLVM_ABI Value *stripNullTest(Value *V);
LLVM_ABI const Value *stripNullTest(const Value *V);
+/// Enumerates all possible values of V and inserts them into the set \p
+/// Constants. If \p AllowUndefOrPoison is false, it fails when V may contain
+/// undef/poison elements. Returns true if the result is complete. Otherwise,
+/// the result is incomplete (more than MaxCount values).
+/// NOTE: The constant values are not distinct.
+LLVM_ABI bool
+collectPossibleValues(const Value *V,
+ SmallPtrSetImpl<const Constant *> &Constants,
+ unsigned MaxCount, bool AllowUndefOrPoison = true);
+
} // end namespace llvm
#endif // LLVM_ANALYSIS_VALUETRACKING_H
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 0a72076f51824..a289e9bf14e2e 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -10405,3 +10405,55 @@ const Value *llvm::stripNullTest(const Value *V) {
Value *llvm::stripNullTest(Value *V) {
return const_cast<Value *>(stripNullTest(const_cast<const Value *>(V)));
}
+
+bool llvm::collectPossibleValues(const Value *V,
+ SmallPtrSetImpl<const Constant *> &Constants,
+ unsigned MaxCount, bool AllowUndefOrPoison) {
+ SmallPtrSet<const Instruction *, 8> Visited;
+ SmallVector<const Instruction *, 8> Worklist;
+ auto Push = [&](const Value *V) -> bool {
+ if (auto *C = dyn_cast<Constant>(V)) {
+ if (!AllowUndefOrPoison && !isGuaranteedNotToBeUndefOrPoison(C))
+ return false;
+ // Check existence first to avoid unnecessary allocations.
+ if (Constants.contains(C))
+ return true;
+ if (Constants.size() == MaxCount)
+ return false;
+ Constants.insert(C);
+ return true;
+ }
+
+ if (auto *Inst = dyn_cast<Instruction>(V)) {
+ if (Visited.insert(Inst).second)
+ Worklist.push_back(Inst);
+ return true;
+ }
+ return false;
+ };
+ if (!Push(V))
+ return false;
+ while (!Worklist.empty()) {
+ const Instruction *CurInst = Worklist.pop_back_val();
+ switch (CurInst->getOpcode()) {
+ case Instruction::Select:
+ if (!Push(CurInst->getOperand(1)))
+ return false;
+ if (!Push(CurInst->getOperand(2)))
+ return false;
+ break;
+ case Instruction::PHI:
+ for (Value *IncomingValue : cast<PHINode>(CurInst)->incoming_values()) {
+ // Fast path for recurrence PHI.
+ if (IncomingValue == CurInst)
+ continue;
+ if (!Push(IncomingValue))
+ return false;
+ }
+ break;
+ default:
+ return false;
+ }
+ }
+ return true;
+}
diff --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index cbc604e87cf1a..b2ab3666a7f26 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -6020,6 +6020,8 @@ static bool eliminateDeadSwitchCases(SwitchInst *SI, DomTreeUpdater *DTU,
const DataLayout &DL) {
Value *Cond = SI->getCondition();
KnownBits Known = computeKnownBits(Cond, DL, AC, SI);
+ SmallPtrSet<const Constant *, 4> KnownValues;
+ bool IsKnownValuesValid = collectPossibleValues(Cond, KnownValues, 4);
// We can also eliminate cases by determining that their values are outside of
// the limited range of the condition based on how many significant (non-sign)
@@ -6039,15 +6041,18 @@ static bool eliminateDeadSwitchCases(SwitchInst *SI, DomTreeUpdater *DTU,
UniqueSuccessors.push_back(Successor);
++It->second;
}
- const APInt &CaseVal = Case.getCaseValue()->getValue();
+ ConstantInt *CaseC = Case.getCaseValue();
+ const APInt &CaseVal = CaseC->getValue();
if (Known.Zero.intersects(CaseVal) || !Known.One.isSubsetOf(CaseVal) ||
- (CaseVal.getSignificantBits() > MaxSignificantBitsInCond)) {
- DeadCases.push_back(Case.getCaseValue());
+ (CaseVal.getSignificantBits() > MaxSignificantBitsInCond) ||
+ (IsKnownValuesValid && !KnownValues.contains(CaseC))) {
+ DeadCases.push_back(CaseC);
if (DTU)
--NumPerSuccessorCases[Successor];
LLVM_DEBUG(dbgs() << "SimplifyCFG: switch case " << CaseVal
<< " is dead.\n");
- }
+ } else if (IsKnownValuesValid)
+ KnownValues.erase(CaseC);
}
// If we can prove that the cases must cover all possible values, the
@@ -6058,33 +6063,41 @@ static bool eliminateDeadSwitchCases(SwitchInst *SI, DomTreeUpdater *DTU,
const unsigned NumUnknownBits =
Known.getBitWidth() - (Known.Zero | Known.One).popcount();
assert(NumUnknownBits <= Known.getBitWidth());
- if (HasDefault && DeadCases.empty() &&
- NumUnknownBits < 64 /* avoid overflow */) {
- uint64_t AllNumCases = 1ULL << NumUnknownBits;
- if (SI->getNumCases() == AllNumCases) {
+ if (HasDefault && DeadCases.empty()) {
+ if (IsKnownValuesValid && all_of(KnownValues, IsaPred<UndefValue>)) {
createUnreachableSwitchDefault(SI, DTU);
return true;
}
- // When only one case value is missing, replace default with that case.
- // Eliminating the default branch will provide more opportunities for
- // optimization, such as lookup tables.
- if (SI->getNumCases() == AllNumCases - 1) {
- assert(NumUnknownBits > 1 && "Should be canonicalized to a branch");
- IntegerType *CondTy = cast<IntegerType>(Cond->getType());
- if (CondTy->getIntegerBitWidth() > 64 ||
- !DL.fitsInLegalInteger(CondTy->getIntegerBitWidth()))
- return false;
- uint64_t MissingCaseVal = 0;
- for (const auto &Case : SI->cases())
- MissingCaseVal ^= Case.getCaseValue()->getValue().getLimitedValue();
- auto *MissingCase =
- cast<ConstantInt>(ConstantInt::get(Cond->getType(), MissingCaseVal));
- SwitchInstProfUpdateWrapper SIW(*SI);
- SIW.addCase(MissingCase, SI->getDefaultDest(), SIW.getSuccessorWeight(0));
- createUnreachableSwitchDefault(SI, DTU, /*RemoveOrigDefaultBlock*/ false);
- SIW.setSuccessorWeight(0, 0);
- return true;
+ if (NumUnknownBits < 64 /* avoid overflow */) {
+ uint64_t AllNumCases = 1ULL << NumUnknownBits;
+ if (SI->getNumCases() == AllNumCases) {
+ createUnreachableSwitchDefault(SI, DTU);
+ return true;
+ }
+ // When only one case value is missing, replace default with that case.
+ // Eliminating the default branch will provide more opportunities for
+ // optimization, such as lookup tables.
+ if (SI->getNumCases() == AllNumCases - 1) {
+ assert(NumUnknownBits > 1 && "Should be canonicalized to a branch");
+ IntegerType *CondTy = cast<IntegerType>(Cond->getType());
+ if (CondTy->getIntegerBitWidth() > 64 ||
+ !DL.fitsInLegalInteger(CondTy->getIntegerBitWidth()))
+ return false;
+
+ uint64_t MissingCaseVal = 0;
+ for (const auto &Case : SI->cases())
+ MissingCaseVal ^= Case.getCaseValue()->getValue().getLimitedValue();
+ auto *MissingCase = cast<ConstantInt>(
+ ConstantInt::get(Cond->getType(), MissingCaseVal));
+ SwitchInstProfUpdateWrapper SIW(*SI);
+ SIW.addCase(MissingCase, SI->getDefaultDest(),
+ SIW.getSuccessorWeight(0));
+ createUnreachableSwitchDefault(SI, DTU,
+ /*RemoveOrigDefaultBlock*/ false);
+ SIW.setSuccessorWeight(0, 0);
+ return true;
+ }
}
}
diff --git a/llvm/test/CodeGen/AArch64/arm64-ccmp.ll b/llvm/test/CodeGen/AArch64/arm64-ccmp.ll
index cad5df0d9655e..68ab8902767b3 100644
--- a/llvm/test/CodeGen/AArch64/arm64-ccmp.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-ccmp.ll
@@ -430,12 +430,12 @@ declare i32 @foo()
; Test case distilled from 126.gcc.
; The phi in sw.bb.i.i gets multiple operands for the %entry predecessor.
-define void @build_modify_expr() nounwind ssp {
+define void @build_modify_expr(i32 %cond) nounwind ssp {
; CHECK-LABEL: build_modify_expr:
; CHECK: ; %bb.0: ; %entry
; CHECK-NEXT: ret
entry:
- switch i32 undef, label %sw.bb.i.i [
+ switch i32 %cond, label %sw.bb.i.i [
i32 69, label %if.end85
i32 70, label %if.end85
i32 71, label %if.end85
diff --git a/llvm/test/CodeGen/Hexagon/vect/zext-v4i1.ll b/llvm/test/CodeGen/Hexagon/vect/zext-v4i1.ll
index 559bb68741e12..930cf8152b756 100644
--- a/llvm/test/CodeGen/Hexagon/vect/zext-v4i1.ll
+++ b/llvm/test/CodeGen/Hexagon/vect/zext-v4i1.ll
@@ -6,11 +6,11 @@
target datalayout = "e-m:e-p:32:32:32-a:0-n16:32-i64:64:64-i32:32:32-i16:16:16-i1:8:8-f32:32:32-f64:64:64-v32:32:32-v64:64:64-v512:512:512-v1024:1024:1024-v2048:2048:2048"
target triple = "hexagon"
-define i32 @fred(ptr %a0) #0 {
+define i32 @fred(ptr %a0, i32 %cond) #0 {
; CHECK-LABEL: fred:
; CHECK: // %bb.0: // %b0
; CHECK-NEXT: {
-; CHECK-NEXT: if (p0) jump:nt .LBB0_2
+; CHECK-NEXT: p0 = cmp.eq(r1,#5); if (!p0.new) jump:t .LBB0_2
; CHECK-NEXT: }
; CHECK-NEXT: // %bb.1: // %b2
; CHECK-NEXT: {
@@ -40,7 +40,7 @@ define i32 @fred(ptr %a0) #0 {
; CHECK-NEXT: jumpr r31
; CHECK-NEXT: }
b0:
- switch i32 undef, label %b14 [
+ switch i32 %cond, label %b14 [
i32 5, label %b2
i32 3, label %b1
]
diff --git a/llvm/test/Transforms/SimplifyCFG/switch-on-const-select.ll b/llvm/test/Transforms/SimplifyCFG/switch-on-const.ll
similarity index 54%
rename from llvm/test/Transforms/SimplifyCFG/switch-on-const-select.ll
rename to llvm/test/Transforms/SimplifyCFG/switch-on-const.ll
index e8b58639c13dd..1ab1b5e8bd838 100644
--- a/llvm/test/Transforms/SimplifyCFG/switch-on-const-select.ll
+++ b/llvm/test/Transforms/SimplifyCFG/switch-on-const.ll
@@ -154,6 +154,132 @@ bees:
unreachable
}
+define void @pr165179(i1 %cond) {
+; CHECK-LABEL: @pr165179(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br i1 [[COND:%.*]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: tail call void @bees.a() #[[ATTR0]]
+; CHECK-NEXT: br label [[SWITCHBB:%.*]]
+; CHECK: if.else:
+; CHECK-NEXT: tail call void @bees.b() #[[ATTR0]]
+; CHECK-NEXT: br label [[SWITCHBB]]
+; CHECK: exit:
+; CHECK-NEXT: tail call void @bees.a() #[[ATTR0]]
+; CHECK-NEXT: ret void
+;
+entry:
+ br i1 %cond, label %if.then, label %if.else
+
+if.then:
+ tail call void @bees.a() nounwind
+ br label %switchbb
+
+if.else:
+ tail call void @bees.b() nounwind
+ br label %switchbb
+
+switchbb:
+ %cond1 = phi i32 [ 1, %if.else ], [ -1, %if.then ]
+ switch i32 %cond1, label %default [
+ i32 1, label %exit
+ i32 -1, label %exit
+ ]
+
+exit:
+ tail call void @bees.a() nounwind
+ ret void
+
+default:
+ tail call void @bees.b() nounwind
+ ret void
+}
+
+define void @switch_remove_dead_case_phi(i1 %cond1, i1 %cond2) {
+; CHECK-LABEL: @switch_remove_dead_case_phi(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br i1 [[COND1:%.*]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: tail call void @bees.a() #[[ATTR0]]
+; CHECK-NEXT: br i1 [[COND2:%.*]], label [[SWITCHBB:%.*]], label [[IF_ELSE]]
+; CHECK: if.else:
+; CHECK-NEXT: [[PHI:%.*]] = phi i32 [ 3, [[ENTRY:%.*]] ], [ -1, [[IF_THEN]] ]
+; CHECK-NEXT: tail call void @bees.b() #[[ATTR0]]
+; CHECK-NEXT: br label [[SWITCHBB]]
+; CHECK: switchbb:
+; CHECK-NEXT: [[COND:%.*]] = phi i32 [ [[PHI]], [[IF_ELSE]] ], [ 5, [[IF_THEN]] ]
+; CHECK-NEXT: [[COND3:%.*]] = icmp eq i32 [[COND]], -1
+; CHECK-NEXT: br i1 [[COND3]], label [[EXIT:%.*]], label [[DEFAULT:%.*]]
+; CHECK: common.ret:
+; CHECK-NEXT: ret void
+; CHECK: exit:
+; CHECK-NEXT: tail call void @bees.a() #[[ATTR0]]
+; CHECK-NEXT: br label [[COMMON_RET:%.*]]
+; CHECK: default:
+; CHECK-NEXT: tail call void @bees.b() #[[ATTR0]]
+; CHECK-NEXT: br label [[COMMON_RET]]
+;
+entry:
+ br i1 %cond1, label %if.then, label %if.else
+
+if.then:
+ tail call void @bees.a() nounwind
+ br i1 %cond2, label %switchbb, label %if.else
+
+if.else:
+ %phi = phi i32 [ 3, %entry ], [ -1, %if.then ]
+ tail call void @bees.b() nounwind
+ br label %switchbb
+
+switchbb:
+ %cond = phi i32 [ %phi, %if.else ], [ 5, %if.then ]
+ switch i32 %cond, label %default [
+ i32 1, label %exit
+ i32 -1, label %exit
+ ]
+
+exit:
+ tail call void @bees.a() nounwind
+ ret void
+
+default:
+ tail call void @bees.b() nounwind
+ ret void
+}
+
+define void @switch_remove_dead_case_select(i1 %cond1, i1 %cond2) {
+; CHECK-LABEL: @switch_remove_dead_case_select(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[X:%.*]] = select i1 [[COND1:%.*]], i32 -1, i32 3
+; CHECK-NEXT: [[Y:%.*]] = select i1 [[COND2:%.*]], i32 [[X]], i32 5
+; CHECK-NEXT: [[COND:%.*]] = icmp eq i32 [[Y]], -1
+; CHECK-NEXT: br i1 [[COND]], label [[EXIT:%.*]], label [[DEFAULT:%.*]]
+; CHECK: common.ret:
+; CHECK-NEXT: ret void
+; CHECK: exit:
+; CHECK-NEXT: tail call void @bees.a() #[[ATTR0]]
+; CHECK-NEXT: br label [[COMMON_RET:%.*]]
+; CHECK: default:
+; CHECK-NEXT: tail call void @bees.b() #[[ATTR0]]
+; CHECK-NEXT: br label [[COMMON_RET]]
+;
+entry:
+ %x = select i1 %cond1, i32 -1, i32 3
+ %y = select i1 %cond2, i32 %x, i32 5
+ switch i32 %y, label %default [
+ i32 1, label %exit
+ i32 -1, label %exit
+ ]
+
+exit:
+ tail call void @bees.a() nounwind
+ ret void
+
+default:
+ tail call void @bees.b() nounwind
+ ret void
+}
+
declare void @llvm.trap() nounwind noreturn
declare void @bees.a() nounwind
declare void @bees.b() nounwind
diff --git a/llvm/test/Transforms/SimplifyCFG/switch_mask.ll b/llvm/test/Transforms/SimplifyCFG/switch_mask.ll
index f8bcbc057a7ae..428c18fc18e3d 100644
--- a/llvm/test/Transforms/SimplifyCFG/switch_mask.ll
+++ b/llvm/test/Transforms/SimplifyCFG/switch_mask.ll
@@ -221,6 +221,7 @@ define i1 @pr88607() {
; CHECK-NEXT: entry:
; CHECK-NEXT: [[COND:%.*]] = select i1 false, i32 4, i32 1
; CHECK-NEXT: [[SPEC_SELECT:%.*]] = select i1 false, i32 2, i32 [[COND]]
+; CHECK-NEXT: [[COND1:%.*]] = icmp eq i32 [[SPEC_SELECT]], 1
; CHECK-NEXT: ret i1 false
;
entry:
diff --git a/llvm/test/Transforms/SimplifyCFG/switch_undef.ll b/llvm/test/Transforms/SimplifyCFG/switch_undef.ll
index 88a729b7d941a..4de5ea948ed27 100644
--- a/llvm/test/Transforms/SimplifyCFG/switch_undef.ll
+++ b/llvm/test/Transforms/SimplifyCFG/switch_undef.ll
@@ -5,12 +5,11 @@
define void @f6() #0 {
; CHECK-LABEL: @f6(
; CHECK-NEXT: entry:
-; CHECK-NEXT: br label [[FOR_COND_I:%.*]]
-; CHECK: for.cond.i:
+; CHECK-NEXT: br label [[F1_EXIT_I:%.*]]
+; CHECK: f1.exit.i:
; CHECK-NEXT: [[TOBOOL7_I:%.*]] = icmp ne i16 1, 0
-; CHECK-NEXT: br label [[FOR_COND_I]]
+; CHECK-NEXT: br label [[F1_EXIT_I]]
;
-
entry:
br label %for.cond.i
|
dianqk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 LGTM.
|
This patch seem to be a bit aggressive in removing switch cases. https://godbolt.org/z/9GfWqPcrh - C example |
| (CaseVal.getSignificantBits() > MaxSignificantBitsInCond)) { | ||
| DeadCases.push_back(Case.getCaseValue()); | ||
| (CaseVal.getSignificantBits() > MaxSignificantBitsInCond) || | ||
| (IsKnownValuesValid && !KnownValues.contains(CaseC))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not work properly.
For the example given by @henrickarlsson collectPossibleValues is picking up "i32 ptrtoint (ptr @g to i32)" as a known value., The contains check here will not match the ConstantInt's found in the switch cases with any not yet resolved symbols.
Maybe collectPossibleValues only should return true when finding ConstantInt values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@henrickarlsson @bjope Thanks for the report. I will take a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See #168084
In #165748 constant expressions were allowed in `collectPossibleValues` because we are still using insertelement + shufflevector idioms to represent a scalable vector splat. However, it also accepts some unresolved constants like ptrtoint of globals or pointer difference between two globals. Absolutely we can ask the user to check this case with the constant folding API. However, since we don't observe the real-world usefulness of handling constant expressions, I decide to be more conservative and only handle immediate constants in the helper function. With this patch, we don't need to touch the SimplifyCFG part, as the values can only be either ConstantInt or undef/poison values (NB: switch on undef condition is UB). Fix the miscompilation reported by #165748 (comment)
…ions (#168084) In llvm/llvm-project#165748 constant expressions were allowed in `collectPossibleValues` because we are still using insertelement + shufflevector idioms to represent a scalable vector splat. However, it also accepts some unresolved constants like ptrtoint of globals or pointer difference between two globals. Absolutely we can ask the user to check this case with the constant folding API. However, since we don't observe the real-world usefulness of handling constant expressions, I decide to be more conservative and only handle immediate constants in the helper function. With this patch, we don't need to touch the SimplifyCFG part, as the values can only be either ConstantInt or undef/poison values (NB: switch on undef condition is UB). Fix the miscompilation reported by llvm/llvm-project#165748 (comment)
In simplifycfg/cvp/sccp, we eliminate dead edges of switches according to the knownbits/range info of conditions. However, these approximations may not meet the real-world needs when the domain of condition values is sparse. For example, if the condition can only be either -3 or 3, we cannot prove that the condition never evaluates to 1 (knownbits: ???????1, range: [-3, 4)).
This patch adds a helper function
collectPossibleValuesto enumerate all the possible values of V. To fix the motivating issue,eliminateDeadSwitchCaseswill use the result to remove dead edges.Note: In https://discourse.llvm.org/t/missed-optimization-due-to-overflow-check/88700 I proposed a new value lattice kind to represent such values. But I find it hard to apply because the transition becomes much complicated.
Compile-time impact looks neutral: https://llvm-compile-time-tracker.com/compare.php?from=32d6b2139a6c8f79e074e8c6cfe0cc9e79c4c0c8&to=e47c26e3f1bf9eb062684dda4fafce58438e994b&stat=instructions:u
This patch removes many dead error-handling codes: dtcxzyw/llvm-opt-benchmark#3012
Closes #165179.