[ValueTracking] Support GEPs in matchSimpleRecurrence. #123518
base: main
✅ With the latest revision this PR passed the C/C++ code formatter.
f8140fd to a8e0e4a
@llvm/pr-subscribers-llvm-analysis

Author: Florian Hahn (fhahn)

Changes

Update matchSimpleRecurrence to also support GEPs. This allows inferring [...]

I noticed that we fail to infer alignments from calls when dropping [...]

For now, it is limited to cases where the source element type is i8.

It comes with a bit of a compile-time impact: stage1-O3: +0.05%

Patch is 25.88 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/123518.diff

3 Files Affected:
diff --git a/llvm/include/llvm/Analysis/ValueTracking.h b/llvm/include/llvm/Analysis/ValueTracking.h
index b4918c2d1e8a18..8b72e605342f14 100644
--- a/llvm/include/llvm/Analysis/ValueTracking.h
+++ b/llvm/include/llvm/Analysis/ValueTracking.h
@@ -1245,7 +1245,11 @@ bool matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO, Value *&Start,
Value *&Step);
/// Analogous to the above, but starting from the binary operator
-bool matchSimpleRecurrence(const BinaryOperator *I, PHINode *&P, Value *&Start,
+bool matchSimpleRecurrence(const Instruction *I, PHINode *&P, Value *&Start,
+ Value *&Step);
+
+/// Analogous to the above, but also supporting non-binary operators.
+bool matchSimpleRecurrence(const PHINode *P, Instruction *&BO, Value *&Start,
Value *&Step);
/// Return true if RHS is known to be implied true by LHS. Return false if
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 6e2f0ebde9bb6c..d9c2ce4df92e7c 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1489,7 +1489,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
}
case Instruction::PHI: {
const PHINode *P = cast<PHINode>(I);
- BinaryOperator *BO = nullptr;
+ Instruction *BO = nullptr;
Value *R = nullptr, *L = nullptr;
if (matchSimpleRecurrence(P, BO, R, L)) {
// Handle the case of a simple two-predecessor recurrence PHI.
@@ -1553,6 +1553,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
case Instruction::Sub:
case Instruction::And:
case Instruction::Or:
+ case Instruction::GetElementPtr:
case Instruction::Mul: {
// Change the context instruction to the "edge" that flows into the
// phi. This is important because that is where the value is actually
@@ -1571,12 +1572,21 @@ static void computeKnownBitsFromOperator(const Operator *I,
// We need to take the minimum number of known bits
KnownBits Known3(BitWidth);
+ if (BitWidth != getBitWidth(L->getType(), Q.DL)) {
+ assert(isa<GetElementPtrInst>(BO) &&
+ "Bitwidth should only be different for GEPs.");
+ break;
+ }
RecQ.CxtI = LInst;
computeKnownBits(L, DemandedElts, Known3, Depth + 1, RecQ);
Known.Zero.setLowBits(std::min(Known2.countMinTrailingZeros(),
Known3.countMinTrailingZeros()));
+ // Don't apply logic below for GEPs.
+ if (isa<GetElementPtrInst>(BO))
+ break;
+
auto *OverflowOp = dyn_cast<OverflowingBinaryOperator>(BO);
if (!OverflowOp || !Q.IIQ.hasNoSignedWrap(OverflowOp))
break;
@@ -1737,6 +1747,7 @@ static void computeKnownBitsFromOperator(const Operator *I,
Known.resetAll();
}
}
+
if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
switch (II->getIntrinsicID()) {
default:
@@ -2270,7 +2281,7 @@ void computeKnownBits(const Value *V, const APInt &DemandedElts,
/// always a power of two (or zero).
static bool isPowerOfTwoRecurrence(const PHINode *PN, bool OrZero,
unsigned Depth, SimplifyQuery &Q) {
- BinaryOperator *BO = nullptr;
+ Instruction *BO = nullptr;
Value *Start = nullptr, *Step = nullptr;
if (!matchSimpleRecurrence(PN, BO, Start, Step))
return false;
@@ -2308,7 +2319,7 @@ static bool isPowerOfTwoRecurrence(const PHINode *PN, bool OrZero,
// Divisor must be a power of two.
// If OrZero is false, cannot guarantee induction variable is non-zero after
// division, same for Shr, unless it is exact division.
- return (OrZero || Q.IIQ.isExact(BO)) &&
+ return (OrZero || Q.IIQ.isExact(cast<BinaryOperator>(BO))) &&
isKnownToBeAPowerOfTwo(Step, false, Depth, Q);
case Instruction::Shl:
return OrZero || Q.IIQ.hasNoUnsignedWrap(BO) || Q.IIQ.hasNoSignedWrap(BO);
@@ -2317,7 +2328,7 @@ static bool isPowerOfTwoRecurrence(const PHINode *PN, bool OrZero,
return false;
[[fallthrough]];
case Instruction::LShr:
- return OrZero || Q.IIQ.isExact(BO);
+ return OrZero || Q.IIQ.isExact(cast<BinaryOperator>(BO));
default:
return false;
}
@@ -2727,7 +2738,7 @@ static bool rangeMetadataExcludesValue(const MDNode* Ranges, const APInt& Value)
/// Try to detect a recurrence that monotonically increases/decreases from a
/// non-zero starting value. These are common as induction variables.
static bool isNonZeroRecurrence(const PHINode *PN) {
- BinaryOperator *BO = nullptr;
+ Instruction *BO = nullptr;
Value *Start = nullptr, *Step = nullptr;
const APInt *StartC, *StepC;
if (!matchSimpleRecurrence(PN, BO, Start, Step) ||
@@ -3560,9 +3571,9 @@ getInvertibleOperands(const Operator *Op1,
// If PN1 and PN2 are both recurrences, can we prove the entire recurrences
// are a single invertible function of the start values? Note that repeated
// application of an invertible function is also invertible
- BinaryOperator *BO1 = nullptr;
+ Instruction *BO1 = nullptr;
Value *Start1 = nullptr, *Step1 = nullptr;
- BinaryOperator *BO2 = nullptr;
+ Instruction *BO2 = nullptr;
Value *Start2 = nullptr, *Step2 = nullptr;
if (PN1->getParent() != PN2->getParent() ||
!matchSimpleRecurrence(PN1, BO1, Start1, Step1) ||
@@ -9199,6 +9210,17 @@ llvm::canConvertToMinOrMaxIntrinsic(ArrayRef<Value *> VL) {
bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
Value *&Start, Value *&Step) {
+ Instruction *I;
+ if (matchSimpleRecurrence(P, I, Start, Step)) {
+ BO = dyn_cast<BinaryOperator>(I);
+ if (BO)
+ return true;
+ }
+ return false;
+}
+
+bool llvm::matchSimpleRecurrence(const PHINode *P, Instruction *&BO,
+ Value *&Start, Value *&Step) {
// Handle the case of a simple two-predecessor recurrence PHI.
// There's a lot more that could theoretically be done here, but
// this is sufficient to catch some interesting cases.
@@ -9208,7 +9230,7 @@ bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
for (unsigned i = 0; i != 2; ++i) {
Value *L = P->getIncomingValue(i);
Value *R = P->getIncomingValue(!i);
- auto *LU = dyn_cast<BinaryOperator>(L);
+ auto *LU = dyn_cast<Instruction>(L);
if (!LU)
continue;
unsigned Opcode = LU->getOpcode();
@@ -9240,6 +9262,21 @@ bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
break; // Match!
}
+ case Instruction::GetElementPtr: {
+ if (LU->getNumOperands() != 2 ||
+ !cast<GetElementPtrInst>(L)->getSourceElementType()->isIntegerTy(8))
+ continue;
+
+ Value *LL = LU->getOperand(0);
+ Value *LR = LU->getOperand(1);
+ // Find a recurrence.
+ if (LL == P) {
+ // Found a match
+ L = LR;
+ break;
+ }
+ continue;
+ }
};
// We have matched a recurrence of the form:
@@ -9256,9 +9293,9 @@ bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
return false;
}
-bool llvm::matchSimpleRecurrence(const BinaryOperator *I, PHINode *&P,
+bool llvm::matchSimpleRecurrence(const Instruction *I, PHINode *&P,
Value *&Start, Value *&Step) {
- BinaryOperator *BO = nullptr;
+ Instruction *BO = nullptr;
P = dyn_cast<PHINode>(I->getOperand(0));
if (!P)
P = dyn_cast<PHINode>(I->getOperand(1));
diff --git a/llvm/test/Transforms/InferAlignment/gep-recurrence.ll b/llvm/test/Transforms/InferAlignment/gep-recurrence.ll
new file mode 100644
index 00000000000000..f51875adcd862f
--- /dev/null
+++ b/llvm/test/Transforms/InferAlignment/gep-recurrence.ll
@@ -0,0 +1,574 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 2
+; RUN: opt < %s -passes=infer-alignment -S | FileCheck %s
+
+target datalayout = "p1:64:64:64:32"
+
+declare i1 @cond()
+
+define void @test_recur_i8_128(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_128
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 128
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 128
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 128
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_128_no_inbounds(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_128_no_inbounds
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 128
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr i8, ptr [[IV]], i64 128
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr i8, ptr %iv, i64 128
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_64(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_64
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 64
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 64
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 64
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_63(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_63
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 63
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 63
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_32(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_32
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 32
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 32
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 32
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_16(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_16
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 16
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 16
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 16
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_8(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_8
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 8
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 8
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 8
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_4(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_4
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 4
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 4
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 4
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_2(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_2
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 2
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 2
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 2
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_1(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_1
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 1
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 1
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_unknown_step(ptr align 128 %dst, i64 %off) {
+; CHECK-LABEL: define void @test_recur_i8_unknown_step
+; CHECK-SAME: (ptr align 128 [[DST:%.*]], i64 [[OFF:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 [[OFF]]
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 %off
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_step_known_multiple(ptr align 128 %dst, i64 %off) {
+; CHECK-LABEL: define void @test_recur_i8_step_known_multiple
+; CHECK-SAME: (ptr align 128 [[DST:%.*]], i64 [[OFF:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[UREM:%.*]] = urem i64 [[OFF]], 128
+; CHECK-NEXT: [[C_UREM:%.*]] = icmp eq i64 [[UREM]], 0
+; CHECK-NEXT: [[C_POS:%.*]] = icmp sge i64 [[OFF]], 0
+; CHECK-NEXT: [[AND:%.*]] = and i1 [[C_UREM]], [[C_POS]]
+; CHECK-NEXT: br i1 [[AND]], label [[LOOP:%.*]], label [[EXIT:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 [[OFF]]
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ %urem = urem i64 %off, 128
+ %c.urem = icmp eq i64 %urem, 0
+ %c.pos = icmp sge i64 %off, 0
+ %and = and i1 %c.urem, %c.pos
+ br i1 %and, label %loop, label %exit
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 %off
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_i16_128(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_i16_128
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i16 128
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i16 128
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_i8_132(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_i8_132
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHEC...
[truncated]
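
The alignment results in the test file above follow from the `std::min` over trailing zero counts in the PHI handling of `computeKnownBitsFromOperator`. The following is a minimal standalone sketch of that arithmetic, not the LLVM implementation: `StepInfo`, `trailingZeros`, and `inferRecurrenceAlign` are hypothetical names introduced only for illustration, and the sketch assumes a 64-bit pointer and a power-of-two start alignment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Count trailing zero bits of V, returning Width when V is zero.
static unsigned trailingZeros(uint64_t V, unsigned Width = 64) {
  unsigned N = 0;
  while (N < Width && ((V >> N) & 1) == 0)
    ++N;
  return N;
}

// Hypothetical model of the GEP index operand (not an LLVM type).
struct StepInfo {
  bool IsConstant;   // false models an unknown runtime offset
  uint64_t Value;    // byte offset, meaningful only if IsConstant
  unsigned BitWidth; // width of the index type, e.g. 64 for i64
};

// Alignment that holds for every value of a pointer recurrence
//   %iv      = phi ptr [ %start, %entry ], [ %iv.next, %loop ]
//   %iv.next = getelementptr i8, ptr %iv, <step>
// given the alignment of %start.
uint64_t inferRecurrenceAlign(uint64_t StartAlign, StepInfo Step,
                              unsigned PtrBits = 64) {
  assert(StartAlign != 0 && (StartAlign & (StartAlign - 1)) == 0 &&
         "start alignment must be a power of two");
  // Analogous to the bail-out in the patch: if the step's bit width
  // differs from the pointer's, nothing is inferred.
  if (Step.BitWidth != PtrBits)
    return 1;
  unsigned StartTZ = trailingZeros(StartAlign, PtrBits);
  // An unknown step contributes no known trailing zeros.
  unsigned StepTZ = Step.IsConstant ? trailingZeros(Step.Value, PtrBits) : 0;
  // Only the low bits that are zero in both start and step stay zero.
  return uint64_t(1) << std::min(StartTZ, StepTZ);
}
```

Under these assumptions the sketch agrees with the CHECK lines shown: a start aligned to 128 with an `i64` step of 64 gives 64, a step of 63 gives 1, an unknown step gives 1, and an `i16` step of 128 gives 1 because of the width mismatch.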
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 4
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 4
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 4
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_2(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_2
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 2
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 2
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 2
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_1(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_1
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 1
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 1
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_unknown_step(ptr align 128 %dst, i64 %off) {
+; CHECK-LABEL: define void @test_recur_i8_unknown_step
+; CHECK-SAME: (ptr align 128 [[DST:%.*]], i64 [[OFF:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 [[OFF]]
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 %off
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_step_known_multiple(ptr align 128 %dst, i64 %off) {
+; CHECK-LABEL: define void @test_recur_i8_step_known_multiple
+; CHECK-SAME: (ptr align 128 [[DST:%.*]], i64 [[OFF:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[UREM:%.*]] = urem i64 [[OFF]], 128
+; CHECK-NEXT: [[C_UREM:%.*]] = icmp eq i64 [[UREM]], 0
+; CHECK-NEXT: [[C_POS:%.*]] = icmp sge i64 [[OFF]], 0
+; CHECK-NEXT: [[AND:%.*]] = and i1 [[C_UREM]], [[C_POS]]
+; CHECK-NEXT: br i1 [[AND]], label [[LOOP:%.*]], label [[EXIT:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i64 [[OFF]]
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ %urem = urem i64 %off, 128
+ %c.urem = icmp eq i64 %urem, 0
+ %c.pos = icmp sge i64 %off, 0
+ %and = and i1 %c.urem, %c.pos
+ br i1 %and, label %loop, label %exit
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i64 %off
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_i16_128(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_i16_128
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: [[IV:%.*]] = phi ptr [ [[DST]], [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
+; CHECK-NEXT: store i64 0, ptr [[IV]], align 1
+; CHECK-NEXT: [[IV_NEXT]] = getelementptr inbounds i8, ptr [[IV]], i16 128
+; CHECK-NEXT: [[C:%.*]] = call i1 @cond()
+; CHECK-NEXT: br i1 [[C]], label [[LOOP]], label [[EXIT:%.*]]
+; CHECK: exit:
+; CHECK-NEXT: ret void
+;
+entry:
+ br label %loop
+
+loop:
+ %iv = phi ptr [ %dst, %entry ], [ %iv.next, %loop ]
+ store i64 0, ptr %iv, align 1
+ %iv.next = getelementptr inbounds i8, ptr %iv, i16 128
+ %c = call i1 @cond()
+ br i1 %c, label %loop, label %exit
+
+exit:
+ ret void
+}
+
+define void @test_recur_i8_i8_132(ptr align 128 %dst) {
+; CHECK-LABEL: define void @test_recur_i8_i8_132
+; CHECK-SAME: (ptr align 128 [[DST:%.*]]) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHEC...
[truncated]
   case Instruction::PHI: {
     const PHINode *P = cast<PHINode>(I);
-    BinaryOperator *BO = nullptr;
+    Instruction *BO = nullptr;
I think BO should be renamed. That's probably not worth it given the large amount of unrelated diffs it will create.
I could rename it as NFC after the change lands?
llvm-opt-benchmark results: dtcxzyw/llvm-opt-benchmark#1982. Stronger alignment in many cases.
a8e0e4a to
c8ea5c7
Compare
ping :)
artagnon
left a comment
Some nits.
It is weird that this patch blocks some constant folding :(
c8ea5c7 to
dfa86f2
Compare
Look through inttoptr (add (ptrtoint P), C) when accumulating offsets. Adds a missing fold after llvm#123518 Alive2 for the tests with changes: https://alive2.llvm.org/ce/z/VvPrzv
…124981) Look through inttoptr (add (ptrtoint P), C) when accumulating offsets. Adds a missing fold after #123518 Alive2 for the tests with changes: https://alive2.llvm.org/ce/z/VvPrzv PR: #124981
dfa86f2 to
cce1216
Compare
…ntOffsets (#124981) Look through inttoptr (add (ptrtoint P), C) when accumulating offsets. Adds a missing fold after llvm/llvm-project#123518 Alive2 for the tests with changes: https://alive2.llvm.org/ce/z/VvPrzv PR: llvm/llvm-project#124981
|
@fhahn Can you check this case https://gist.github.com/dtcxzyw/10abd2cd4d869ef6434625bfb0de6c46? Before this patch, |
Yes, I have a reproducer for this and will share a fix soon.
Allow looking through constant expressions. Constant expressions cannot read, modify or leak the global themselves. I might be missing something, but using analyzeGlobalAux should ensure all (instruction) users that may read, modify or leak the global are checked. This fixes another regression exposed by llvm#123518.
Put up #125205, but I am not sure if I am missing anything there.
Add some test coverage for GEP recurrences in ValueTracking, #123518.
Add some test coverage for GEP recurrences in ValueTracking, llvm/llvm-project#123518.
fhahn
left a comment
ping :)
Updated to apply on current main.
It looks like there are now no regressions for dtcxzyw/llvm-opt-benchmark#2723.
llvm/lib/Analysis/ValueTracking.cpp
Outdated
If the pointer width is different from the index width, the optimization will be disabled. Is there a real target satisfying the condition?
Yes, this happens in a lot of workloads in practice, e.g. with an index width of 64 bits and GEPs with i32 indices.
I think it is ok to fall through as the result is guarded by std::min(Idx.countMinTrailingZeros(), Ptr.countMinTrailingZeros()).
Yep, but unfortunately computeKnownBits has some assertions that the bitwidth of the operation matches the passed-in KnownBits.
We could operate on a suitable KnownBits object for the getelementptr and extend as needed as a follow-up, if there are any cases where this would help.
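The width guard discussed in this thread can be illustrated with a small standalone sketch. It does not use the LLVM API: `KnownLowZeros` and `recurrenceTrailingZeros` are hypothetical names introduced only for illustration, and the bail-out on mismatched widths is an assumption about the patch's behavior based on the comments above, not its actual code. This appears consistent with `test_recur_i8_i16_128`, where the `i16` index leaves the store at `align 1`.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for llvm::KnownBits: only the bit width and the
// number of low bits known to be zero are tracked.
struct KnownLowZeros {
  unsigned BitWidth;
  unsigned MinTrailingZeros;
};

// Combines the facts known about the recurrence start (Ptr) and step (Idx).
// Returns false without touching Result when the widths differ, which is the
// conservative choice described in the review. Otherwise every value of the
// recurrence has at least min(tz(Ptr), tz(Idx)) trailing zero bits.
inline bool recurrenceTrailingZeros(KnownLowZeros Ptr, KnownLowZeros Idx,
                                    unsigned &Result) {
  if (Ptr.BitWidth != Idx.BitWidth)
    return false;
  Result = std::min(Ptr.MinTrailingZeros, Idx.MinTrailingZeros);
  return true;
}
```

Calling it with a 64-bit start that has 7 known trailing zeros (128-byte aligned) and a 64-bit step with 5 (a multiple of 32) yields 5, i.e. 32-byte alignment; a 16-bit step is rejected rather than extended.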
dtcxzyw
left a comment
LGTM. Remember to rename the BO variables.
5b575cc to
c9626d2
Compare
fhahn
left a comment
Latest compile-time numbers are
stage1-O3: +0.02%
stage1-ReleaseThinLTO: +0.04%
stage1-ReleaseLTO-g: +0.05%
stage1-O0-g: -0.00%
stage1-aarch64-O3: +0.05%
stage2-O3: +0.03%
stage2-clang: +0.04%
Not sure if we are OK with that for this kind of change?
artagnon
left a comment
Kindly update the comment on line 9121/9128.
To really fix this problem, we should make InferAlignment use proper dataflow propagation; otherwise we'll still fail to infer alignments for any non-trivial loops.
Update matchSimpleRecurrence to also support GEPs. This allows inferring
larger alignments in a number of cases.
I noticed that we fail to infer alignments from calls when dropping
assumptions. Inferring alignment from assumptions uses SCEV, so if we drop
an assume for an aligned function return value, InferAlignment fails to
infer the better alignment without this patch.
For now, it is limited to cases where the source element type is i8.
It comes with a bit of a compile-time impact:
stage1-O3: +0.05%
stage1-ReleaseThinLTO: +0.04%
stage1-ReleaseLTO-g: +0.03%
stage1-O0-g: -0.04%
stage2-O3: +0.04%
stage2-O0-g: +0.02%
stage2-clang: +0.03%
https://llvm-compile-time-tracker.com/compare.php?from=a8c60790fd4f70a461113f0721bdb4a114ddf420&to=9a207c52e9c644691573a40ceb5b89a3c09ab609&stat=instructions:u
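
The alignment reasoning behind the description can be sketched in plain C++, independent of LLVM. For a pointer recurrence with a constant byte step, every value has the form Start + k * Step, so the alignment common to all iterations is the smaller of the start alignment and the largest power of two dividing the step. The helper names below are hypothetical and exist only for this illustration; the sketch assumes a non-negative constant step and a power-of-two start alignment.

```cpp
#include <cassert>
#include <cstdint>

// Largest power of two dividing X. Zero is divisible by every power of two,
// so it is mapped to the largest representable one (2^63) by convention.
inline uint64_t largestPow2Divisor(uint64_t X) {
  if (X == 0)
    return uint64_t(1) << 63;
  return X & (~X + 1); // isolates the lowest set bit
}

// Alignment shared by Start, Start + Step, Start + 2 * Step, ...
// StartAlign is assumed to be a power of two.
inline uint64_t recurrenceAlign(uint64_t StartAlign, uint64_t StepBytes) {
  uint64_t StepAlign = largestPow2Divisor(StepBytes);
  return StartAlign < StepAlign ? StartAlign : StepAlign;
}
```

With a start alignment of 128, a step of 32 gives 32 and a step of 63 gives 1, matching the `align 32` and `align 1` stores expected in `test_recur_i8_32` and `test_recur_i8_63`.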