-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[DA] Replace delinearization for fixed size array #161822
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
614ffa4
to
aa3e203
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the test result changes, and I believe it's probably ready to be merged. (It looks like the results have improved in some cases...)
;; The direction vector of `b` is [= * *]. We cannot interchange all the loops. | ||
|
||
; CHECK: Dependency matrix before interchange: | ||
; CHECK-NEXT: = * < | ||
; CHECK-NEXT: = * * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember I added this test in #139690. It should be sufficient to ensure no interchange occurs for this case.
; CHECK: --- !Missed | ||
; CHECK-NEXT: Pass: loop-interchange | ||
; CHECK-NEXT: Name: Dependence | ||
; CHECK-NEXT: Function: f | ||
; CHECK: --- !Missed | ||
; CHECK-NEXT: Pass: loop-interchange | ||
; CHECK-NEXT: Name: Dependence | ||
; CHECK-NEXT: Function: f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in #124901. Also enough to check no interchange happens, so I switched from checking remarks to using update_test_checks.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the remark output. Check lines from update_test_checks give no clue what is intended to be checked
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted. The output has changed, but it should be fine since this test ensures that no interchange happens.
; XFAIL: * | ||
; The transformation seems to have succeeded "accidentally". | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe conflicting with #156578. I will rebase after that PR is merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this test will be removed in #160924.
; CHECK-NEXT: da analyze - consistent output [0 0 0|<]! | ||
; CHECK-NEXT: da analyze - output [<= * *|<]! | ||
; CHECK-NEXT: Src: store i32 1, ptr %idx1, align 4 --> Dst: store i32 1, ptr %idx1, align 4 | ||
; CHECK-NEXT: da analyze - none! | ||
; | ||
; FIXME: the dependence distance is not constant. Distance vector should be [* * *|<]! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change means fixing a known issue.
; CHECK-NEXT: Src: store i64 %i.013, ptr %arrayidx12, align 8 --> Dst: store i64 %i.013, ptr %arrayidx12, align 8 | ||
; CHECK-NEXT: da analyze - none! | ||
; CHECK-NEXT: Src: store i64 %i.013, ptr %arrayidx12, align 8 --> Dst: store i64 %l17.04, ptr %arrayidx24, align 8 | ||
; CHECK-NEXT: da analyze - output [-4 -3]! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems regression, but less important, I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also will be removed in #160924.
51ce079
to
c7dc3af
Compare
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-llvm-transforms Author: Ryotaro Kasuga (kasuga-fj) ChangesReplace the delinearization that depends on type information in GEPs with what doesn't rely on it. This is WIP as the failing tests need to be updated. Full diff: https://github.com/llvm/llvm-project/pull/161822.diff 13 Files Affected:
diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 8d20b0e10305b..afc4cc9fdd516 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -3504,12 +3504,13 @@ bool DependenceInfo::tryDelinearizeFixedSize(
"expected src and dst scev unknowns to be equal");
});
- SmallVector<int, 4> SrcSizes;
- SmallVector<int, 4> DstSizes;
- if (!tryDelinearizeFixedSizeImpl(SE, Src, SrcAccessFn, SrcSubscripts,
- SrcSizes) ||
- !tryDelinearizeFixedSizeImpl(SE, Dst, DstAccessFn, DstSubscripts,
- DstSizes))
+ const SCEV *ElemSize = SE->getElementSize(Src);
+ assert(ElemSize == SE->getElementSize(Dst) && "Different element sizes");
+ SmallVector<const SCEV *, 4> SrcSizes, DstSizes;
+ if (!delinearizeFixedSizeArray(*SE, SE->removePointerBase(SrcAccessFn),
+ SrcSubscripts, SrcSizes, ElemSize) ||
+ !delinearizeFixedSizeArray(*SE, SE->removePointerBase(DstAccessFn),
+ DstSubscripts, DstSizes, ElemSize))
return false;
// Check that the two size arrays are non-empty and equal in length and
@@ -3535,7 +3536,7 @@ bool DependenceInfo::tryDelinearizeFixedSize(
// iff the subscripts are positive and are less than the range of the
// dimension.
if (!DisableDelinearizationChecks) {
- auto AllIndicesInRange = [&](SmallVector<int, 4> &DimensionSizes,
+ auto AllIndicesInRange = [&](ArrayRef<const SCEV *> DimensionSizes,
SmallVectorImpl<const SCEV *> &Subscripts,
Value *Ptr) {
size_t SSize = Subscripts.size();
@@ -3548,17 +3549,14 @@ bool DependenceInfo::tryDelinearizeFixedSize(
});
return false;
}
- if (auto *SType = dyn_cast<IntegerType>(S->getType())) {
- const SCEV *Range = SE->getConstant(
- ConstantInt::get(SType, DimensionSizes[I - 1], false));
- if (!isKnownLessThan(S, Range)) {
- LLVM_DEBUG({
- dbgs() << "Check failed: !isKnownLessThan(S, Range)\n";
- dbgs() << " S: " << *S << "\n"
- << " Range: " << *Range << "\n";
- });
- return false;
- }
+ const SCEV *Range = DimensionSizes[I - 1];
+ if (!isKnownLessThan(S, Range)) {
+ LLVM_DEBUG({
+ dbgs() << "Check failed: !isKnownLessThan(S, Range)\n";
+ dbgs() << " S: " << *S << "\n"
+ << " Range: " << *Range << "\n";
+ });
+ return false;
}
}
return true;
diff --git a/llvm/test/Analysis/DependenceAnalysis/Banerjee.ll b/llvm/test/Analysis/DependenceAnalysis/Banerjee.ll
index e0def901d1759..6dde8844c6040 100644
--- a/llvm/test/Analysis/DependenceAnalysis/Banerjee.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/Banerjee.ll
@@ -660,7 +660,7 @@ define void @banerjee7(ptr %A, ptr %B, i64 %m, i64 %n) nounwind uwtable ssp {
; DELIN-NEXT: Src: store i64 0, ptr %arrayidx, align 8 --> Dst: store i64 0, ptr %arrayidx, align 8
; DELIN-NEXT: da analyze - none!
; DELIN-NEXT: Src: store i64 0, ptr %arrayidx, align 8 --> Dst: %0 = load i64, ptr %arrayidx7, align 8
-; DELIN-NEXT: da analyze - flow [> <=]!
+; DELIN-NEXT: da analyze - consistent flow [-1 0]!
; DELIN-NEXT: Src: store i64 0, ptr %arrayidx, align 8 --> Dst: store i64 %0, ptr %B.addr.11, align 8
; DELIN-NEXT: da analyze - confused!
; DELIN-NEXT: Src: %0 = load i64, ptr %arrayidx7, align 8 --> Dst: %0 = load i64, ptr %arrayidx7, align 8
@@ -916,7 +916,7 @@ define void @banerjee10(ptr %A, ptr %B, i64 %m, i64 %n) nounwind uwtable ssp {
; DELIN-NEXT: Src: store i64 0, ptr %arrayidx, align 8 --> Dst: store i64 0, ptr %arrayidx, align 8
; DELIN-NEXT: da analyze - none!
; DELIN-NEXT: Src: store i64 0, ptr %arrayidx, align 8 --> Dst: %1 = load i64, ptr %arrayidx6, align 8
-; DELIN-NEXT: da analyze - flow [<> 0]!
+; DELIN-NEXT: da analyze - flow [-11 0]!
; DELIN-NEXT: Src: store i64 0, ptr %arrayidx, align 8 --> Dst: store i64 %1, ptr %B.addr.11, align 8
; DELIN-NEXT: da analyze - confused!
; DELIN-NEXT: Src: %1 = load i64, ptr %arrayidx6, align 8 --> Dst: %1 = load i64, ptr %arrayidx6, align 8
diff --git a/llvm/test/Analysis/DependenceAnalysis/Coupled.ll b/llvm/test/Analysis/DependenceAnalysis/Coupled.ll
index 1d4513429a83c..01c84c77393bc 100644
--- a/llvm/test/Analysis/DependenceAnalysis/Coupled.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/Coupled.ll
@@ -503,8 +503,7 @@ define void @couple11(ptr %A, ptr %B, i32 %n) nounwind uwtable ssp {
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: store i32 %conv, ptr %arrayidx2, align 4
; CHECK-NEXT: da analyze - none!
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: %0 = load i32, ptr %arrayidx4, align 4
-; CHECK-NEXT: da analyze - flow [0|<] splitable!
-; CHECK-NEXT: da analyze - split level = 1, iteration = 9!
+; CHECK-NEXT: da analyze - flow [0|<]!
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: store i32 %0, ptr %B.addr.01, align 4
; CHECK-NEXT: da analyze - confused!
; CHECK-NEXT: Src: %0 = load i32, ptr %arrayidx4, align 4 --> Dst: %0 = load i32, ptr %arrayidx4, align 4
@@ -548,8 +547,7 @@ define void @couple12(ptr %A, ptr %B, i32 %n) nounwind uwtable ssp {
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: store i32 %conv, ptr %arrayidx2, align 4
; CHECK-NEXT: da analyze - none!
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: %0 = load i32, ptr %arrayidx4, align 4
-; CHECK-NEXT: da analyze - flow [<] splitable!
-; CHECK-NEXT: da analyze - split level = 1, iteration = 11!
+; CHECK-NEXT: da analyze - flow [<]!
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: store i32 %0, ptr %B.addr.01, align 4
; CHECK-NEXT: da analyze - confused!
; CHECK-NEXT: Src: %0 = load i32, ptr %arrayidx4, align 4 --> Dst: %0 = load i32, ptr %arrayidx4, align 4
diff --git a/llvm/test/Analysis/DependenceAnalysis/DifferentOffsets.ll b/llvm/test/Analysis/DependenceAnalysis/DifferentOffsets.ll
index d9ccea55dd478..069a540ea0295 100644
--- a/llvm/test/Analysis/DependenceAnalysis/DifferentOffsets.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/DifferentOffsets.ll
@@ -149,11 +149,10 @@ define void @multidim_accesses(ptr %A) {
; CHECK-NEXT: Src: store i32 1, ptr %idx0, align 4 --> Dst: store i32 1, ptr %idx0, align 4
; CHECK-NEXT: da analyze - none!
; CHECK-NEXT: Src: store i32 1, ptr %idx0, align 4 --> Dst: store i32 1, ptr %idx1, align 4
-; CHECK-NEXT: da analyze - consistent output [0 0 0|<]!
+; CHECK-NEXT: da analyze - output [<= * *|<]!
; CHECK-NEXT: Src: store i32 1, ptr %idx1, align 4 --> Dst: store i32 1, ptr %idx1, align 4
; CHECK-NEXT: da analyze - none!
;
-; FIXME: the dependence distance is not constant. Distance vector should be [* * *|<]!
; for (i = 0; i < 256; i++)
; for (j = 0; j < 256; j++)
; for (k = 0; k < 256; k++) {
diff --git a/llvm/test/Analysis/DependenceAnalysis/Invariant.ll b/llvm/test/Analysis/DependenceAnalysis/Invariant.ll
index 1d8c51e475ae8..8a9d0a82eb27b 100644
--- a/llvm/test/Analysis/DependenceAnalysis/Invariant.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/Invariant.ll
@@ -2,6 +2,9 @@
; RUN: opt < %s -disable-output "-passes=print<da>" -aa-pipeline=basic-aa 2>&1 \
; RUN: | FileCheck %s
+; XFAIL: *
+; Currently fails since delinearization doesn't work as expected.
+
; Test for a bug, which caused an assert when an invalid
; SCEVAddRecExpr is created in addToCoefficient.
diff --git a/llvm/test/Analysis/DependenceAnalysis/NonCanonicalizedSubscript.ll b/llvm/test/Analysis/DependenceAnalysis/NonCanonicalizedSubscript.ll
index e5d5d21e365a1..eb832747e366e 100644
--- a/llvm/test/Analysis/DependenceAnalysis/NonCanonicalizedSubscript.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/NonCanonicalizedSubscript.ll
@@ -47,6 +47,8 @@ for.end:
; }
; }
; Extends the previous example to coupled MIV subscripts.
+;
+; FIXME: Currently delinearization does not work as expected.
@a = global [10004 x [10004 x i32]] zeroinitializer, align 16
@@ -57,7 +59,7 @@ define void @coupled_miv_type_mismatch(i32 %n) #0 {
; CHECK-NEXT: Src: %2 = load i32, ptr %arrayidx5, align 4 --> Dst: %2 = load i32, ptr %arrayidx5, align 4
; CHECK-NEXT: da analyze - none!
; CHECK-NEXT: Src: %2 = load i32, ptr %arrayidx5, align 4 --> Dst: store i32 %add6, ptr %arrayidx10, align 4
-; CHECK-NEXT: da analyze - consistent anti [1 -2]!
+; CHECK-NEXT: da analyze - anti [< >]!
; CHECK-NEXT: Src: store i32 %add6, ptr %arrayidx10, align 4 --> Dst: store i32 %add6, ptr %arrayidx10, align 4
; CHECK-NEXT: da analyze - none!
;
diff --git a/llvm/test/Analysis/DependenceAnalysis/PR51512.ll b/llvm/test/Analysis/DependenceAnalysis/PR51512.ll
index 9bee38c6c00ef..2d1638d145ffe 100644
--- a/llvm/test/Analysis/DependenceAnalysis/PR51512.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/PR51512.ll
@@ -10,7 +10,7 @@ define void @foo() {
; CHECK-NEXT: Src: store i32 42, ptr %getelementptr, align 1 --> Dst: store i32 42, ptr %getelementptr, align 1
; CHECK-NEXT: da analyze - consistent output [0 S]!
; CHECK-NEXT: Src: store i32 42, ptr %getelementptr, align 1 --> Dst: store i32 0, ptr %getelementptr5, align 1
-; CHECK-NEXT: da analyze - output [0 *|<]!
+; CHECK-NEXT: da analyze - output [0 <=|<]!
; CHECK-NEXT: Src: store i32 0, ptr %getelementptr5, align 1 --> Dst: store i32 0, ptr %getelementptr5, align 1
; CHECK-NEXT: da analyze - none!
;
diff --git a/llvm/test/Analysis/DependenceAnalysis/Propagating.ll b/llvm/test/Analysis/DependenceAnalysis/Propagating.ll
index 866f515baeafb..09598f43c7c7d 100644
--- a/llvm/test/Analysis/DependenceAnalysis/Propagating.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/Propagating.ll
@@ -437,7 +437,8 @@ define void @prop7(ptr %A, ptr %B, i32 %n) nounwind uwtable ssp {
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx7, align 4 --> Dst: store i32 %conv, ptr %arrayidx7, align 4
; CHECK-NEXT: da analyze - none!
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx7, align 4 --> Dst: %0 = load i32, ptr %arrayidx13, align 4
-; CHECK-NEXT: da analyze - flow [* <>]!
+; CHECK-NEXT: da analyze - flow [* -38] splitable!
+; CHECK-NEXT: da analyze - split level = 1, iteration = 4!
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx7, align 4 --> Dst: store i32 %0, ptr %B.addr.11, align 4
; CHECK-NEXT: da analyze - confused!
; CHECK-NEXT: Src: %0 = load i32, ptr %arrayidx13, align 4 --> Dst: %0 = load i32, ptr %arrayidx13, align 4
diff --git a/llvm/test/Analysis/DependenceAnalysis/SameSDLoops.ll b/llvm/test/Analysis/DependenceAnalysis/SameSDLoops.ll
index 57962e01de2b4..5a51c748a344e 100644
--- a/llvm/test/Analysis/DependenceAnalysis/SameSDLoops.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/SameSDLoops.ll
@@ -148,7 +148,7 @@ define void @non_samebd0(ptr %A) nounwind uwtable ssp {
; CHECK-NEXT: Src: store i64 %i.013, ptr %arrayidx12, align 8 --> Dst: store i64 %i.013, ptr %arrayidx12, align 8
; CHECK-NEXT: da analyze - none!
; CHECK-NEXT: Src: store i64 %i.013, ptr %arrayidx12, align 8 --> Dst: store i64 %l17.04, ptr %arrayidx24, align 8
-; CHECK-NEXT: da analyze - output [-4 -3]!
+; CHECK-NEXT: da analyze - output [> *]!
; CHECK-NEXT: Src: store i64 %l17.04, ptr %arrayidx24, align 8 --> Dst: store i64 %l17.04, ptr %arrayidx24, align 8
; CHECK-NEXT: da analyze - none!
;
diff --git a/llvm/test/Analysis/DependenceAnalysis/Separability.ll b/llvm/test/Analysis/DependenceAnalysis/Separability.ll
index 2ed9cca4d1fc0..18e5da407e1aa 100644
--- a/llvm/test/Analysis/DependenceAnalysis/Separability.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/Separability.ll
@@ -182,7 +182,7 @@ define void @sep2(ptr %A, ptr %B, i32 %n) nounwind uwtable ssp {
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx12, align 4 --> Dst: store i32 %conv, ptr %arrayidx12, align 4
; CHECK-NEXT: da analyze - consistent output [0 S 0 0]!
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx12, align 4 --> Dst: %0 = load i32, ptr %arrayidx19, align 4
-; CHECK-NEXT: da analyze - flow [> * * -10]!
+; CHECK-NEXT: da analyze - flow [* * * <>]!
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx12, align 4 --> Dst: store i32 %0, ptr %B.addr.31, align 4
; CHECK-NEXT: da analyze - confused!
; CHECK-NEXT: Src: %0 = load i32, ptr %arrayidx19, align 4 --> Dst: %0 = load i32, ptr %arrayidx19, align 4
@@ -262,9 +262,9 @@ for.end28: ; preds = %for.inc26
define void @sep3(ptr %A, ptr %B, i32 %n) nounwind uwtable ssp {
; CHECK-LABEL: 'sep3'
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx13, align 4 --> Dst: store i32 %conv, ptr %arrayidx13, align 4
-; CHECK-NEXT: da analyze - consistent output [0 S 0 0]!
+; CHECK-NEXT: da analyze - output [0 S 0 0]!
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx13, align 4 --> Dst: %0 = load i32, ptr %arrayidx20, align 4
-; CHECK-NEXT: da analyze - flow [> * * *]!
+; CHECK-NEXT: da analyze - flow [* * * *|<]!
; CHECK-NEXT: Src: store i32 %conv, ptr %arrayidx13, align 4 --> Dst: store i32 %0, ptr %B.addr.31, align 4
; CHECK-NEXT: da analyze - confused!
; CHECK-NEXT: Src: %0 = load i32, ptr %arrayidx20, align 4 --> Dst: %0 = load i32, ptr %arrayidx20, align 4
diff --git a/llvm/test/Transforms/LoopInterchange/legality-check.ll b/llvm/test/Transforms/LoopInterchange/legality-check.ll
index c7f63d5968e62..fd0e08d3fae24 100644
--- a/llvm/test/Transforms/LoopInterchange/legality-check.ll
+++ b/llvm/test/Transforms/LoopInterchange/legality-check.ll
@@ -149,10 +149,10 @@ exit:
;; for (int k = 0; k < 19; k++)
;; b[i][j][k] = b[i][5][k + 1];
;;
-;; The direction vector of `b` is [= * <]. We cannot interchange all the loops.
+;; The direction vector of `b` is [= * *]. We cannot interchange all the loops.
; CHECK: Dependency matrix before interchange:
-; CHECK-NEXT: = * <
+; CHECK-NEXT: = * *
; CHECK-NEXT: Processing InnerLoopId = 2 and OuterLoopId = 1
; CHECK-NEXT: Failed interchange InnerLoopId = 2 and OuterLoopId = 1 due to dependence
; CHECK-NEXT: Not interchanging loops. Cannot prove legality.
diff --git a/llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll b/llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll
index c17d78f7cfce6..a3a75ab6ec739 100644
--- a/llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll
+++ b/llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll
@@ -1,6 +1,6 @@
-; RUN: opt < %s -passes=loop-interchange -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t \
-; RUN: -verify-dom-info -verify-loop-info -verify-loop-lcssa
-; RUN: FileCheck --input-file=%t %s
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
+; RUN: opt < %s -passes=loop-interchange \
+; RUN: -verify-dom-info -verify-loop-info -verify-loop-lcssa -S 2>&1 | FileCheck %s
;; The original code:
;;
@@ -15,19 +15,51 @@
;;
;; The entry of the direction vector for the outermost loop is `DVEntry::LE`.
;; We need to treat this as `*`, not `<`. See issue #123920 for details.
-
-; CHECK: --- !Missed
-; CHECK-NEXT: Pass: loop-interchange
-; CHECK-NEXT: Name: Dependence
-; CHECK-NEXT: Function: f
-; CHECK: --- !Missed
-; CHECK-NEXT: Pass: loop-interchange
-; CHECK-NEXT: Name: Dependence
-; CHECK-NEXT: Function: f
+;; In conclusion, we must not interchange the loops.
@a = dso_local global [16 x [16 x [16 x i32]]] zeroinitializer, align 4
define dso_local void @f() {
+; CHECK-LABEL: define dso_local void @f() {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: br label %[[FOR_COND1_PREHEADER:.*]]
+; CHECK: [[FOR_COND1_PREHEADER]]:
+; CHECK-NEXT: [[I_039:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[INC26:%.*]], %[[FOR_COND_CLEANUP3:.*]] ]
+; CHECK-NEXT: [[SUB:%.*]] = add nuw nsw i32 [[I_039]], 3
+; CHECK-NEXT: [[IDXPROM:%.*]] = zext nneg i32 [[SUB]] to i64
+; CHECK-NEXT: [[MUL:%.*]] = shl nuw nsw i32 [[I_039]], 1
+; CHECK-NEXT: [[IDXPROM13:%.*]] = zext nneg i32 [[MUL]] to i64
+; CHECK-NEXT: br label %[[FOR_COND5_PREHEADER:.*]]
+; CHECK: [[FOR_COND_CLEANUP:.*]]:
+; CHECK-NEXT: ret void
+; CHECK: [[FOR_COND5_PREHEADER]]:
+; CHECK-NEXT: [[J_038:%.*]] = phi i32 [ 1, %[[FOR_COND1_PREHEADER]] ], [ [[INC23:%.*]], %[[FOR_COND_CLEANUP7:.*]] ]
+; CHECK-NEXT: [[IDXPROM11:%.*]] = zext nneg i32 [[J_038]] to i64
+; CHECK-NEXT: [[SUB18:%.*]] = add nsw i32 [[J_038]], -1
+; CHECK-NEXT: [[IDXPROM19:%.*]] = sext i32 [[SUB18]] to i64
+; CHECK-NEXT: br label %[[FOR_BODY8:.*]]
+; CHECK: [[FOR_COND_CLEANUP3]]:
+; CHECK-NEXT: [[INC26]] = add nuw nsw i32 [[I_039]], 1
+; CHECK-NEXT: [[CMP:%.*]] = icmp samesign ult i32 [[I_039]], 3
+; CHECK-NEXT: br i1 [[CMP]], label %[[FOR_COND1_PREHEADER]], label %[[FOR_COND_CLEANUP]]
+; CHECK: [[FOR_COND_CLEANUP7]]:
+; CHECK-NEXT: [[INC23]] = add nuw nsw i32 [[J_038]], 1
+; CHECK-NEXT: [[CMP2:%.*]] = icmp samesign ult i32 [[J_038]], 7
+; CHECK-NEXT: br i1 [[CMP2]], label %[[FOR_COND5_PREHEADER]], label %[[FOR_COND_CLEANUP3]]
+; CHECK: [[FOR_BODY8]]:
+; CHECK-NEXT: [[K_037:%.*]] = phi i32 [ 1, %[[FOR_COND5_PREHEADER]] ], [ [[ADD15:%.*]], %[[FOR_BODY8]] ]
+; CHECK-NEXT: [[IDXPROM9:%.*]] = zext nneg i32 [[K_037]] to i64
+; CHECK-NEXT: [[ARRAYIDX12:%.*]] = getelementptr inbounds nuw [16 x [16 x [16 x i32]]], ptr @a, i64 0, i64 [[IDXPROM]], i64 [[IDXPROM9]], i64 [[IDXPROM11]]
+; CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[ARRAYIDX12]], align 4
+; CHECK-NEXT: [[ADD15]] = add nuw nsw i32 [[K_037]], 1
+; CHECK-NEXT: [[IDXPROM16:%.*]] = zext nneg i32 [[ADD15]] to i64
+; CHECK-NEXT: [[ARRAYIDX20:%.*]] = getelementptr inbounds [16 x [16 x [16 x i32]]], ptr @a, i64 0, i64 [[IDXPROM13]], i64 [[IDXPROM16]], i64 [[IDXPROM19]]
+; CHECK-NEXT: [[TMP1:%.*]] = load i32, ptr [[ARRAYIDX20]], align 4
+; CHECK-NEXT: [[SUB21:%.*]] = sub nsw i32 [[TMP1]], [[TMP0]]
+; CHECK-NEXT: store i32 [[SUB21]], ptr [[ARRAYIDX20]], align 4
+; CHECK-NEXT: [[CMP6:%.*]] = icmp samesign ult i32 [[K_037]], 7
+; CHECK-NEXT: br i1 [[CMP6]], label %[[FOR_BODY8]], label %[[FOR_COND_CLEANUP7]]
+;
entry:
br label %for.cond1.preheader
diff --git a/llvm/test/Transforms/LoopUnrollAndJam/dependencies_multidims.ll b/llvm/test/Transforms/LoopUnrollAndJam/dependencies_multidims.ll
index b95bbddf11d65..2867c6c0652e9 100644
--- a/llvm/test/Transforms/LoopUnrollAndJam/dependencies_multidims.ll
+++ b/llvm/test/Transforms/LoopUnrollAndJam/dependencies_multidims.ll
@@ -3,6 +3,9 @@
target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
+; XFAIL: *
+; The transformation seems to have succeeded "accidentally".
+
; CHECK-LABEL: sub_sub_less
; CHECK: %j = phi
; CHECK-NOT: %j.1 = phi
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Looking at the code, looks weirtd that we even have delinearizeFixedSizeArray
and tryDelinearizeFixedSizeImpl
which derive the same information, just using different heuristics. Should have been one functions that applies any heuristic that applies.
; XFAIL: * | ||
; Currently fails since delinearization doesn't work as expected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know what's the issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the inconsistency in the estimated array sizes between rr[i][j]
and rr[j][j]
; the former is estimated as a 2D array (same as the pseudo code), while the latter is inferred as a 1D array (like rr[40*j + j]
). To address this, I think we need to consider the "compatibility" of the array sizes.
Added the comment to describe the above point.
; CHECK: --- !Missed | ||
; CHECK-NEXT: Pass: loop-interchange | ||
; CHECK-NEXT: Name: Dependence | ||
; CHECK-NEXT: Function: f | ||
; CHECK: --- !Missed | ||
; CHECK-NEXT: Pass: loop-interchange | ||
; CHECK-NEXT: Name: Dependence | ||
; CHECK-NEXT: Function: f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer the remark output. Check lines from update_test_checks give no clue what is intended to be checked
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the code, looks weirtd that we even have
delinearizeFixedSizeArray
andtryDelinearizeFixedSizeImpl
which derive the same information, just using different heuristics. Should have been one functions that applies any heuristic that applies.
Correct, there are two functions which provide the same functionality. I think we need to remove tryDelinearizeFixedSizeImpl
(more precisely, getIndexExpressionsFromGEP
and the functions that depend on it) since it uses type information in GEPs to drive the heuristic. I believe tryDelinearizeFixedSizeImpl
can be easily replaced because there's only one other user (LoopCacheAnalysis), which I’m planning to do myself. Removing getIndexExpressionsFromGEP
is a bit tricky since Polly also uses it...
; XFAIL: * | ||
; Currently fails since delinearization doesn't work as expected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the inconsistency in the estimated array sizes between rr[i][j]
and rr[j][j]
; the former is estimated as a 2D array (same as the pseudo code), while the latter is inferred as a 1D array (like rr[40*j + j]
). To address this, I think we need to consider the "compatibility" of the array sizes.
Added the comment to describe the above point.
; CHECK: --- !Missed | ||
; CHECK-NEXT: Pass: loop-interchange | ||
; CHECK-NEXT: Name: Dependence | ||
; CHECK-NEXT: Function: f | ||
; CHECK: --- !Missed | ||
; CHECK-NEXT: Pass: loop-interchange | ||
; CHECK-NEXT: Name: Dependence | ||
; CHECK-NEXT: Function: f |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted. The output has changed, but it should be fine since this test ensures that no interchange happens.
I can take care of that. The work there is that 49 tests assume GEP delinierization |
Great, thanks. The function corresponding to |
That was just the approach taken before delineraization was introduced. The math behind Polly does not particularly need delinearization for fixed-size arrays, but makes it easier to think about test cases if the analysis looks like the code was written. At the time there was nothing speaking against using GEPs if available. It also may allow excluding some aliasing due to range limits. E.g. |
This patch replaces the delinearization function used in DA, switching from one that depends on type information in GEPs to one that does not. There are three types of changes in regression tests: improvements, degradations, and degradations that the related features will be removed. Since there were very few cases that are classified into the second category, I believe the impact of this change should be practically insignificant.