Conversation

kasuga-fj
Contributor

@kasuga-fj kasuga-fj commented Oct 3, 2025

This patch replaces the delinearization function used in DA, switching from one that depends on type information in GEPs to one that does not. The regression test changes fall into three categories: improvements, degradations, and degradations in features that are going to be removed anyway. Since very few cases fall into the second category, I believe the practical impact of this change is insignificant.
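
To make the term concrete: delinearization recovers per-dimension subscripts from a flattened offset. DA does this on symbolic SCEV expressions; the sketch below only uses plain integers and assumes the inner dimension sizes are already known. `delinearizeFlatIndex` is a hypothetical helper for illustration, not an LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an LLVM API): split a flat element index into
// per-dimension subscripts, given the sizes of every dimension except the
// outermost one. For `int A[][8][16]`, innerSizes would be {8, 16}.
std::vector<int64_t> delinearizeFlatIndex(int64_t flatIndex,
                                          const std::vector<int64_t> &innerSizes) {
  assert(flatIndex >= 0 && "sketch only handles non-negative indices");
  std::vector<int64_t> subscripts(innerSizes.size() + 1);
  // Peel off the innermost subscript first, then move outwards.
  for (std::size_t dim = innerSizes.size(); dim > 0; --dim) {
    int64_t size = innerSizes[dim - 1];
    assert(size > 0 && "dimension sizes must be positive");
    subscripts[dim] = flatIndex % size;
    flatIndex /= size;
  }
  subscripts[0] = flatIndex; // Whatever remains is the outermost subscript.
  return subscripts;
}
```

For example, with inner sizes {8, 16}, the flat index 2*8*16 + 3*16 + 5 maps back to the subscripts (2, 3, 5).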

Contributor Author

@kasuga-fj kasuga-fj left a comment

I checked the test result changes, and I believe it's probably ready to be merged. (It looks like the results have improved in some cases...)

Comment on lines +152 to +155
;; The direction vector of `b` is [= * *]. We cannot interchange all the loops.

; CHECK: Dependency matrix before interchange:
; CHECK-NEXT: = * <
; CHECK-NEXT: = * *
Contributor Author

I remember I added this test in #139690. It should be sufficient to ensure no interchange occurs for this case.
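
A conservative reading of direction vectors like the one in this test can be sketched as follows. This is a simplified illustration, not LoopInterchange's actual code; `isProvablyForward` and `canInterchange` are hypothetical names. The sketch treats `*` as possibly `>`, so a vector whose first non-`=` entry is `*` is rejected both before and after swapping two loop levels.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical sketch. A direction vector is written as a string with one
// character per loop level: '<', '=', '>' or '*'.
// Returns true only if the first non-'=' entry is '<' (or all entries are '=').
bool isProvablyForward(const std::string &dv) {
  for (char d : dv) {
    if (d == '<')
      return true;
    if (d == '>' || d == '*') // '*' may hide a '>', so be conservative.
      return false;
    // '=' carries no direction at this level; keep scanning inwards.
  }
  return true; // All '=': a loop-independent dependence.
}

// Interchange of two levels is accepted only if the vector is provably
// forward both before and after swapping the corresponding entries.
bool canInterchange(std::string dv, std::size_t outer, std::size_t inner) {
  if (!isProvablyForward(dv))
    return false;
  std::swap(dv[outer], dv[inner]);
  return isProvablyForward(dv);
}
```

Under this sketch, both `= * <` and `= * *` reject interchanging levels 1 and 2, which is consistent with the test still expecting no interchange after the direction vector changed.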

Comment on lines 19 to 26
; CHECK: --- !Missed
; CHECK-NEXT: Pass: loop-interchange
; CHECK-NEXT: Name: Dependence
; CHECK-NEXT: Function: f
; CHECK: --- !Missed
; CHECK-NEXT: Pass: loop-interchange
; CHECK-NEXT: Name: Dependence
; CHECK-NEXT: Function: f
Contributor Author

Added in #124901. Here it is also enough to check that no interchange happens, so I switched from checking remarks to using update_test_checks.py.

Member

I prefer the remark output. Check lines from update_test_checks give no clue about what is intended to be checked.

Contributor Author

Reverted. The output has changed, but it should be fine since this test ensures that no interchange happens.

Comment on lines +6 to +8
; XFAIL: *
; The transformation seems to have succeeded "accidentally".

Contributor Author

This may conflict with #156578. I will rebase after that PR is merged.

Contributor Author

I believe this test will be removed in #160924.

Comment on lines -152 to -156
; CHECK-NEXT: da analyze - consistent output [0 0 0|<]!
; CHECK-NEXT: da analyze - output [<= * *|<]!
; CHECK-NEXT: Src: store i32 1, ptr %idx1, align 4 --> Dst: store i32 1, ptr %idx1, align 4
; CHECK-NEXT: da analyze - none!
;
; FIXME: the dependence distance is not constant. Distance vector should be [* * *|<]!
Contributor Author

This change fixes a known issue.

; CHECK-NEXT: Src: store i64 %i.013, ptr %arrayidx12, align 8 --> Dst: store i64 %i.013, ptr %arrayidx12, align 8
; CHECK-NEXT: da analyze - none!
; CHECK-NEXT: Src: store i64 %i.013, ptr %arrayidx12, align 8 --> Dst: store i64 %l17.04, ptr %arrayidx24, align 8
; CHECK-NEXT: da analyze - output [-4 -3]!
Contributor Author

This looks like a regression, but I believe it is a minor one.

Contributor Author

This also will be removed in #160924.

@kasuga-fj kasuga-fj marked this pull request as ready for review October 14, 2025 12:17
@kasuga-fj kasuga-fj force-pushed the da-replace-delinearize branch from 51ce079 to c7dc3af Compare October 14, 2025 12:41
@llvmbot llvmbot added the llvm:analysis and llvm:transforms labels Oct 14, 2025
@llvmbot
Member

llvmbot commented Oct 14, 2025

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Ryotaro Kasuga (kasuga-fj)

Changes

Replace the delinearization that depends on type information in GEPs with one that doesn't rely on it.

This is WIP as the failing tests need to be updated.


Full diff: https://github.com/llvm/llvm-project/pull/161822.diff

13 Files Affected:

  • (modified) llvm/lib/Analysis/DependenceAnalysis.cpp (+16-18)
  • (modified) llvm/test/Analysis/DependenceAnalysis/Banerjee.ll (+2-2)
  • (modified) llvm/test/Analysis/DependenceAnalysis/Coupled.ll (+2-4)
  • (modified) llvm/test/Analysis/DependenceAnalysis/DifferentOffsets.ll (+1-2)
  • (modified) llvm/test/Analysis/DependenceAnalysis/Invariant.ll (+3)
  • (modified) llvm/test/Analysis/DependenceAnalysis/NonCanonicalizedSubscript.ll (+3-1)
  • (modified) llvm/test/Analysis/DependenceAnalysis/PR51512.ll (+1-1)
  • (modified) llvm/test/Analysis/DependenceAnalysis/Propagating.ll (+2-1)
  • (modified) llvm/test/Analysis/DependenceAnalysis/SameSDLoops.ll (+1-1)
  • (modified) llvm/test/Analysis/DependenceAnalysis/Separability.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopInterchange/legality-check.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll (+44-12)
  • (modified) llvm/test/Transforms/LoopUnrollAndJam/dependencies_multidims.ll (+3)
diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp
index 8d20b0e10305b..afc4cc9fdd516 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -3504,12 +3504,13 @@ bool DependenceInfo::tryDelinearizeFixedSize(
            "expected src and dst scev unknowns to be equal");
   });
 
-  SmallVector<int, 4> SrcSizes;
-  SmallVector<int, 4> DstSizes;
-  if (!tryDelinearizeFixedSizeImpl(SE, Src, SrcAccessFn, SrcSubscripts,
-                                   SrcSizes) ||
-      !tryDelinearizeFixedSizeImpl(SE, Dst, DstAccessFn, DstSubscripts,
-                                   DstSizes))
+  const SCEV *ElemSize = SE->getElementSize(Src);
+  assert(ElemSize == SE->getElementSize(Dst) && "Different element sizes");
+  SmallVector<const SCEV *, 4> SrcSizes, DstSizes;
+  if (!delinearizeFixedSizeArray(*SE, SE->removePointerBase(SrcAccessFn),
+                                 SrcSubscripts, SrcSizes, ElemSize) ||
+      !delinearizeFixedSizeArray(*SE, SE->removePointerBase(DstAccessFn),
+                                 DstSubscripts, DstSizes, ElemSize))
     return false;
 
   // Check that the two size arrays are non-empty and equal in length and
@@ -3535,7 +3536,7 @@ bool DependenceInfo::tryDelinearizeFixedSize(
   // iff the subscripts are positive and are less than the range of the
   // dimension.
   if (!DisableDelinearizationChecks) {
-    auto AllIndicesInRange = [&](SmallVector<int, 4> &DimensionSizes,
+    auto AllIndicesInRange = [&](ArrayRef<const SCEV *> DimensionSizes,
                                  SmallVectorImpl<const SCEV *> &Subscripts,
                                  Value *Ptr) {
       size_t SSize = Subscripts.size();
@@ -3548,17 +3549,14 @@ bool DependenceInfo::tryDelinearizeFixedSize(
           });
           return false;
         }
-        if (auto *SType = dyn_cast<IntegerType>(S->getType())) {
-          const SCEV *Range = SE->getConstant(
-              ConstantInt::get(SType, DimensionSizes[I - 1], false));
-          if (!isKnownLessThan(S, Range)) {
-            LLVM_DEBUG({
-              dbgs() << "Check failed: !isKnownLessThan(S, Range)\n";
-              dbgs() << "  S: " << *S << "\n"
-                     << "  Range: " << *Range << "\n";
-            });
-            return false;
-          }
+        const SCEV *Range = DimensionSizes[I - 1];
+        if (!isKnownLessThan(S, Range)) {
+          LLVM_DEBUG({
+            dbgs() << "Check failed: !isKnownLessThan(S, Range)\n";
+            dbgs() << "  S: " << *S << "\n"
+                   << "  Range: " << *Range << "\n";
+          });
+          return false;
         }
       }
       return true;
diff --git a/llvm/test/Analysis/DependenceAnalysis/Banerjee.ll b/llvm/test/Analysis/DependenceAnalysis/Banerjee.ll
index e0def901d1759..6dde8844c6040 100644
--- a/llvm/test/Analysis/DependenceAnalysis/Banerjee.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/Banerjee.ll
@@ -660,7 +660,7 @@ define void @banerjee7(ptr %A, ptr %B, i64 %m, i64 %n) nounwind uwtable ssp {
 ; DELIN-NEXT:  Src: store i64 0, ptr %arrayidx, align 8 --> Dst: store i64 0, ptr %arrayidx, align 8
 ; DELIN-NEXT:    da analyze - none!
 ; DELIN-NEXT:  Src: store i64 0, ptr %arrayidx, align 8 --> Dst: %0 = load i64, ptr %arrayidx7, align 8
-; DELIN-NEXT:    da analyze - flow [> <=]!
+; DELIN-NEXT:    da analyze - consistent flow [-1 0]!
 ; DELIN-NEXT:  Src: store i64 0, ptr %arrayidx, align 8 --> Dst: store i64 %0, ptr %B.addr.11, align 8
 ; DELIN-NEXT:    da analyze - confused!
 ; DELIN-NEXT:  Src: %0 = load i64, ptr %arrayidx7, align 8 --> Dst: %0 = load i64, ptr %arrayidx7, align 8
@@ -916,7 +916,7 @@ define void @banerjee10(ptr %A, ptr %B, i64 %m, i64 %n) nounwind uwtable ssp {
 ; DELIN-NEXT:  Src: store i64 0, ptr %arrayidx, align 8 --> Dst: store i64 0, ptr %arrayidx, align 8
 ; DELIN-NEXT:    da analyze - none!
 ; DELIN-NEXT:  Src: store i64 0, ptr %arrayidx, align 8 --> Dst: %1 = load i64, ptr %arrayidx6, align 8
-; DELIN-NEXT:    da analyze - flow [<> 0]!
+; DELIN-NEXT:    da analyze - flow [-11 0]!
 ; DELIN-NEXT:  Src: store i64 0, ptr %arrayidx, align 8 --> Dst: store i64 %1, ptr %B.addr.11, align 8
 ; DELIN-NEXT:    da analyze - confused!
 ; DELIN-NEXT:  Src: %1 = load i64, ptr %arrayidx6, align 8 --> Dst: %1 = load i64, ptr %arrayidx6, align 8
diff --git a/llvm/test/Analysis/DependenceAnalysis/Coupled.ll b/llvm/test/Analysis/DependenceAnalysis/Coupled.ll
index 1d4513429a83c..01c84c77393bc 100644
--- a/llvm/test/Analysis/DependenceAnalysis/Coupled.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/Coupled.ll
@@ -503,8 +503,7 @@ define void @couple11(ptr %A, ptr %B, i32 %n) nounwind uwtable ssp {
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: store i32 %conv, ptr %arrayidx2, align 4
 ; CHECK-NEXT:    da analyze - none!
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: %0 = load i32, ptr %arrayidx4, align 4
-; CHECK-NEXT:    da analyze - flow [0|<] splitable!
-; CHECK-NEXT:    da analyze - split level = 1, iteration = 9!
+; CHECK-NEXT:    da analyze - flow [0|<]!
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: store i32 %0, ptr %B.addr.01, align 4
 ; CHECK-NEXT:    da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load i32, ptr %arrayidx4, align 4 --> Dst: %0 = load i32, ptr %arrayidx4, align 4
@@ -548,8 +547,7 @@ define void @couple12(ptr %A, ptr %B, i32 %n) nounwind uwtable ssp {
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: store i32 %conv, ptr %arrayidx2, align 4
 ; CHECK-NEXT:    da analyze - none!
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: %0 = load i32, ptr %arrayidx4, align 4
-; CHECK-NEXT:    da analyze - flow [<] splitable!
-; CHECK-NEXT:    da analyze - split level = 1, iteration = 11!
+; CHECK-NEXT:    da analyze - flow [<]!
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx2, align 4 --> Dst: store i32 %0, ptr %B.addr.01, align 4
 ; CHECK-NEXT:    da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load i32, ptr %arrayidx4, align 4 --> Dst: %0 = load i32, ptr %arrayidx4, align 4
diff --git a/llvm/test/Analysis/DependenceAnalysis/DifferentOffsets.ll b/llvm/test/Analysis/DependenceAnalysis/DifferentOffsets.ll
index d9ccea55dd478..069a540ea0295 100644
--- a/llvm/test/Analysis/DependenceAnalysis/DifferentOffsets.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/DifferentOffsets.ll
@@ -149,11 +149,10 @@ define void @multidim_accesses(ptr %A) {
 ; CHECK-NEXT:  Src: store i32 1, ptr %idx0, align 4 --> Dst: store i32 1, ptr %idx0, align 4
 ; CHECK-NEXT:    da analyze - none!
 ; CHECK-NEXT:  Src: store i32 1, ptr %idx0, align 4 --> Dst: store i32 1, ptr %idx1, align 4
-; CHECK-NEXT:    da analyze - consistent output [0 0 0|<]!
+; CHECK-NEXT:    da analyze - output [<= * *|<]!
 ; CHECK-NEXT:  Src: store i32 1, ptr %idx1, align 4 --> Dst: store i32 1, ptr %idx1, align 4
 ; CHECK-NEXT:    da analyze - none!
 ;
-; FIXME: the dependence distance is not constant. Distance vector should be [* * *|<]!
 ; for (i = 0; i < 256; i++)
 ;   for (j = 0; j < 256; j++)
 ;      for (k = 0; k < 256; k++) {
diff --git a/llvm/test/Analysis/DependenceAnalysis/Invariant.ll b/llvm/test/Analysis/DependenceAnalysis/Invariant.ll
index 1d8c51e475ae8..8a9d0a82eb27b 100644
--- a/llvm/test/Analysis/DependenceAnalysis/Invariant.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/Invariant.ll
@@ -2,6 +2,9 @@
 ; RUN: opt < %s -disable-output "-passes=print<da>" -aa-pipeline=basic-aa 2>&1 \
 ; RUN: | FileCheck %s
 
+; XFAIL: *
+; Currently fails since delinearization doesn't work as expected.
+
 ; Test for a bug, which caused an assert when an invalid
 ; SCEVAddRecExpr is created in addToCoefficient.
 
diff --git a/llvm/test/Analysis/DependenceAnalysis/NonCanonicalizedSubscript.ll b/llvm/test/Analysis/DependenceAnalysis/NonCanonicalizedSubscript.ll
index e5d5d21e365a1..eb832747e366e 100644
--- a/llvm/test/Analysis/DependenceAnalysis/NonCanonicalizedSubscript.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/NonCanonicalizedSubscript.ll
@@ -47,6 +47,8 @@ for.end:
 ;    }
 ;  }
 ;  Extends the previous example to coupled MIV subscripts.
+;
+; FIXME: Currently delinearization does not work as expected.
 
 
 @a = global [10004 x [10004 x i32]] zeroinitializer, align 16
@@ -57,7 +59,7 @@ define void @coupled_miv_type_mismatch(i32 %n) #0 {
 ; CHECK-NEXT:  Src: %2 = load i32, ptr %arrayidx5, align 4 --> Dst: %2 = load i32, ptr %arrayidx5, align 4
 ; CHECK-NEXT:    da analyze - none!
 ; CHECK-NEXT:  Src: %2 = load i32, ptr %arrayidx5, align 4 --> Dst: store i32 %add6, ptr %arrayidx10, align 4
-; CHECK-NEXT:    da analyze - consistent anti [1 -2]!
+; CHECK-NEXT:    da analyze - anti [< >]!
 ; CHECK-NEXT:  Src: store i32 %add6, ptr %arrayidx10, align 4 --> Dst: store i32 %add6, ptr %arrayidx10, align 4
 ; CHECK-NEXT:    da analyze - none!
 ;
diff --git a/llvm/test/Analysis/DependenceAnalysis/PR51512.ll b/llvm/test/Analysis/DependenceAnalysis/PR51512.ll
index 9bee38c6c00ef..2d1638d145ffe 100644
--- a/llvm/test/Analysis/DependenceAnalysis/PR51512.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/PR51512.ll
@@ -10,7 +10,7 @@ define void @foo() {
 ; CHECK-NEXT:  Src: store i32 42, ptr %getelementptr, align 1 --> Dst: store i32 42, ptr %getelementptr, align 1
 ; CHECK-NEXT:    da analyze - consistent output [0 S]!
 ; CHECK-NEXT:  Src: store i32 42, ptr %getelementptr, align 1 --> Dst: store i32 0, ptr %getelementptr5, align 1
-; CHECK-NEXT:    da analyze - output [0 *|<]!
+; CHECK-NEXT:    da analyze - output [0 <=|<]!
 ; CHECK-NEXT:  Src: store i32 0, ptr %getelementptr5, align 1 --> Dst: store i32 0, ptr %getelementptr5, align 1
 ; CHECK-NEXT:    da analyze - none!
 ;
diff --git a/llvm/test/Analysis/DependenceAnalysis/Propagating.ll b/llvm/test/Analysis/DependenceAnalysis/Propagating.ll
index 866f515baeafb..09598f43c7c7d 100644
--- a/llvm/test/Analysis/DependenceAnalysis/Propagating.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/Propagating.ll
@@ -437,7 +437,8 @@ define void @prop7(ptr %A, ptr %B, i32 %n) nounwind uwtable ssp {
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx7, align 4 --> Dst: store i32 %conv, ptr %arrayidx7, align 4
 ; CHECK-NEXT:    da analyze - none!
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx7, align 4 --> Dst: %0 = load i32, ptr %arrayidx13, align 4
-; CHECK-NEXT:    da analyze - flow [* <>]!
+; CHECK-NEXT:    da analyze - flow [* -38] splitable!
+; CHECK-NEXT:    da analyze - split level = 1, iteration = 4!
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx7, align 4 --> Dst: store i32 %0, ptr %B.addr.11, align 4
 ; CHECK-NEXT:    da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load i32, ptr %arrayidx13, align 4 --> Dst: %0 = load i32, ptr %arrayidx13, align 4
diff --git a/llvm/test/Analysis/DependenceAnalysis/SameSDLoops.ll b/llvm/test/Analysis/DependenceAnalysis/SameSDLoops.ll
index 57962e01de2b4..5a51c748a344e 100644
--- a/llvm/test/Analysis/DependenceAnalysis/SameSDLoops.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/SameSDLoops.ll
@@ -148,7 +148,7 @@ define void @non_samebd0(ptr %A) nounwind uwtable ssp {
 ; CHECK-NEXT:  Src: store i64 %i.013, ptr %arrayidx12, align 8 --> Dst: store i64 %i.013, ptr %arrayidx12, align 8
 ; CHECK-NEXT:    da analyze - none!
 ; CHECK-NEXT:  Src: store i64 %i.013, ptr %arrayidx12, align 8 --> Dst: store i64 %l17.04, ptr %arrayidx24, align 8
-; CHECK-NEXT:    da analyze - output [-4 -3]!
+; CHECK-NEXT:    da analyze - output [> *]!
 ; CHECK-NEXT:  Src: store i64 %l17.04, ptr %arrayidx24, align 8 --> Dst: store i64 %l17.04, ptr %arrayidx24, align 8
 ; CHECK-NEXT:    da analyze - none!
 ;
diff --git a/llvm/test/Analysis/DependenceAnalysis/Separability.ll b/llvm/test/Analysis/DependenceAnalysis/Separability.ll
index 2ed9cca4d1fc0..18e5da407e1aa 100644
--- a/llvm/test/Analysis/DependenceAnalysis/Separability.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/Separability.ll
@@ -182,7 +182,7 @@ define void @sep2(ptr %A, ptr %B, i32 %n) nounwind uwtable ssp {
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx12, align 4 --> Dst: store i32 %conv, ptr %arrayidx12, align 4
 ; CHECK-NEXT:    da analyze - consistent output [0 S 0 0]!
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx12, align 4 --> Dst: %0 = load i32, ptr %arrayidx19, align 4
-; CHECK-NEXT:    da analyze - flow [> * * -10]!
+; CHECK-NEXT:    da analyze - flow [* * * <>]!
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx12, align 4 --> Dst: store i32 %0, ptr %B.addr.31, align 4
 ; CHECK-NEXT:    da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load i32, ptr %arrayidx19, align 4 --> Dst: %0 = load i32, ptr %arrayidx19, align 4
@@ -262,9 +262,9 @@ for.end28:                                        ; preds = %for.inc26
 define void @sep3(ptr %A, ptr %B, i32 %n) nounwind uwtable ssp {
 ; CHECK-LABEL: 'sep3'
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx13, align 4 --> Dst: store i32 %conv, ptr %arrayidx13, align 4
-; CHECK-NEXT:    da analyze - consistent output [0 S 0 0]!
+; CHECK-NEXT:    da analyze - output [0 S 0 0]!
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx13, align 4 --> Dst: %0 = load i32, ptr %arrayidx20, align 4
-; CHECK-NEXT:    da analyze - flow [> * * *]!
+; CHECK-NEXT:    da analyze - flow [* * * *|<]!
 ; CHECK-NEXT:  Src: store i32 %conv, ptr %arrayidx13, align 4 --> Dst: store i32 %0, ptr %B.addr.31, align 4
 ; CHECK-NEXT:    da analyze - confused!
 ; CHECK-NEXT:  Src: %0 = load i32, ptr %arrayidx20, align 4 --> Dst: %0 = load i32, ptr %arrayidx20, align 4
diff --git a/llvm/test/Transforms/LoopInterchange/legality-check.ll b/llvm/test/Transforms/LoopInterchange/legality-check.ll
index c7f63d5968e62..fd0e08d3fae24 100644
--- a/llvm/test/Transforms/LoopInterchange/legality-check.ll
+++ b/llvm/test/Transforms/LoopInterchange/legality-check.ll
@@ -149,10 +149,10 @@ exit:
 ;;      for (int k = 0; k < 19; k++)
 ;;        b[i][j][k] = b[i][5][k + 1];
 ;;
-;; The direction vector of `b` is [= * <]. We cannot interchange all the loops.
+;; The direction vector of `b` is [= * *]. We cannot interchange all the loops.
 
 ; CHECK:      Dependency matrix before interchange:
-; CHECK-NEXT: = * <
+; CHECK-NEXT: = * *
 ; CHECK-NEXT: Processing InnerLoopId = 2 and OuterLoopId = 1
 ; CHECK-NEXT: Failed interchange InnerLoopId = 2 and OuterLoopId = 1 due to dependence
 ; CHECK-NEXT: Not interchanging loops. Cannot prove legality.
diff --git a/llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll b/llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll
index c17d78f7cfce6..a3a75ab6ec739 100644
--- a/llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll
+++ b/llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll
@@ -1,6 +1,6 @@
-; RUN: opt < %s -passes=loop-interchange -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t \
-; RUN:     -verify-dom-info -verify-loop-info -verify-loop-lcssa
-; RUN: FileCheck --input-file=%t %s
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 6
+; RUN: opt < %s -passes=loop-interchange \
+; RUN:     -verify-dom-info -verify-loop-info -verify-loop-lcssa -S 2>&1 | FileCheck %s
 
 ;; The original code:
 ;;
@@ -15,19 +15,51 @@
 ;;
 ;; The entry of the direction vector for the outermost loop is `DVEntry::LE`.
 ;; We need to treat this as `*`, not `<`. See issue #123920 for details.
-
-; CHECK: --- !Missed
-; CHECK-NEXT: Pass:            loop-interchange
-; CHECK-NEXT: Name:            Dependence
-; CHECK-NEXT: Function:        f
-; CHECK: --- !Missed
-; CHECK-NEXT: Pass:            loop-interchange
-; CHECK-NEXT: Name:            Dependence
-; CHECK-NEXT: Function:        f
+;; In conclusion, we must not interchange the loops.
 
 @a = dso_local global [16 x [16 x [16 x i32]]] zeroinitializer, align 4
 
 define dso_local void @f() {
+; CHECK-LABEL: define dso_local void @f() {
+; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    br label %[[FOR_COND1_PREHEADER:.*]]
+; CHECK:       [[FOR_COND1_PREHEADER]]:
+; CHECK-NEXT:    [[I_039:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[INC26:%.*]], %[[FOR_COND_CLEANUP3:.*]] ]
+; CHECK-NEXT:    [[SUB:%.*]] = add nuw nsw i32 [[I_039]], 3
+; CHECK-NEXT:    [[IDXPROM:%.*]] = zext nneg i32 [[SUB]] to i64
+; CHECK-NEXT:    [[MUL:%.*]] = shl nuw nsw i32 [[I_039]], 1
+; CHECK-NEXT:    [[IDXPROM13:%.*]] = zext nneg i32 [[MUL]] to i64
+; CHECK-NEXT:    br label %[[FOR_COND5_PREHEADER:.*]]
+; CHECK:       [[FOR_COND_CLEANUP:.*]]:
+; CHECK-NEXT:    ret void
+; CHECK:       [[FOR_COND5_PREHEADER]]:
+; CHECK-NEXT:    [[J_038:%.*]] = phi i32 [ 1, %[[FOR_COND1_PREHEADER]] ], [ [[INC23:%.*]], %[[FOR_COND_CLEANUP7:.*]] ]
+; CHECK-NEXT:    [[IDXPROM11:%.*]] = zext nneg i32 [[J_038]] to i64
+; CHECK-NEXT:    [[SUB18:%.*]] = add nsw i32 [[J_038]], -1
+; CHECK-NEXT:    [[IDXPROM19:%.*]] = sext i32 [[SUB18]] to i64
+; CHECK-NEXT:    br label %[[FOR_BODY8:.*]]
+; CHECK:       [[FOR_COND_CLEANUP3]]:
+; CHECK-NEXT:    [[INC26]] = add nuw nsw i32 [[I_039]], 1
+; CHECK-NEXT:    [[CMP:%.*]] = icmp samesign ult i32 [[I_039]], 3
+; CHECK-NEXT:    br i1 [[CMP]], label %[[FOR_COND1_PREHEADER]], label %[[FOR_COND_CLEANUP]]
+; CHECK:       [[FOR_COND_CLEANUP7]]:
+; CHECK-NEXT:    [[INC23]] = add nuw nsw i32 [[J_038]], 1
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp samesign ult i32 [[J_038]], 7
+; CHECK-NEXT:    br i1 [[CMP2]], label %[[FOR_COND5_PREHEADER]], label %[[FOR_COND_CLEANUP3]]
+; CHECK:       [[FOR_BODY8]]:
+; CHECK-NEXT:    [[K_037:%.*]] = phi i32 [ 1, %[[FOR_COND5_PREHEADER]] ], [ [[ADD15:%.*]], %[[FOR_BODY8]] ]
+; CHECK-NEXT:    [[IDXPROM9:%.*]] = zext nneg i32 [[K_037]] to i64
+; CHECK-NEXT:    [[ARRAYIDX12:%.*]] = getelementptr inbounds nuw [16 x [16 x [16 x i32]]], ptr @a, i64 0, i64 [[IDXPROM]], i64 [[IDXPROM9]], i64 [[IDXPROM11]]
+; CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[ARRAYIDX12]], align 4
+; CHECK-NEXT:    [[ADD15]] = add nuw nsw i32 [[K_037]], 1
+; CHECK-NEXT:    [[IDXPROM16:%.*]] = zext nneg i32 [[ADD15]] to i64
+; CHECK-NEXT:    [[ARRAYIDX20:%.*]] = getelementptr inbounds [16 x [16 x [16 x i32]]], ptr @a, i64 0, i64 [[IDXPROM13]], i64 [[IDXPROM16]], i64 [[IDXPROM19]]
+; CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[ARRAYIDX20]], align 4
+; CHECK-NEXT:    [[SUB21:%.*]] = sub nsw i32 [[TMP1]], [[TMP0]]
+; CHECK-NEXT:    store i32 [[SUB21]], ptr [[ARRAYIDX20]], align 4
+; CHECK-NEXT:    [[CMP6:%.*]] = icmp samesign ult i32 [[K_037]], 7
+; CHECK-NEXT:    br i1 [[CMP6]], label %[[FOR_BODY8]], label %[[FOR_COND_CLEANUP7]]
+;
 entry:
   br label %for.cond1.preheader
 
diff --git a/llvm/test/Transforms/LoopUnrollAndJam/dependencies_multidims.ll b/llvm/test/Transforms/LoopUnrollAndJam/dependencies_multidims.ll
index b95bbddf11d65..2867c6c0652e9 100644
--- a/llvm/test/Transforms/LoopUnrollAndJam/dependencies_multidims.ll
+++ b/llvm/test/Transforms/LoopUnrollAndJam/dependencies_multidims.ll
@@ -3,6 +3,9 @@
 
 target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
 
+; XFAIL: *
+; The transformation seems to have succeeded "accidentally".
+
 ; CHECK-LABEL: sub_sub_less
 ; CHECK: %j = phi
 ; CHECK-NOT: %j.1 = phi

Member

@Meinersbur Meinersbur left a comment

LGTM

Looking at the code, it looks weird that we even have delinearizeFixedSizeArray and tryDelinearizeFixedSizeImpl, which derive the same information, just using different heuristics. There should have been one function that applies any heuristic that applies.

Comment on lines 5 to 6
; XFAIL: *
; Currently fails since delinearization doesn't work as expected.
Member

Do you know what's the issue?

Contributor Author

This is due to an inconsistency in the estimated array sizes between rr[i][j] and rr[j][j]: the former is estimated as a 2D array (the same as in the pseudocode), while the latter is inferred as a 1D array (like rr[40*j + j]). To address this, I think we need to consider the "compatibility" of the array sizes.

Added the comment to describe the above point.
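
The mismatch can be illustrated with a toy stride-based guess. This is not how delinearizeFixedSizeArray works internally; `guessInnerSizes` is a hypothetical helper, and the inner size of 40 is taken from the rr[40*j + j] form above. Measured in elements, rr[i][j] has one stride per induction variable, {40, 1}, while rr[j][j] collapses to a single stride of 41, which says nothing about an inner dimension.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical toy heuristic: given the element strides that an access uses
// (one per induction variable), guess the inner dimension sizes as ratios of
// consecutive distinct strides. An empty result means "looks one-dimensional".
std::vector<int64_t> guessInnerSizes(std::vector<int64_t> strides) {
  std::sort(strides.begin(), strides.end(), std::greater<int64_t>());
  strides.erase(std::unique(strides.begin(), strides.end()), strides.end());
  std::vector<int64_t> sizes;
  for (std::size_t k = 0; k + 1 < strides.size(); ++k) {
    // Give up unless each stride is a positive multiple of the next one.
    if (strides[k + 1] <= 0 || strides[k] % strides[k + 1] != 0)
      return {};
    sizes.push_back(strides[k] / strides[k + 1]);
  }
  return sizes;
}
```

With strides {40, 1} the guess is one inner dimension of size 40 (a 2D shape), whereas the single stride {41} yields no inner dimension at all, so the two accesses end up with shapes that cannot be compared subscript by subscript.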

Copy link
Contributor Author

@kasuga-fj kasuga-fj left a comment

Looking at the code, it looks weird that we even have delinearizeFixedSizeArray and tryDelinearizeFixedSizeImpl, which derive the same information, just using different heuristics. There should have been one function that applies any heuristic that applies.

Correct, there are two functions that provide the same functionality. I think we need to remove tryDelinearizeFixedSizeImpl (more precisely, getIndexExpressionsFromGEP and the functions that depend on it) since it uses type information in GEPs to drive the heuristic. I believe tryDelinearizeFixedSizeImpl can be easily replaced because there's only one other user (LoopCacheAnalysis), and I'm planning to do that replacement myself. Removing getIndexExpressionsFromGEP is a bit tricky since Polly also uses it...

@Meinersbur
Member

Removing getIndexExpressionsFromGEP is a bit tricky since Polly also uses it...

I can take care of that. The work there is that 49 tests assume GEP delinearization.

@kasuga-fj
Contributor Author

Removing getIndexExpressionsFromGEP is a bit tricky since Polly also uses it...

I can take care of that. The work there is that 49 tests assume GEP delinearization.

Great, thanks. The function corresponding to getIndexExpressionsFromGEP would be findFixedSizeArrayDimensions, which I previously added to eliminate array size estimation based on GEP.
JFYI about findFixedSizeArrayDimensions: I'm not familiar with the details of Polly, but this function is fairly minimal and may not be sufficient for Polly's use cases. As far as I can tell, it doesn't handle triangular loops well. Additionally, cases where loop ranges depend on the induction variables of outer loops likely won't work either.
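
For readers unfamiliar with the term, a triangular loop nest is one whose inner bound depends on the outer induction variable. The sketch below only shows that loop shape on a flat array; `fillLowerTriangle` is a hypothetical example and makes no claim about what findFixedSizeArrayDimensions infers for it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: write the lower triangle (including the diagonal) of
// an n x n matrix stored in a flat vector of at least n * n elements.
// Returns the number of elements written.
std::size_t fillLowerTriangle(std::vector<int> &a, std::size_t n) {
  std::size_t writes = 0;
  for (std::size_t i = 0; i < n; ++i) {
    // The inner trip count depends on the outer induction variable i.
    for (std::size_t j = 0; j <= i; ++j) {
      a[i * n + j] = 1;
      ++writes;
    }
  }
  return writes;
}
```

Here the inner subscript j never reaches n for most rows, so the range of the access alone does not expose the row length n.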

@Meinersbur
Member

That was just the approach taken before delinearization was introduced. The math behind Polly does not particularly need delinearization for fixed-size arrays, but it is easier to think about test cases if the analysis looks like the code as written. At the time there was nothing speaking against using GEPs if available. It also may allow excluding some aliasing due to range limits. E.g. A[1024*i + e] may alias with e==-1, i==1 in one iteration and e'==1023, i'==0 in another. This cannot happen with A[i][e], where 0 <= e < 1024 is verified by either the linear program solver or a runtime check. I think the impact of not having that is rather minor, since e would either be invariant or range-limited by being an induction variable anyway. ISL can handle solving the values of i,e,i',e' under which $1024i + e = 1024i' + e'$ just fine.
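
The arithmetic in this example can be checked directly. A minimal sketch, with `flatIndex` and `inRange` as hypothetical names:

```cpp
#include <cstdint>

// Flat element index of the access A[1024*i + e].
constexpr int64_t flatIndex(int64_t i, int64_t e) { return 1024 * i + e; }

// Range condition that the two-dimensional view A[i][e] relies on.
constexpr bool inRange(int64_t e) { return 0 <= e && e < 1024; }

// (i, e) = (1, -1) and (i', e') = (0, 1023) hit the same element...
static_assert(flatIndex(1, -1) == flatIndex(0, 1023),
              "both accesses map to flat index 1023");
// ...but the first pair violates 0 <= e < 1024, so it is excluded once the
// range condition is known to hold.
static_assert(!inRange(-1) && inRange(1023),
              "only the second subscript is within the dimension");
```

When both e and e' are within range, 1024*i + e = 1024*i' + e' forces i = i' and e = e', which is why the range condition rules out this kind of cross-row aliasing.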
