294 changes: 291 additions & 3 deletions llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -128,6 +128,18 @@ static cl::opt<bool> RunSIVRoutinesOnly(
"The purpose is mainly to exclude the influence of those routines "
"in regression tests for SIV routines."));

// TODO: This flag is disabled by default because it is still under development.
// Enable it or delete this flag when the feature is ready.
static cl::opt<bool> EnableMonotonicityCheck(
"da-enable-monotonicity-check", cl::init(false), cl::Hidden,
cl::desc("Check if the subscripts are monotonic. If it's not, dependence "
"is reported as unknown."));

static cl::opt<bool> DumpMonotonicityReport(
"da-dump-monotonicity-report", cl::init(false), cl::Hidden,
cl::desc(
"When printing analysis, dump the results of monotonicity checks."));

//===----------------------------------------------------------------------===//
// basics

@@ -177,13 +189,196 @@ void DependenceAnalysisWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequiredTransitive<LoopInfoWrapperPass>();
}

namespace {

/// The property of monotonicity of a SCEV. To define the monotonicity, assume
/// a SCEV defined within N-nested loops. Let i_k denote the iteration number
/// of the k-th loop. Then we can regard the SCEV as an N-ary function:
///
/// F(i_1, i_2, ..., i_N)
///
/// The domain of i_k is the closed range [0, BTC_k], where BTC_k is the
/// backedge-taken count of the k-th loop.
///
/// A function F is said to be "monotonically increasing with respect to the
/// k-th loop" if x <= y implies the following condition:
///
/// F(i_1, ..., i_{k-1}, x, i_{k+1}, ..., i_N) <=
/// F(i_1, ..., i_{k-1}, y, i_{k+1}, ..., i_N)
///
/// where i_1, ..., i_{k-1}, i_{k+1}, ..., i_N, x, and y are elements of their
/// respective domains.
///
/// Likewise F is "monotonically decreasing with respect to the k-th loop"
/// if x <= y implies
///
/// F(i_1, ..., i_{k-1}, x, i_{k+1}, ..., i_N) >=
/// F(i_1, ..., i_{k-1}, y, i_{k+1}, ..., i_N)
///
/// A function F that is monotonically increasing or decreasing with respect to
/// the k-th loop is simply called "monotonic with respect to k-th loop".
///
/// A function F is said to be "multivariate monotonic" when it is monotonic
/// with respect to all of the N loops.
///
/// Since integer comparison can be either signed or unsigned, we need to
/// distinguish monotonicity in the signed sense from that in the unsigned
/// sense. Note that the inequality "x <= y" merely indicates loop progression
/// and is not affected by the difference between signed and unsigned order.
///
/// Currently we only consider monotonicity in a signed sense.
Collaborator

A lot of things have been discussed in the comments. I need to catch up on some, but that is my problem. What I was going to ask is this. I really like the clear descriptions so far, but can we add, or is it worth explaining a little bit more, the algorithm that determines monotonicity? That will be a high level description of course, but I feel explaining the concepts of looking at AddRec and the nowrap flags etc. is missing a little bit.

Contributor Author

Added some more comments.
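
To make the definition in the doc comment concrete, here is a minimal brute-force sketch. It is not the algorithm used in this PR: it simply evaluates a two-loop function F(i, j) at every point and compares adjacent iterations. The 8-bit model and the names `wrap8`, `isMonotonicInLoop`, `isMultivariateMonotonic`, `noWrap`, and `wraps` are assumptions introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical model: a subscript over two loops, evaluated in wrapping
// 8-bit signed arithmetic (a small stand-in for an iN SCEV).
using Fn2 = std::function<int8_t(int, int)>;

// Truncate to 8 bits with two's-complement wrap-around.
inline int8_t wrap8(int V) {
  return static_cast<int8_t>(static_cast<uint8_t>(V));
}

enum class Direction { Increasing, Decreasing };

// Check the definition directly for loop K (0 = outer i, 1 = inner j).
// BTC0 and BTC1 are the backedge-taken counts, so each iteration number
// ranges over the closed interval [0, BTC].
bool isMonotonicInLoop(const Fn2 &F, int K, int BTC0, int BTC1, Direction D) {
  assert((K == 0 || K == 1) && "only two loops are modeled");
  int BTCK = (K == 0) ? BTC0 : BTC1;
  int BTCOther = (K == 0) ? BTC1 : BTC0;
  for (int Other = 0; Other <= BTCOther; ++Other) {
    for (int X = 0; X < BTCK; ++X) {
      // Comparing adjacent iterations suffices because <= is transitive.
      int8_t A = (K == 0) ? F(X, Other) : F(Other, X);
      int8_t B = (K == 0) ? F(X + 1, Other) : F(Other, X + 1);
      if (D == Direction::Increasing ? A > B : A < B)
        return false;
    }
  }
  return true;
}

// Monotonic (increasing or decreasing) with respect to every loop.
bool isMultivariateMonotonic(const Fn2 &F, int BTC0, int BTC1) {
  for (int K = 0; K < 2; ++K)
    if (!isMonotonicInLoop(F, K, BTC0, BTC1, Direction::Increasing) &&
        !isMonotonicInLoop(F, K, BTC0, BTC1, Direction::Decreasing))
      return false;
  return true;
}

// 10*i + j with i, j in [0, 9]: the maximum is 99, so nothing wraps.
inline int8_t noWrap(int I, int J) { return wrap8(10 * I + J); }

// 32*i + j with i in [0, 7]: 32*4 = 128 wraps to -128 in 8 bits.
inline int8_t wraps(int I, int J) { return wrap8(32 * I + J); }
```

Calling `isMultivariateMonotonic(noWrap, 9, 9)` accepts the first function, while `isMultivariateMonotonic(wraps, 7, 3)` rejects the second because the outer loop's contribution wraps between i = 3 and i = 4. The checker in the PR reaches the same kind of conclusion symbolically, from nowrap flags, rather than by enumeration.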

enum class SCEVMonotonicityType {
/// We don't know anything about the monotonicity of the SCEV.
Unknown,

/// The SCEV is loop-invariant with respect to the outermost loop. In other
/// words, the function F corresponding to the SCEV is a constant function.
Invariant,

/// The function F corresponding to the SCEV is multivariate monotonic in a
/// signed sense. Note that the multivariate monotonic function may also be a
/// constant function. The order employed in the definition of monotonicity
/// is not strict order.
MultivariateSignedMonotonic,
Member

Used "multimonotonic" in the description, but "Multivariate monotonic" here. Consistency?

Contributor Author

Unified to "multivariate monotonic".

};

struct SCEVMonotonicity {
SCEVMonotonicity(SCEVMonotonicityType Type,
const SCEV *FailurePoint = nullptr);

SCEVMonotonicityType getType() const { return Type; }

const SCEV *getFailurePoint() const { return FailurePoint; }

bool isUnknown() const { return Type == SCEVMonotonicityType::Unknown; }

void print(raw_ostream &OS, unsigned Depth) const;

private:
SCEVMonotonicityType Type;

/// The subexpression that caused Unknown. Mainly for debugging purpose.
const SCEV *FailurePoint;
};

/// Check the monotonicity of a SCEV. Since dependence tests (SIV, MIV, etc.)
/// assume that subscript expressions are (multivariate) monotonic, we need to
/// verify this property before applying those tests. Violating this assumption
/// may cause them to produce incorrect results.
struct SCEVMonotonicityChecker
: public SCEVVisitor<SCEVMonotonicityChecker, SCEVMonotonicity> {
Comment on lines +268 to +269
Contributor Author

For testability, would it be better to split this into its own file, as was done with ScalarEvolutionDivision.cpp? Or is it better to avoid creating separate files unnecessarily?


SCEVMonotonicityChecker(ScalarEvolution *SE) : SE(SE) {}

/// Check the monotonicity of \p Expr. \p Expr must be integer type. If \p
/// OutermostLoop is not null, \p Expr must be defined in \p OutermostLoop or
/// one of its nested loops.
SCEVMonotonicity checkMonotonicity(const SCEV *Expr,
const Loop *OutermostLoop);

private:
ScalarEvolution *SE;

/// The outermost loop that DA is analyzing.
const Loop *OutermostLoop;

/// A helper to classify \p Expr as either Invariant or Unknown.
SCEVMonotonicity invariantOrUnknown(const SCEV *Expr);

/// Return true if \p Expr is loop-invariant with respect to the outermost
/// loop.
bool isLoopInvariant(const SCEV *Expr) const;

/// A helper to create an Unknown SCEVMonotonicity.
SCEVMonotonicity createUnknown(const SCEV *FailurePoint) {
return SCEVMonotonicity(SCEVMonotonicityType::Unknown, FailurePoint);
}

SCEVMonotonicity visitAddRecExpr(const SCEVAddRecExpr *Expr);

SCEVMonotonicity visitConstant(const SCEVConstant *) {
return SCEVMonotonicity(SCEVMonotonicityType::Invariant);
}
SCEVMonotonicity visitVScale(const SCEVVScale *) {
return SCEVMonotonicity(SCEVMonotonicityType::Invariant);
}

// TODO: Handle more cases.
SCEVMonotonicity visitZeroExtendExpr(const SCEVZeroExtendExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitSignExtendExpr(const SCEVSignExtendExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitAddExpr(const SCEVAddExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitMulExpr(const SCEVMulExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitPtrToIntExpr(const SCEVPtrToIntExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitTruncateExpr(const SCEVTruncateExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitUDivExpr(const SCEVUDivExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitSMaxExpr(const SCEVSMaxExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitUMaxExpr(const SCEVUMaxExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitSMinExpr(const SCEVSMinExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitUMinExpr(const SCEVUMinExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitSequentialUMinExpr(const SCEVSequentialUMinExpr *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitUnknown(const SCEVUnknown *Expr) {
return invariantOrUnknown(Expr);
}
SCEVMonotonicity visitCouldNotCompute(const SCEVCouldNotCompute *Expr) {
return invariantOrUnknown(Expr);
}

friend struct SCEVVisitor<SCEVMonotonicityChecker, SCEVMonotonicity>;
};

} // anonymous namespace

// Used to test the dependence analyzer.
// Looks through the function, noting instructions that may access memory.
// Calls depends() on every possible pair and prints out the result.
// Ignores all other instructions.
static void dumpExampleDependence(raw_ostream &OS, DependenceInfo *DA,
- ScalarEvolution &SE, bool NormalizeResults) {
+ ScalarEvolution &SE, LoopInfo &LI,
+ bool NormalizeResults) {
auto *F = DA->getFunction();

if (DumpMonotonicityReport) {
SCEVMonotonicityChecker Checker(&SE);
OS << "Monotonicity check:\n";
for (Instruction &Inst : instructions(F)) {
if (!isa<LoadInst>(Inst) && !isa<StoreInst>(Inst))
continue;
Value *Ptr = getLoadStorePointerOperand(&Inst);
const Loop *L = LI.getLoopFor(Inst.getParent());
const SCEV *PtrSCEV = SE.getSCEVAtScope(Ptr, L);
const SCEV *AccessFn = SE.removePointerBase(PtrSCEV);
SCEVMonotonicity Mon = Checker.checkMonotonicity(AccessFn, L);
OS.indent(2) << "Inst: " << Inst << "\n";
OS.indent(4) << "Expr: " << *AccessFn << "\n";
Mon.print(OS, 4);
}
OS << "\n";
}

for (inst_iterator SrcI = inst_begin(F), SrcE = inst_end(F); SrcI != SrcE;
++SrcI) {
if (SrcI->mayReadOrWriteMemory()) {
@@ -235,7 +430,8 @@ static void dumpExampleDependence(raw_ostream &OS, DependenceInfo *DA,
void DependenceAnalysisWrapperPass::print(raw_ostream &OS,
const Module *) const {
dumpExampleDependence(
- OS, info.get(), getAnalysis<ScalarEvolutionWrapperPass>().getSE(), false);
+ OS, info.get(), getAnalysis<ScalarEvolutionWrapperPass>().getSE(),
+ getAnalysis<LoopInfoWrapperPass>().getLoopInfo(), false);
}

PreservedAnalyses
@@ -244,7 +440,7 @@ DependenceAnalysisPrinterPass::run(Function &F, FunctionAnalysisManager &FAM) {
<< "':\n";
dumpExampleDependence(OS, &FAM.getResult<DependenceAnalysis>(F),
FAM.getResult<ScalarEvolutionAnalysis>(F),
- NormalizeResults);
+ FAM.getResult<LoopAnalysis>(F), NormalizeResults);
return PreservedAnalyses::all();
}

@@ -670,6 +866,81 @@ bool DependenceInfo::intersectConstraints(Constraint *X, const Constraint *Y) {
return false;
}

//===----------------------------------------------------------------------===//
// SCEVMonotonicity

SCEVMonotonicity::SCEVMonotonicity(SCEVMonotonicityType Type,
const SCEV *FailurePoint)
: Type(Type), FailurePoint(FailurePoint) {
assert(
((Type == SCEVMonotonicityType::Unknown) == (FailurePoint != nullptr)) &&
"FailurePoint must be provided iff Type is Unknown");
}

void SCEVMonotonicity::print(raw_ostream &OS, unsigned Depth) const {
OS.indent(Depth) << "Monotonicity: ";
switch (Type) {
case SCEVMonotonicityType::Unknown:
assert(FailurePoint && "FailurePoint must be provided for Unknown");
OS << "Unknown\n";
OS.indent(Depth) << "Reason: " << *FailurePoint << "\n";
break;
case SCEVMonotonicityType::Invariant:
OS << "Invariant\n";
break;
case SCEVMonotonicityType::MultivariateSignedMonotonic:
OS << "MultivariateSignedMonotonic\n";
break;
}
}

bool SCEVMonotonicityChecker::isLoopInvariant(const SCEV *Expr) const {
return !OutermostLoop || SE->isLoopInvariant(Expr, OutermostLoop);
}

SCEVMonotonicity SCEVMonotonicityChecker::invariantOrUnknown(const SCEV *Expr) {
if (isLoopInvariant(Expr))
return SCEVMonotonicity(SCEVMonotonicityType::Invariant);
return createUnknown(Expr);
}

SCEVMonotonicity
SCEVMonotonicityChecker::checkMonotonicity(const SCEV *Expr,
const Loop *OutermostLoop) {
assert(Expr->getType()->isIntegerTy() && "Expr must be integer type");
this->OutermostLoop = OutermostLoop;
return visit(Expr);
}

/// We only care about an affine AddRec at the moment. For an affine AddRec,
/// the monotonicity can be inferred from its nowrap property. For example, let
/// X and Y be loop-invariant, and assume Y is non-negative. An AddRec
/// {X,+,Y}<nsw> implies:
///
/// X <=s (X + Y) <=s ((X + Y) + Y) <=s ...
///
/// Thus, we can conclude that the AddRec is monotonically increasing with
/// respect to the associated loop in a signed sense. Similar reasoning
/// applies when Y is non-positive, leading to a monotonically decreasing
/// AddRec.
SCEVMonotonicity
SCEVMonotonicityChecker::visitAddRecExpr(const SCEVAddRecExpr *Expr) {
if (!Expr->isAffine() || !Expr->hasNoSignedWrap())
return createUnknown(Expr);

const SCEV *Start = Expr->getStart();
const SCEV *Step = Expr->getStepRecurrence(*SE);

SCEVMonotonicity StartMon = visit(Start);
if (StartMon.isUnknown())
return StartMon;

if (!isLoopInvariant(Step))
return createUnknown(Expr);

return SCEVMonotonicity(SCEVMonotonicityType::MultivariateSignedMonotonic);
}
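
The recursion in `visitAddRecExpr` can be sketched on a toy expression tree. This is only an illustration under stated assumptions: `ToyExpr`, `Kind`, `Mono`, `makeConstant`, `makeOpaque`, `makeAddRec`, and `classify` are hypothetical names, not LLVM classes or APIs, and loop-invariance is modeled as a stored flag rather than queried from ScalarEvolution.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy stand-ins for SCEV nodes; these are NOT the LLVM classes.
enum class Kind { Constant, Opaque, AddRec };
enum class Mono { Unknown, Invariant, MultivariateSignedMonotonic };

struct ToyExpr;
using ExprPtr = std::shared_ptr<ToyExpr>;

struct ToyExpr {
  Kind K = Kind::Constant;
  bool LoopInvariant = true; // invariant w.r.t. the outermost loop
  bool Affine = true;        // AddRec only
  bool NSW = false;          // AddRec only: no-signed-wrap flag
  ExprPtr Start, Step;       // AddRec only
};

ExprPtr makeConstant() { return std::make_shared<ToyExpr>(); }

ExprPtr makeOpaque(bool LoopInvariant) {
  auto E = std::make_shared<ToyExpr>();
  E->K = Kind::Opaque;
  E->LoopInvariant = LoopInvariant;
  return E;
}

ExprPtr makeAddRec(ExprPtr Start, ExprPtr Step, bool NSW) {
  assert(Start && Step && "an AddRec needs both operands");
  auto E = std::make_shared<ToyExpr>();
  E->K = Kind::AddRec;
  E->LoopInvariant = false;
  E->NSW = NSW;
  E->Start = std::move(Start);
  E->Step = std::move(Step);
  return E;
}

// Mirrors the shape of the checker: constants are invariant, any other
// non-AddRec node is Invariant or Unknown, and an AddRec needs <nsw>, a
// non-Unknown start (checked recursively), and a loop-invariant step.
Mono classify(const ExprPtr &E) {
  switch (E->K) {
  case Kind::Constant:
    return Mono::Invariant;
  case Kind::Opaque:
    return E->LoopInvariant ? Mono::Invariant : Mono::Unknown;
  case Kind::AddRec:
    if (!E->Affine || !E->NSW)
      return Mono::Unknown;
    if (classify(E->Start) == Mono::Unknown)
      return Mono::Unknown;
    if (!E->Step->LoopInvariant)
      return Mono::Unknown;
    return Mono::MultivariateSignedMonotonic;
  }
  return Mono::Unknown;
}
```

With this model, `makeAddRec(makeAddRec(C, C, true), C, true)` classifies as monotonic, whereas replacing the inner flag with `false` yields Unknown even though the outer AddRec still carries `<nsw>`. That is why the start operand is visited recursively instead of trusting the top-level flag alone.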

//===----------------------------------------------------------------------===//
// DependenceInfo methods

@@ -3488,10 +3759,19 @@ bool DependenceInfo::tryDelinearize(Instruction *Src, Instruction *Dst,
// resize Pair to contain as many pairs of subscripts as the delinearization
// has found, and then initialize the pairs following the delinearization.
Pair.resize(Size);
SCEVMonotonicityChecker MonChecker(SE);
const Loop *OutermostLoop = SrcLoop ? SrcLoop->getOutermostLoop() : nullptr;
for (int I = 0; I < Size; ++I) {
Pair[I].Src = SrcSubscripts[I];
Pair[I].Dst = DstSubscripts[I];
unifySubscriptType(&Pair[I]);

if (EnableMonotonicityCheck) {
if (MonChecker.checkMonotonicity(Pair[I].Src, OutermostLoop).isUnknown())
return false;
if (MonChecker.checkMonotonicity(Pair[I].Dst, OutermostLoop).isUnknown())
return false;
}
Comment on lines +3769 to +3774
Contributor

Another question here, otherwise LGTM.

If we have multiple subscripts and all of them are monotonic, how could the other monotonicity check (line 4083-4) fail? We need to answer this to make sure we are not running the test redundantly.

Contributor Author

Consider the following case (godbolt):

; char A[][32];
; for (i = 0; i < 1ll << 62; i++)
;   for (j = 0; j < 32; j++)
;     if (i < (1ll << 57))
;       A[i][j] = 0;
define void @outer_loop_may_wrap(ptr %a) {
entry:
  br label %loop.i.header

loop.i.header:
  %i = phi i64 [ 0, %entry ], [ %i.inc, %loop.i.latch ]
  br label %loop.j.header

loop.j.header:
  %j = phi i64 [ 0, %loop.i.header ], [ %j.inc, %loop.j.latch ]
  %cond = icmp slt i64 %i, 144115188075855872  ; 2^57
  br i1 %cond, label %if.then, label %loop.j.latch

if.then:
  %gep = getelementptr inbounds [32 x i8], ptr %a, i64 %i, i64 %j
  store i8 0, ptr %gep
  br label %loop.j.latch

loop.j.latch:
  %j.inc = add nuw nsw i64 %j, 1
  %ec.j = icmp eq i64 %j.inc, 32
  br i1 %ec.j, label %loop.i.latch, label %loop.j.header

loop.i.latch:
  %i.inc = add nuw nsw i64 %i, 1
  %ec.i = icmp eq i64 %i.inc, 4611686018427387904  ; 2^62
  br i1 %ec.i, label %exit, label %loop.i.header


exit:
  ret void
}

The subscripts {0,+,1}<nuw><nsw><%loop.i.header> and {0,+,1}<nuw><nsw><%loop.j.header> are monotonic, but the original offset {{0,+,32}<%loop.i.header>,+,1}<nw><%loop.j.header> is not.

Contributor Author

Added the above test case.
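
The arithmetic behind this example can be checked with a small sketch. It assumes the linearized offset is 32*i + j evaluated in wrapping 64-bit arithmetic, as the `{{0,+,32},+,1}` expression above suggests; `linearOffset`, `outerStepIsNonDecreasing`, `Guard`, and `WrapPoint` are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstdint>

// Linearized byte offset of A[i][j] for `char A[][32]`, computed with
// wrapping unsigned 64-bit arithmetic and reinterpreted as signed. This
// emulates an i64 multiply/add without nsw.
inline int64_t linearOffset(uint64_t I, uint64_t J) {
  assert(J < 32 && "j indexes a row of 32 bytes");
  return static_cast<int64_t>(I * 32u + J);
}

// True if the offset does not decrease (signed) when i advances by one.
inline bool outerStepIsNonDecreasing(uint64_t I, uint64_t J) {
  return linearOffset(I, J) <= linearOffset(I + 1, J);
}

constexpr uint64_t Guard = 1ull << 57;     // the `i < 2^57` condition
constexpr uint64_t WrapPoint = 1ull << 58; // 32 * 2^58 == 2^63
```

Under the guard, the largest offset is `linearOffset(Guard - 1, 31)`, which equals 2^62 - 1 and is safely representable. The i-loop itself runs up to 2^62 - 1, though, and at i = 2^58 the product 32*i reaches 2^63 and wraps to the most negative i64 value, so `outerStepIsNonDecreasing(WrapPoint - 1, 0)` is false. Each subscript is monotonic on its own, but the linearized offset is not monotonic over the full iteration space.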

}

return true;
@@ -3824,6 +4104,14 @@ DependenceInfo::depends(Instruction *Src, Instruction *Dst,
Pair[0].Src = SrcEv;
Pair[0].Dst = DstEv;

SCEVMonotonicityChecker MonChecker(SE);
const Loop *OutermostLoop = SrcLoop ? SrcLoop->getOutermostLoop() : nullptr;
if (EnableMonotonicityCheck)
if (MonChecker.checkMonotonicity(Pair[0].Src, OutermostLoop).isUnknown() ||
MonChecker.checkMonotonicity(Pair[0].Dst, OutermostLoop).isUnknown())
Comment on lines +4109 to +4111
Contributor

I have a basic question about these two tests: if an AddRec has the nsw flag, that AddRec doesn't wrap. Why is that not enough, and why do we need to recursively check each component of the AddRec?

I guess the SCEV flags assume all the internal components are fixed, so only the top-level calculation is guaranteed not to overflow. Is that correct?

If so, you may want to add a test case where the top-level AddRec has nsw but the monotonicity check fails. I didn't see one in your tests, although other test files have examples of that. It would be helpful to add one.

Contributor

However the example that I see is this loop (the first test in SameSDLoops.ll)

;;  for (long int i = 0; i < 10; i++) {
;;    for (long int j = 0; j < 10; j++) {
;;      for (long int k = 0; k < 10; k++) {
;;        for (long int l = 0; l < 10; l++)
;;          A[i][j][k][l] = i;
;;      }
;;      for (long int k = 1; k < 11; k++) {
;;        for (long int l = 0; l < 10; l++)
;;          A[i + 4][j + 3][k + 2][l + 1] = l;

It is strange that we cannot prove monotonicity here:

Printing analysis 'Dependence Analysis' for function 'samebd0':
Monotonicity check:
  Inst:   store i64 %i.013, ptr %arrayidx12, align 8
    Expr: {{{{0,+,8000000}<nuw><nsw><%for.cond1.preheader>,+,80000}<nuw><nsw><%for.cond4.preheader>,+,800}<nuw><nsw><%for.cond7.preheader>,+,8}<nuw><nsw><%for.body9>
    Monotonicity: MultiSignedMonotonic
  Inst:   store i64 %l17.04, ptr %arrayidx24, align 8
    Expr: {{{{32242408,+,8000000}<nuw><nsw><%for.cond1.preheader>,+,80000}<nw><%for.cond4.preheader>,+,800}<nuw><nsw><%for.cond18.preheader>,+,8}<nuw><nsw><%for.body20>
    Monotonicity: Unknown
    Reason: {{32242408,+,8000000}<nuw><nsw><%for.cond1.preheader>,+,80000}<nw><%for.cond4.preheader>

Contributor Author

@kasuga-fj, Oct 14, 2025

> I have a basic question about these two tests: if an AddRec has the nsw flag, that AddRec doesn't wrap. Why is that not enough, and why do we need to recursively check each component of the AddRec?
>
> I guess the SCEV flags assume all the internal components are fixed, so only the top-level calculation is guaranteed not to overflow. Is that correct?

In my understanding, your guess is correct. I added a test case @outer_loop_may_wrap, which I believe demonstrates the scenario where only the outer addrec is guaranteed not to wrap.

> However the example that I see is this loop (the first test in SameSDLoops.ll)
>
> ;;  for (long int i = 0; i < 10; i++) {
> ;;    for (long int j = 0; j < 10; j++) {
> ;;      for (long int k = 0; k < 10; k++) {
> ;;        for (long int l = 0; l < 10; l++)
> ;;          A[i][j][k][l] = i;
> ;;      }
> ;;      for (long int k = 1; k < 11; k++) {
> ;;        for (long int l = 0; l < 10; l++)
> ;;          A[i + 4][j + 3][k + 2][l + 1] = l;
>
> It is strange that we cannot prove monotonicity here:
>
> Printing analysis 'Dependence Analysis' for function 'samebd0':
> Monotonicity check:
>   Inst:   store i64 %i.013, ptr %arrayidx12, align 8
>     Expr: {{{{0,+,8000000}<nuw><nsw><%for.cond1.preheader>,+,80000}<nuw><nsw><%for.cond4.preheader>,+,800}<nuw><nsw><%for.cond7.preheader>,+,8}<nuw><nsw><%for.body9>
>     Monotonicity: MultiSignedMonotonic
>   Inst:   store i64 %l17.04, ptr %arrayidx24, align 8
>     Expr: {{{{32242408,+,8000000}<nuw><nsw><%for.cond1.preheader>,+,80000}<nw><%for.cond4.preheader>,+,800}<nuw><nsw><%for.cond18.preheader>,+,8}<nuw><nsw><%for.body20>
>     Monotonicity: Unknown
>     Reason: {{32242408,+,8000000}<nuw><nsw><%for.cond1.preheader>,+,80000}<nw><%for.cond4.preheader>

I don't know much about how nowrap flags are transferred from IR to SCEV, but this appears to be a limitation of SCEV. At a glance, it’s not obvious that the second store A[i + 4][j + 3][k + 2][l + 1] = l is always executed when entering the j-loop. This may be the reason why the nowrap flags for %for.cond4.preheader are not preserved in SCEV.

Anyway, for this specific case, I think we could perform additional cheap analysis similar to range analysis in SCEV, since all values except the induction variables are constants. That said, I'm not planning to include such a feature in this PR.
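
As a rough illustration of the "cheap analysis" idea mentioned above (explicitly not part of this PR), here is a minimal sketch that bounds a nested affine recurrence when the start, every step, and every backedge-taken count are known constants. `Level` and `maxWithoutOverflow` are hypothetical names, and the sketch assumes non-negative steps, so every partial sum lies between the start and the computed maximum.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// One level of a nested affine recurrence: the loop contributes
// Step * n for an iteration number n in the closed range [0, BTC].
struct Level {
  int64_t Step;
  int64_t BTC;
};

// Computes Start + sum(Step_k * BTC_k) with explicit overflow checks.
// Returns false if a step is negative or any operation would overflow
// int64_t; otherwise stores the largest reachable value in Max.
bool maxWithoutOverflow(int64_t Start, const std::vector<Level> &Levels,
                        int64_t &Max) {
  const int64_t Limit = std::numeric_limits<int64_t>::max();
  int64_t Acc = Start;
  for (const Level &L : Levels) {
    assert(L.BTC >= 0 && "backedge-taken count cannot be negative");
    if (L.Step < 0)
      return false; // out of scope for this sketch
    if (L.Step != 0 && L.BTC > Limit / L.Step)
      return false; // Step * BTC would overflow
    int64_t Term = L.Step * L.BTC;
    if (Acc > Limit - Term)
      return false; // Acc + Term would overflow
    Acc += Term;
  }
  Max = Acc;
  return true;
}
```

For the second store quoted above, taking a backedge-taken count of 9 for each of the four loops (read off the C loop nest, which is an assumption about the compiled IR), the call `maxWithoutOverflow(32242408, {{8000000, 9}, {80000, 9}, {800, 9}, {8, 9}}, Max)` succeeds with `Max == 104969680`, far below the i64 limit. A bound like this would show that none of the nested recurrences can wrap, even where SCEV did not keep the nsw flag.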

return std::make_unique<Dependence>(Src, Dst,
SCEVUnionPredicate(Assume, *SE));

if (Delinearize) {
if (tryDelinearize(Src, Dst, Pair)) {
LLVM_DEBUG(dbgs() << " delinearized\n");