
[AArch64] Extend condition optimizer to support unsigned comparisons #144380

Open · wants to merge 4 commits into main

Conversation

@AZero13 (Contributor) commented Jun 16, 2025

We have to be extra careful not to allow unsigned wraps, however. This also required some adjusting of the logic in adjustCmp, as well as comparing the true immediate values with the add or sub encoding (CMN vs. CMP) taken into account.

Because SIGNED_MIN and SIGNED_MAX cannot be encoded as immediates, we do not need to worry about those edge cases when dealing with unsigned comparisons.
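
A minimal standalone sketch of the adjustment rule and the wrap guard this patch adds. This is a simplification, not the pass's actual code: the opcode handling is reduced to an IsCmn flag, and the Cmp/adjust names are illustrative.

#include <cassert>
#include <cstdlib>

// Simplified model of adjustCmp. A compare with a negative immediate is
// encoded as CMN (ADDS), so the effective comparison value is -Imm and
// crossing zero flips CMP <-> CMN. For signed predicates that flip is fine;
// for unsigned predicates it means the adjusted immediate wrapped around 0,
// so the original comparison is kept.
struct Cmp {
  int Imm;    // encoded, non-negative immediate
  bool IsCmn; // true: ADDS/CMN alias, compares against -Imm
};

static Cmp adjust(Cmp C, bool IsGreaterThan, bool IsSigned) {
  int Correction = IsGreaterThan ? 1 : -1; // GT/HI bump up, LT/LO bump down
  if (C.IsCmn)
    Correction = -Correction; // negate for comparison against -Imm
  bool Flips = (C.IsCmn && C.Imm == 1 && Correction == -1) || // cmn 1 -> cmp 0
               (!C.IsCmn && C.Imm == 0 && Correction == -1);  // cmp 0 -> cmn 1
  if (!IsSigned && Flips)
    return C; // unsigned wrap: bail out and keep the old cmp
  return {std::abs(C.Imm + Correction), Flips ? !C.IsCmn : C.IsCmn};
}

int main() {
  // Unsigned: "x >=u 0" (cmp 0, HS) must not become "x >u -1" (cmn 1, HI).
  Cmp U = adjust({0, false}, /*IsGreaterThan=*/false, /*IsSigned=*/false);
  assert(U.Imm == 0 && !U.IsCmn);
  // Signed: "x >= 0" (cmp 0, GE) may become "x > -1" (cmn 1, GT).
  Cmp S = adjust({0, false}, /*IsGreaterThan=*/false, /*IsSigned=*/true);
  assert(S.Imm == 1 && S.IsCmn);
  return 0;
}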

@llvmbot (Member) commented Jun 16, 2025

@llvm/pr-subscribers-backend-aarch64

Author: AZero13 (AZero13)

Changes

Patch is 29.20 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/144380.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ConditionOptimizer.cpp (+54-16)
  • (modified) llvm/test/CodeGen/AArch64/combine-comparisons-by-cse.ll (+662)
diff --git a/llvm/lib/Target/AArch64/AArch64ConditionOptimizer.cpp b/llvm/lib/Target/AArch64/AArch64ConditionOptimizer.cpp
index 4c9f8c2723493..230dac417eae0 100644
--- a/llvm/lib/Target/AArch64/AArch64ConditionOptimizer.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ConditionOptimizer.cpp
@@ -231,6 +231,10 @@ static AArch64CC::CondCode getAdjustedCmp(AArch64CC::CondCode Cmp) {
   case AArch64CC::GE: return AArch64CC::GT;
   case AArch64CC::LT: return AArch64CC::LE;
   case AArch64CC::LE: return AArch64CC::LT;
+  case AArch64CC::HI: return AArch64CC::HS;
+  case AArch64CC::HS: return AArch64CC::HI;
+  case AArch64CC::LO: return AArch64CC::LS;
+  case AArch64CC::LS: return AArch64CC::LO;
   default:
     llvm_unreachable("Unexpected condition code");
   }
@@ -238,15 +242,20 @@ static AArch64CC::CondCode getAdjustedCmp(AArch64CC::CondCode Cmp) {
 
 // Transforms GT -> GE, GE -> GT, LT -> LE, LE -> LT by updating comparison
 // operator and condition code.
-AArch64ConditionOptimizer::CmpInfo AArch64ConditionOptimizer::adjustCmp(
-    MachineInstr *CmpMI, AArch64CC::CondCode Cmp) {
+AArch64ConditionOptimizer::CmpInfo
+AArch64ConditionOptimizer::adjustCmp(MachineInstr *CmpMI,
+                                     AArch64CC::CondCode Cmp) {
   unsigned Opc = CmpMI->getOpcode();
+  unsigned OldOpc = Opc;
+
+  bool isSigned = Cmp == AArch64CC::GT || Cmp == AArch64CC::GE ||
+                  Cmp == AArch64CC::LT || Cmp == AArch64CC::LE;
 
   // CMN (compare with negative immediate) is an alias to ADDS (as
   // "operand - negative" == "operand + positive")
   bool Negative = (Opc == AArch64::ADDSWri || Opc == AArch64::ADDSXri);
 
-  int Correction = (Cmp == AArch64CC::GT) ? 1 : -1;
+  int Correction = (Cmp == AArch64CC::GT || Cmp == AArch64CC::HI) ? 1 : -1;
   // Negate Correction value for comparison with negative immediate (CMN).
   if (Negative) {
     Correction = -Correction;
@@ -255,13 +264,23 @@ AArch64ConditionOptimizer::CmpInfo AArch64ConditionOptimizer::adjustCmp(
   const int OldImm = (int)CmpMI->getOperand(2).getImm();
   const int NewImm = std::abs(OldImm + Correction);
 
-  // Handle +0 -> -1 and -0 -> +1 (CMN with 0 immediate) transitions by
-  // adjusting compare instruction opcode.
-  if (OldImm == 0 && ((Negative && Correction == 1) ||
-                      (!Negative && Correction == -1))) {
+  // Handle cmn 1 -> cmp 0 transitions by adjusting compare instruction opcode.
+  if (OldImm == 1 && Negative && Correction == -1) {
+    // If we are adjusting from -1 to 0, we need to change the opcode.
+    Opc = getComplementOpc(Opc);
+  }
+
+  // Handle +0 -> -1 transitions by adjusting compare instruction opcode.
+  assert((OldImm != 0 || !Negative) && "Should not encounter cmn 0!");
+  if (OldImm == 0 && Correction == -1) {
     Opc = getComplementOpc(Opc);
   }
 
+  // If the opcode changed, the adjusted immediate crossed zero, which is an
+  // unsigned wrap, so return the old cmp unchanged.
+  if (!isSigned && Opc != OldOpc)
+    return CmpInfo(OldImm, OldOpc, Cmp);
+
   return CmpInfo(NewImm, Opc, getAdjustedCmp(Cmp));
 }
 
@@ -323,6 +342,14 @@ bool AArch64ConditionOptimizer::adjustTo(MachineInstr *CmpMI,
   return false;
 }
 
+static bool isGreaterThan(AArch64CC::CondCode Cmp) {
+  return Cmp == AArch64CC::GT || Cmp == AArch64CC::HI;
+}
+
+static bool isLessThan(AArch64CC::CondCode Cmp) {
+  return Cmp == AArch64CC::LT || Cmp == AArch64CC::LO;
+}
+
 bool AArch64ConditionOptimizer::runOnMachineFunction(MachineFunction &MF) {
   LLVM_DEBUG(dbgs() << "********** AArch64 Conditional Compares **********\n"
                     << "********** Function: " << MF.getName() << '\n');
@@ -383,6 +410,9 @@ bool AArch64ConditionOptimizer::runOnMachineFunction(MachineFunction &MF) {
     const int HeadImm = (int)HeadCmpMI->getOperand(2).getImm();
     const int TrueImm = (int)TrueCmpMI->getOperand(2).getImm();
 
+    int HeadImmTrueValue = HeadImm;
+    int TrueImmTrueValue = TrueImm;
+
     LLVM_DEBUG(dbgs() << "Head branch:\n");
     LLVM_DEBUG(dbgs() << "\tcondition: " << AArch64CC::getCondCodeName(HeadCmp)
                       << '\n');
@@ -393,9 +423,17 @@ bool AArch64ConditionOptimizer::runOnMachineFunction(MachineFunction &MF) {
                       << '\n');
     LLVM_DEBUG(dbgs() << "\timmediate: " << TrueImm << '\n');
 
-    if (((HeadCmp == AArch64CC::GT && TrueCmp == AArch64CC::LT) ||
-         (HeadCmp == AArch64CC::LT && TrueCmp == AArch64CC::GT)) &&
-        std::abs(TrueImm - HeadImm) == 2) {
+    unsigned Opc = HeadCmpMI->getOpcode();
+    if (Opc == AArch64::ADDSWri || Opc == AArch64::ADDSXri)
+      HeadImmTrueValue = -HeadImmTrueValue;
+
+    Opc = TrueCmpMI->getOpcode();
+    if (Opc == AArch64::ADDSWri || Opc == AArch64::ADDSXri)
+      TrueImmTrueValue = -TrueImmTrueValue;
+
+    if (((isGreaterThan(HeadCmp) && isLessThan(TrueCmp)) ||
+         (isLessThan(HeadCmp) && isGreaterThan(TrueCmp))) &&
+        std::abs(TrueImmTrueValue - HeadImmTrueValue) == 2) {
       // This branch transforms machine instructions that correspond to
       //
       // 1) (a > {TrueImm} && ...) || (a < {HeadImm} && ...)
@@ -414,9 +452,9 @@ bool AArch64ConditionOptimizer::runOnMachineFunction(MachineFunction &MF) {
         modifyCmp(TrueCmpMI, TrueCmpInfo);
         Changed = true;
       }
-    } else if (((HeadCmp == AArch64CC::GT && TrueCmp == AArch64CC::GT) ||
-                (HeadCmp == AArch64CC::LT && TrueCmp == AArch64CC::LT)) &&
-                std::abs(TrueImm - HeadImm) == 1) {
+    } else if (((isGreaterThan(HeadCmp) && isGreaterThan(TrueCmp)) ||
+                (isLessThan(HeadCmp) && isLessThan(TrueCmp))) &&
+               std::abs(TrueImmTrueValue - HeadImmTrueValue) == 1) {
       // This branch transforms machine instructions that correspond to
       //
       // 1) (a > {TrueImm} && ...) || (a > {HeadImm} && ...)
@@ -429,9 +467,9 @@ bool AArch64ConditionOptimizer::runOnMachineFunction(MachineFunction &MF) {
 
       // GT -> GE transformation increases immediate value, so picking the
       // smaller one; LT -> LE decreases immediate value so invert the choice.
-      bool adjustHeadCond = (HeadImm < TrueImm);
-      if (HeadCmp == AArch64CC::LT) {
-          adjustHeadCond = !adjustHeadCond;
+      bool adjustHeadCond = (HeadImmTrueValue < TrueImmTrueValue);
+      if (isLessThan(HeadCmp)) {
+        adjustHeadCond = !adjustHeadCond;
       }
 
       if (adjustHeadCond) {
diff --git a/llvm/test/CodeGen/AArch64/combine-comparisons-by-cse.ll b/llvm/test/CodeGen/AArch64/combine-comparisons-by-cse.ll
index 6449c3e11d667..1a8e5bc249b26 100644
--- a/llvm/test/CodeGen/AArch64/combine-comparisons-by-cse.ll
+++ b/llvm/test/CodeGen/AArch64/combine-comparisons-by-cse.ll
@@ -845,6 +845,668 @@ return:                                           ; preds = %if.end, %land.lhs.t
   ret i32 %retval.0
 }
 
+; (a > 10 && b == c) || (a >= 10 && b == d)
+define i32 @combine_ugt_uge_10() #0 {
+; CHECK-LABEL: combine_ugt_uge_10:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    adrp x8, :got:a
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:a]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    cmp w8, #10
+; CHECK-NEXT:    adrp x8, :got:b
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:b]
+; CHECK-NEXT:    b.ls .LBB12_3
+; CHECK-NEXT:  // %bb.1: // %land.lhs.true
+; CHECK-NEXT:    adrp x9, :got:c
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:c]
+; CHECK-NEXT:    ldr w10, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w10, w9
+; CHECK-NEXT:    b.ne .LBB12_4
+; CHECK-NEXT:  // %bb.2:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB12_3: // %lor.lhs.false
+; CHECK-NEXT:    b.lo .LBB12_6
+; CHECK-NEXT:  .LBB12_4: // %land.lhs.true3
+; CHECK-NEXT:    adrp x9, :got:d
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:d]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    b.ne .LBB12_6
+; CHECK-NEXT:  // %bb.5:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB12_6: // %if.end
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    ret
+entry:
+  %0 = load i32, ptr @a, align 4
+  %cmp = icmp ugt i32 %0, 10
+  br i1 %cmp, label %land.lhs.true, label %lor.lhs.false
+
+land.lhs.true:                                    ; preds = %entry
+  %1 = load i32, ptr @b, align 4
+  %2 = load i32, ptr @c, align 4
+  %cmp1 = icmp eq i32 %1, %2
+  br i1 %cmp1, label %return, label %land.lhs.true3
+
+lor.lhs.false:                                    ; preds = %entry
+  %cmp2 = icmp ugt i32 %0, 9
+  br i1 %cmp2, label %land.lhs.true3, label %if.end
+
+land.lhs.true3:                                   ; preds = %lor.lhs.false, %land.lhs.true
+  %3 = load i32, ptr @b, align 4
+  %4 = load i32, ptr @d, align 4
+  %cmp4 = icmp eq i32 %3, %4
+  br i1 %cmp4, label %return, label %if.end
+
+if.end:                                           ; preds = %land.lhs.true3, %lor.lhs.false
+  br label %return
+
+return:                                           ; preds = %if.end, %land.lhs.true3, %land.lhs.true
+  %retval.0 = phi i32 [ 0, %if.end ], [ 1, %land.lhs.true3 ], [ 1, %land.lhs.true ]
+  ret i32 %retval.0
+}
+
+; (a > 5 && b == c) || (a < 5 && b == d)
+define i32 @combine_ugt_ult_5() #0 {
+; CHECK-LABEL: combine_ugt_ult_5:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    adrp x8, :got:a
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:a]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    cmp w8, #5
+; CHECK-NEXT:    b.ls .LBB13_3
+; CHECK-NEXT:  // %bb.1: // %land.lhs.true
+; CHECK-NEXT:    adrp x8, :got:b
+; CHECK-NEXT:    adrp x9, :got:c
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:b]
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:c]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    b.ne .LBB13_6
+; CHECK-NEXT:  // %bb.2:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB13_3: // %lor.lhs.false
+; CHECK-NEXT:    b.hs .LBB13_6
+; CHECK-NEXT:  // %bb.4: // %land.lhs.true3
+; CHECK-NEXT:    adrp x8, :got:b
+; CHECK-NEXT:    adrp x9, :got:d
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:b]
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:d]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    b.ne .LBB13_6
+; CHECK-NEXT:  // %bb.5:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB13_6: // %if.end
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    ret
+entry:
+  %0 = load i32, ptr @a, align 4
+  %cmp = icmp ugt i32 %0, 5
+  br i1 %cmp, label %land.lhs.true, label %lor.lhs.false
+
+land.lhs.true:                                    ; preds = %entry
+  %1 = load i32, ptr @b, align 4
+  %2 = load i32, ptr @c, align 4
+  %cmp1 = icmp eq i32 %1, %2
+  br i1 %cmp1, label %return, label %if.end
+
+lor.lhs.false:                                    ; preds = %entry
+  %cmp2 = icmp ult i32 %0, 5
+  br i1 %cmp2, label %land.lhs.true3, label %if.end
+
+land.lhs.true3:                                   ; preds = %lor.lhs.false
+  %3 = load i32, ptr @b, align 4
+  %4 = load i32, ptr @d, align 4
+  %cmp4 = icmp eq i32 %3, %4
+  br i1 %cmp4, label %return, label %if.end
+
+if.end:                                           ; preds = %land.lhs.true3, %lor.lhs.false, %land.lhs.true
+  br label %return
+
+return:                                           ; preds = %if.end, %land.lhs.true3, %land.lhs.true
+  %retval.0 = phi i32 [ 0, %if.end ], [ 1, %land.lhs.true3 ], [ 1, %land.lhs.true ]
+  ret i32 %retval.0
+}
+
+; (a < 5 && b == c) || (a <= 5 && b == d)
+define i32 @combine_ult_uge_5() #0 {
+; CHECK-LABEL: combine_ult_uge_5:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    adrp x8, :got:a
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:a]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    cmp w8, #5
+; CHECK-NEXT:    adrp x8, :got:b
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:b]
+; CHECK-NEXT:    b.hs .LBB14_3
+; CHECK-NEXT:  // %bb.1: // %land.lhs.true
+; CHECK-NEXT:    adrp x9, :got:c
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:c]
+; CHECK-NEXT:    ldr w10, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w10, w9
+; CHECK-NEXT:    b.ne .LBB14_4
+; CHECK-NEXT:  // %bb.2:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB14_3: // %lor.lhs.false
+; CHECK-NEXT:    b.hi .LBB14_6
+; CHECK-NEXT:  .LBB14_4: // %land.lhs.true3
+; CHECK-NEXT:    adrp x9, :got:d
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:d]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    b.ne .LBB14_6
+; CHECK-NEXT:  // %bb.5:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB14_6: // %if.end
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    ret
+entry:
+  %0 = load i32, ptr @a, align 4
+  %cmp = icmp ult i32 %0, 5
+  br i1 %cmp, label %land.lhs.true, label %lor.lhs.false
+
+land.lhs.true:                                    ; preds = %entry
+  %1 = load i32, ptr @b, align 4
+  %2 = load i32, ptr @c, align 4
+  %cmp1 = icmp eq i32 %1, %2
+  br i1 %cmp1, label %return, label %land.lhs.true3
+
+lor.lhs.false:                                    ; preds = %entry
+  %cmp2 = icmp ult i32 %0, 6
+  br i1 %cmp2, label %land.lhs.true3, label %if.end
+
+land.lhs.true3:                                   ; preds = %lor.lhs.false, %land.lhs.true
+  %3 = load i32, ptr @b, align 4
+  %4 = load i32, ptr @d, align 4
+  %cmp4 = icmp eq i32 %3, %4
+  br i1 %cmp4, label %return, label %if.end
+
+if.end:                                           ; preds = %land.lhs.true3, %lor.lhs.false
+  br label %return
+
+return:                                           ; preds = %if.end, %land.lhs.true3, %land.lhs.true
+  %retval.0 = phi i32 [ 0, %if.end ], [ 1, %land.lhs.true3 ], [ 1, %land.lhs.true ]
+  ret i32 %retval.0
+}
+
+; (a < 5 && b == c) || (a > 5 && b == d)
+define i32 @combine_ult_ugt_5() #0 {
+; CHECK-LABEL: combine_ult_ugt_5:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    adrp x8, :got:a
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:a]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    cmp w8, #5
+; CHECK-NEXT:    b.hs .LBB15_3
+; CHECK-NEXT:  // %bb.1: // %land.lhs.true
+; CHECK-NEXT:    adrp x8, :got:b
+; CHECK-NEXT:    adrp x9, :got:c
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:b]
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:c]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    b.ne .LBB15_6
+; CHECK-NEXT:  // %bb.2:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB15_3: // %lor.lhs.false
+; CHECK-NEXT:    b.ls .LBB15_6
+; CHECK-NEXT:  // %bb.4: // %land.lhs.true3
+; CHECK-NEXT:    adrp x8, :got:b
+; CHECK-NEXT:    adrp x9, :got:d
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:b]
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:d]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    b.ne .LBB15_6
+; CHECK-NEXT:  // %bb.5:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB15_6: // %if.end
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    ret
+entry:
+  %0 = load i32, ptr @a, align 4
+  %cmp = icmp ult i32 %0, 5
+  br i1 %cmp, label %land.lhs.true, label %lor.lhs.false
+
+land.lhs.true:                                    ; preds = %entry
+  %1 = load i32, ptr @b, align 4
+  %2 = load i32, ptr @c, align 4
+  %cmp1 = icmp eq i32 %1, %2
+  br i1 %cmp1, label %return, label %if.end
+
+lor.lhs.false:                                    ; preds = %entry
+  %cmp2 = icmp ugt i32 %0, 5
+  br i1 %cmp2, label %land.lhs.true3, label %if.end
+
+land.lhs.true3:                                   ; preds = %lor.lhs.false
+  %3 = load i32, ptr @b, align 4
+  %4 = load i32, ptr @d, align 4
+  %cmp4 = icmp eq i32 %3, %4
+  br i1 %cmp4, label %return, label %if.end
+
+if.end:                                           ; preds = %land.lhs.true3, %lor.lhs.false, %land.lhs.true
+  br label %return
+
+return:                                           ; preds = %if.end, %land.lhs.true3, %land.lhs.true
+  %retval.0 = phi i32 [ 0, %if.end ], [ 1, %land.lhs.true3 ], [ 1, %land.lhs.true ]
+  ret i32 %retval.0
+}
+
+; (a > -5 && b == c) || (a < -5 && b == d)
+define i32 @combine_ugt_ult_n5() #0 {
+; CHECK-LABEL: combine_ugt_ult_n5:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    adrp x8, :got:a
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:a]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    cmn w8, #5
+; CHECK-NEXT:    b.ls .LBB16_3
+; CHECK-NEXT:  // %bb.1: // %land.lhs.true
+; CHECK-NEXT:    adrp x8, :got:b
+; CHECK-NEXT:    adrp x9, :got:c
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:b]
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:c]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    b.ne .LBB16_6
+; CHECK-NEXT:  // %bb.2:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB16_3: // %lor.lhs.false
+; CHECK-NEXT:    b.hs .LBB16_6
+; CHECK-NEXT:  // %bb.4: // %land.lhs.true3
+; CHECK-NEXT:    adrp x8, :got:b
+; CHECK-NEXT:    adrp x9, :got:d
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:b]
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:d]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    b.ne .LBB16_6
+; CHECK-NEXT:  // %bb.5:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB16_6: // %if.end
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    ret
+entry:
+  %0 = load i32, ptr @a, align 4
+  %cmp = icmp ugt i32 %0, -5
+  br i1 %cmp, label %land.lhs.true, label %lor.lhs.false
+
+land.lhs.true:                                    ; preds = %entry
+  %1 = load i32, ptr @b, align 4
+  %2 = load i32, ptr @c, align 4
+  %cmp1 = icmp eq i32 %1, %2
+  br i1 %cmp1, label %return, label %if.end
+
+lor.lhs.false:                                    ; preds = %entry
+  %cmp2 = icmp ult i32 %0, -5
+  br i1 %cmp2, label %land.lhs.true3, label %if.end
+
+land.lhs.true3:                                   ; preds = %lor.lhs.false
+  %3 = load i32, ptr @b, align 4
+  %4 = load i32, ptr @d, align 4
+  %cmp4 = icmp eq i32 %3, %4
+  br i1 %cmp4, label %return, label %if.end
+
+if.end:                                           ; preds = %land.lhs.true3, %lor.lhs.false, %land.lhs.true
+  br label %return
+
+return:                                           ; preds = %if.end, %land.lhs.true3, %land.lhs.true
+  %retval.0 = phi i32 [ 0, %if.end ], [ 1, %land.lhs.true3 ], [ 1, %land.lhs.true ]
+  ret i32 %retval.0
+}
+
+; (a < -5 && b == c) || (a > -5 && b == d)
+define i32 @combine_ult_ugt_n5() #0 {
+; CHECK-LABEL: combine_ult_ugt_n5:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    adrp x8, :got:a
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:a]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    cmn w8, #5
+; CHECK-NEXT:    b.hs .LBB17_3
+; CHECK-NEXT:  // %bb.1: // %land.lhs.true
+; CHECK-NEXT:    adrp x8, :got:b
+; CHECK-NEXT:    adrp x9, :got:c
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:b]
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:c]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    b.ne .LBB17_6
+; CHECK-NEXT:  // %bb.2:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB17_3: // %lor.lhs.false
+; CHECK-NEXT:    b.ls .LBB17_6
+; CHECK-NEXT:  // %bb.4: // %land.lhs.true3
+; CHECK-NEXT:    adrp x8, :got:b
+; CHECK-NEXT:    adrp x9, :got:d
+; CHECK-NEXT:    ldr x8, [x8, :got_lo12:b]
+; CHECK-NEXT:    ldr x9, [x9, :got_lo12:d]
+; CHECK-NEXT:    ldr w8, [x8]
+; CHECK-NEXT:    ldr w9, [x9]
+; CHECK-NEXT:    cmp w8, w9
+; CHECK-NEXT:    b.ne .LBB17_6
+; CHECK-NEXT:  // %bb.5:
+; CHECK-NEXT:    mov w0, #1 // =0x1
+; CHECK-NEXT:    ret
+; CHECK-NEXT:  .LBB17_6: // %if.end
+; CHECK-NEXT:    mov w0, wzr
+; CHECK-NEXT:    ret
+entry:
+  %0 = load i32, ptr @a, align 4
+  %cmp = icmp ult i32 %0, -5
+  br i1 %cmp, label %land.lhs.true, label %lor.lhs.false
+
+land.lhs.true:                                    ; preds = %entry
+  %1 = load i32, ptr @b, align 4
+  %2 = load i32, ptr @c, align 4
+  %cmp1 = icmp eq i32 %1, %2
+  br i1 %cmp1, label %return, label %if.end
+
+lor.lhs.false:                                    ; preds = %entry
+  %cmp2 = icmp ugt i32 %0, -5
+  br i1 %cmp2, label %land.lhs.true3, label %if.end
+
+land.lhs.true3:                                   ; preds = %lor.lhs.false
+  %3 = load i32, ptr @b, align 4
+  %4 = load i32, ptr @d, align 4
+  %cmp4 = icmp eq i32 %3...
[truncated]

@AZero13 changed the title from "[AArch64] Add support for unsigned comparisons" to "[AArch64] Add support for unsigned comparisons in the condition optimizer" Jun 16, 2025
@AZero13 (Contributor, Author) commented Jun 16, 2025

Future PR: I don't want a situation where we chain jumps, b.ls -> b.lo -> where we actually want to go.

Instead, because one condition implies the other (LO implies LS), each branch can jump straight to its destination:

b.lo blo-loc (when the b.ls check comes first)
b.ls bls-loc
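
As a rough sketch of the implication check such a follow-up might rely on. The CondCode enum and implies helper here are illustrative stand-ins, not part of this PR or of AArch64CC:

#include <cassert>

// Stand-in for AArch64CC::CondCode; values are illustrative only.
enum CondCode { HI, HS, LO, LS, GT, GE, LT, LE };

// Returns true when condition A holding guarantees condition B holds for the
// same NZCV flags, e.g. "strictly lower" implies "lower or same".
static bool implies(CondCode A, CondCode B) {
  if (A == B)
    return true;
  return (A == LO && B == LS) || (A == HI && B == HS) ||
         (A == LT && B == LE) || (A == GT && B == GE);
}

int main() {
  assert(implies(LO, LS));  // a b.lo can be hoisted ahead of a b.ls chain
  assert(!implies(LS, LO)); // the reverse does not hold
  return 0;
}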

@AZero13 (Contributor, Author) commented Jun 22, 2025

@topperc What do you think of this?

AZero13 added 2 commits June 22, 2025 19:22
@AZero13 changed the title from "[AArch64] Add support for unsigned comparisons in the condition optimizer" to "[AArch64] Extend condition optimizer to support unsigned comparisons" Jun 23, 2025
@AZero13 (Contributor, Author) commented Jun 23, 2025

@davemgreen I tried to write the tests myself. However, I found it best to just copy the original tests and change signed to unsigned, including variants that mix both signed and unsigned.

@AZero13 (Contributor, Author) commented Jun 23, 2025

So, basically everything was tested, which is why I made sure all cases were covered.

For example, x > 0 || x < -1 does not work when unsigned, so how do I check for that? By checking whether we switched between ADDS and SUBS, i.e. whether the adjusted immediate crossed zero.

I used helper functions to check both the signed and unsigned variants, and the real head immediate is tracked because it is important to know what we are truly comparing against (CMN negates the immediate).
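
A small sketch of that normalization, mirroring the HeadImmTrueValue/TrueImmTrueValue logic in the patch; the Opcode enum and the trueImmValue name are illustrative, not the pass's actual code:

#include <cstdlib>
#include <iostream>

// Stand-ins for the relevant compare opcodes (illustrative values only).
enum Opcode { SUBSWri, SUBSXri, ADDSWri, ADDSXri };

// cmp a, #imm is SUBS; cmn a, #imm is ADDS and really compares against -imm.
// Normalizing both immediates puts the head and true cmps on one number line.
static int trueImmValue(Opcode Opc, int Imm) {
  if (Opc == ADDSWri || Opc == ADDSXri)
    return -Imm;
  return Imm;
}

int main() {
  // cmp w8, #5 and cmn w8, #5 compare against +5 and -5 respectively,
  // so their distance is 10, not 0.
  int Head = trueImmValue(SUBSWri, 5);
  int True = trueImmValue(ADDSWri, 5);
  std::cout << std::abs(True - Head) << '\n'; // prints 10
  return 0;
}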

@AZero13 requested a review from davemgreen June 23, 2025 23:46
@AZero13 (Contributor, Author) commented Jul 9, 2025

@davemgreen ping

@AZero13 (Contributor, Author) commented Jul 10, 2025

@davemgreen If you'd like, I can explain every change I made and why this works. This was entirely coded by hand, grouping the types of checks and ensuring that the zero boundary cannot be crossed for unsigned comparisons.

@davemgreen (Collaborator) left a comment

Why can't we write simpler mir tests for this?

@AZero13 (Contributor, Author) commented Jul 11, 2025

Why can't we write simpler mir tests for this?

Because then they will get folded into conditional selects rather than branches.

@AZero13 (Contributor, Author) commented Jul 11, 2025

These are literally just copy-pastes of the old tests, modified to work with unsigned and mixed versions @davemgreen

There's nothing that complex about it.

@AZero13 requested review from topperc and davemgreen July 11, 2025 14:11
@AZero13 (Contributor, Author) commented Jul 11, 2025

These are literally just copy-pastes of the old tests, modified to work with unsigned and mixed versions @davemgreen

There's nothing that complex about it.

To prevent the conditions from being condensed into a conditional compare, the tests need to be complex enough that the conditions are implemented as branches in the asm. That is why these are complex.

@davemgreen (Collaborator) commented

Hi. Ideally we would be able to come up with a formal proof to ensure we are not missing anything. Like you said, some of the cases are difficult to reach, though. Something like the following is what I was thinking of for an MIR test; it is more concise but we can still use it for testing. Different variants can be made from it.

# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
# RUN: llc -mtriple=aarch64 -o - %s -run-pass=aarch64-condopt | FileCheck %s

---
name:            add_add_hi_hi
tracksRegLiveness: true
body:             |
  ; CHECK-LABEL: name: add_add_hi_hi
  ; CHECK: bb.0:
  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
  ; CHECK-NEXT:   liveins: $w0
  ; CHECK-NEXT: {{  $}}
  ; CHECK-NEXT:   [[COPY:%[0-9]+]]:gpr32common = COPY $w0
  ; CHECK-NEXT:   [[ADDSWri:%[0-9]+]]:gpr32 = ADDSWri [[COPY]], 0, 0, implicit-def $nzcv
  ; CHECK-NEXT:   Bcc 12, %bb.1, implicit $nzcv
  ; CHECK-NEXT:   B %bb.2
  ; CHECK-NEXT: {{  $}}
  ; CHECK-NEXT: bb.1:
  ; CHECK-NEXT:   successors: %bb.2(0x80000000)
  ; CHECK-NEXT:   liveins: $w0
  ; CHECK-NEXT: {{  $}}
  ; CHECK-NEXT:   [[ADDSWri1:%[0-9]+]]:gpr32 = ADDSWri [[COPY]], 1, 0, implicit-def $nzcv
  ; CHECK-NEXT:   Bcc 12, %bb.2, implicit $nzcv
  ; CHECK-NEXT:   B %bb.2
  ; CHECK-NEXT: {{  $}}
  ; CHECK-NEXT: bb.2:
  ; CHECK-NEXT:   liveins: $w0
  ; CHECK-NEXT: {{  $}}
  ; CHECK-NEXT:   [[MOVi32imm:%[0-9]+]]:gpr32 = MOVi32imm 1
  ; CHECK-NEXT:   $w0 = COPY [[MOVi32imm]]
  ; CHECK-NEXT:   RET undef $lr, implicit $w0
  bb.0:
    liveins: $w0
    %0:gpr32common = COPY $w0
    %4:gpr32 = ADDSWri %0:gpr32common, 0, 0, implicit-def $nzcv
    Bcc 12, %bb.2, implicit $nzcv
    B %bb.3

  bb.2:
    liveins: $w0
    %5:gpr32 = ADDSWri %0:gpr32common, 1, 0, implicit-def $nzcv
    Bcc 12, %bb.3, implicit $nzcv
    B %bb.3

  bb.3:
    liveins: $w0
    %1:gpr32 = MOVi32imm 1
    $w0 = COPY %1
    RET undef $lr, implicit $w0
...
