
Conversation

@Mel-Chen
Contributor

Since div/rem operations don’t support a mask operand, the masked-out lanes of the divisor are currently replaced with 1 using VPInstruction::Select before the predicated div/rem operation.
This patch replaces

  VPInstruction::Select(logical_and(header_mask, conditional_mask), LHS, RHS)

with

  vp.merge(conditional_mask, LHS, RHS, EVL)

so that the header mask can be replaced by EVL in this usage scenario when tail folding with EVL.
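As a sanity check on the rewrite, here is a small scalar model of the two forms (not LLVM code; the lane loop, lane count, and element type are illustrative assumptions). Under EVL tail folding, lane `i` of the header mask is active exactly when `i < EVL`, and `llvm.vp.merge` takes the false operand for every lane at or beyond EVL, so the two forms produce identical vectors:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Scalar model of select(logical_and(header_mask, cond), LHS, RHS),
// where header_mask[i] = (i < EVL) under EVL tail folding.
std::vector<int64_t> selectWithHeaderMask(const std::vector<bool> &Cond,
                                          const std::vector<int64_t> &LHS,
                                          const std::vector<int64_t> &RHS,
                                          unsigned EVL) {
  std::vector<int64_t> Res(LHS.size());
  for (unsigned I = 0; I < LHS.size(); ++I) {
    bool HeaderMask = I < EVL;
    Res[I] = (HeaderMask && Cond[I]) ? LHS[I] : RHS[I];
  }
  return Res;
}

// Scalar model of llvm.vp.merge(cond, LHS, RHS, EVL): lanes below EVL
// follow the condition, lanes at or beyond EVL take the false operand.
std::vector<int64_t> vpMerge(const std::vector<bool> &Cond,
                             const std::vector<int64_t> &LHS,
                             const std::vector<int64_t> &RHS, unsigned EVL) {
  std::vector<int64_t> Res(LHS.size());
  for (unsigned I = 0; I < LHS.size(); ++I)
    Res[I] = (I < EVL && Cond[I]) ? LHS[I] : RHS[I];
  return Res;
}
```

For any `Cond` and any `EVL`, the two functions agree lane for lane, which is why the select on the combined mask can be rewritten as a vp.merge on the conditional mask alone.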

@llvmbot
Member

llvmbot commented Aug 18, 2025

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-backend-risc-v

Author: Mel Chen (Mel-Chen)


Full diff: https://github.com/llvm/llvm-project/pull/154072.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+10-8)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/divrem.ll (+3-3)
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 05c12b7a1adcc..d015a1ccf9c2a 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -2160,18 +2160,20 @@ static VPRecipeBase *optimizeMaskToEVL(VPValue *HeaderMask,
         return new VPReductionEVLRecipe(*Red, EVL, NewMask);
       })
       .Case<VPInstruction>([&](VPInstruction *VPI) -> VPRecipeBase * {
-        VPValue *LHS, *RHS;
+        VPValue *Cond, *LHS, *RHS;
         // Transform select with a header mask condition
-        //   select(header_mask, LHS, RHS)
+        //   select(mask_w/_header_mask, LHS, RHS)
         // into vector predication merge.
-        //   vp.merge(all-true, LHS, RHS, EVL)
-        if (!match(VPI, m_Select(m_Specific(HeaderMask), m_VPValue(LHS),
-                                 m_VPValue(RHS))))
+        //   vp.merge(mask_w/o_header_mask, LHS, RHS, EVL)
+        if (!match(VPI,
+                   m_Select(m_VPValue(Cond), m_VPValue(LHS), m_VPValue(RHS))))
           return nullptr;
-        // Use all true as the condition because this transformation is
-        // limited to selects whose condition is a header mask.
+
+        VPValue *NewMask = GetNewMask(Cond);
+        if (!NewMask)
+          NewMask = &AllOneMask;
         return new VPWidenIntrinsicRecipe(
-            Intrinsic::vp_merge, {&AllOneMask, LHS, RHS, &EVL},
+            Intrinsic::vp_merge, {NewMask, LHS, RHS, &EVL},
             TypeInfo.inferScalarType(LHS), VPI->getDebugLoc());
       })
       .Default([&](VPRecipeBase *R) { return nullptr; });
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/divrem.ll b/llvm/test/Transforms/LoopVectorize/RISCV/divrem.ll
index 3af328fb6568e..7efaf2080810b 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/divrem.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/divrem.ll
@@ -371,7 +371,7 @@ define void @predicated_udiv(ptr noalias nocapture %a, i64 %v, i64 %n) {
 ; CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds i64, ptr [[A:%.*]], i64 [[INDEX]]
 ; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = call <vscale x 2 x i64> @llvm.vp.load.nxv2i64.p0(ptr align 8 [[TMP8]], <vscale x 2 x i1> splat (i1 true), i32 [[TMP12]])
 ; CHECK-NEXT:    [[TMP16:%.*]] = select <vscale x 2 x i1> [[TMP15]], <vscale x 2 x i1> [[TMP6]], <vscale x 2 x i1> zeroinitializer
-; CHECK-NEXT:    [[TMP10:%.*]] = select <vscale x 2 x i1> [[TMP16]], <vscale x 2 x i64> [[BROADCAST_SPLAT]], <vscale x 2 x i64> splat (i64 1)
+; CHECK-NEXT:    [[TMP10:%.*]] = call <vscale x 2 x i64> @llvm.vp.merge.nxv2i64(<vscale x 2 x i1> [[TMP6]], <vscale x 2 x i64> [[BROADCAST_SPLAT]], <vscale x 2 x i64> splat (i64 1), i32 [[TMP12]])
 ; CHECK-NEXT:    [[TMP11:%.*]] = udiv <vscale x 2 x i64> [[WIDE_LOAD]], [[TMP10]]
 ; CHECK-NEXT:    [[PREDPHI:%.*]] = select <vscale x 2 x i1> [[TMP16]], <vscale x 2 x i64> [[TMP11]], <vscale x 2 x i64> [[WIDE_LOAD]]
 ; CHECK-NEXT:    call void @llvm.vp.store.nxv2i64.p0(<vscale x 2 x i64> [[PREDPHI]], ptr align 8 [[TMP8]], <vscale x 2 x i1> splat (i1 true), i32 [[TMP12]])
@@ -486,7 +486,7 @@ define void @predicated_sdiv(ptr noalias nocapture %a, i64 %v, i64 %n) {
 ; CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds i64, ptr [[A:%.*]], i64 [[INDEX]]
 ; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = call <vscale x 2 x i64> @llvm.vp.load.nxv2i64.p0(ptr align 8 [[TMP8]], <vscale x 2 x i1> splat (i1 true), i32 [[TMP12]])
 ; CHECK-NEXT:    [[TMP16:%.*]] = select <vscale x 2 x i1> [[TMP15]], <vscale x 2 x i1> [[TMP6]], <vscale x 2 x i1> zeroinitializer
-; CHECK-NEXT:    [[TMP10:%.*]] = select <vscale x 2 x i1> [[TMP16]], <vscale x 2 x i64> [[BROADCAST_SPLAT]], <vscale x 2 x i64> splat (i64 1)
+; CHECK-NEXT:    [[TMP10:%.*]] = call <vscale x 2 x i64> @llvm.vp.merge.nxv2i64(<vscale x 2 x i1> [[TMP6]], <vscale x 2 x i64> [[BROADCAST_SPLAT]], <vscale x 2 x i64> splat (i64 1), i32 [[TMP12]])
 ; CHECK-NEXT:    [[TMP11:%.*]] = sdiv <vscale x 2 x i64> [[WIDE_LOAD]], [[TMP10]]
 ; CHECK-NEXT:    [[PREDPHI:%.*]] = select <vscale x 2 x i1> [[TMP16]], <vscale x 2 x i64> [[TMP11]], <vscale x 2 x i64> [[WIDE_LOAD]]
 ; CHECK-NEXT:    call void @llvm.vp.store.nxv2i64.p0(<vscale x 2 x i64> [[PREDPHI]], ptr align 8 [[TMP8]], <vscale x 2 x i1> splat (i1 true), i32 [[TMP12]])
@@ -817,7 +817,7 @@ define void @predicated_sdiv_by_minus_one(ptr noalias nocapture %a, i64 %n) {
 ; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = call <vscale x 16 x i8> @llvm.vp.load.nxv16i8.p0(ptr align 1 [[TMP7]], <vscale x 16 x i1> splat (i1 true), i32 [[TMP12]])
 ; CHECK-NEXT:    [[TMP9:%.*]] = icmp ne <vscale x 16 x i8> [[WIDE_LOAD]], splat (i8 -128)
 ; CHECK-NEXT:    [[TMP16:%.*]] = select <vscale x 16 x i1> [[TMP15]], <vscale x 16 x i1> [[TMP9]], <vscale x 16 x i1> zeroinitializer
-; CHECK-NEXT:    [[TMP10:%.*]] = select <vscale x 16 x i1> [[TMP16]], <vscale x 16 x i8> splat (i8 -1), <vscale x 16 x i8> splat (i8 1)
+; CHECK-NEXT:    [[TMP10:%.*]] = call <vscale x 16 x i8> @llvm.vp.merge.nxv16i8(<vscale x 16 x i1> [[TMP9]], <vscale x 16 x i8> splat (i8 -1), <vscale x 16 x i8> splat (i8 1), i32 [[TMP12]])
 ; CHECK-NEXT:    [[TMP11:%.*]] = sdiv <vscale x 16 x i8> [[WIDE_LOAD]], [[TMP10]]
 ; CHECK-NEXT:    [[PREDPHI:%.*]] = select <vscale x 16 x i1> [[TMP16]], <vscale x 16 x i8> [[TMP11]], <vscale x 16 x i8> [[WIDE_LOAD]]
 ; CHECK-NEXT:    call void @llvm.vp.store.nxv16i8.p0(<vscale x 16 x i8> [[PREDPHI]], ptr align 1 [[TMP7]], <vscale x 16 x i1> splat (i1 true), i32 [[TMP12]])
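For reference, the predicated-udiv pattern these CHECK lines encode can be sketched as a scalar model (illustrative only; the lane loop and types are assumptions, and `Cond`/`V` stand in for the loop's per-lane predicate and divisor). The divisor is merged with 1 in the inactive lanes so the unconditional udiv never divides by zero, and the quotient is then blended back with the original value:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Scalar model of the vectorized predicated udiv above: inactive lanes
// get divisor 1 (the vp.merge), the division runs unconditionally, and
// the final blend keeps the original value in inactive lanes.
std::vector<uint64_t> predicatedUDiv(const std::vector<uint64_t> &A,
                                     const std::vector<bool> &Cond,
                                     uint64_t V, unsigned EVL) {
  std::vector<uint64_t> Out(A.size());
  for (unsigned I = 0; I < A.size(); ++I) {
    bool Active = I < EVL && Cond[I];
    // vp.merge(Cond, splat(V), splat(1), EVL): inactive lanes get 1.
    uint64_t Divisor = Active ? V : 1;
    uint64_t Quot = A[I] / Divisor; // never divides by zero in inactive lanes
    // Final blend keeps the original value in inactive lanes.
    Out[I] = Active ? Quot : A[I];
  }
  return Out;
}
```

This is why only the final blend needs the combined mask; the divisor merge itself is safe with the conditional mask plus EVL.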

@github-actions

github-actions bot commented Aug 18, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Mel-Chen Mel-Chen force-pushed the evl-get-new-select-cond branch from ebcdecc to 96d4b98 Compare August 18, 2025 08:35
@lukel97
Contributor

lukel97 commented Aug 18, 2025

I'm not sure if this is the right approach, since it still leaves around a vmv.v.i to mask the divisor. What I originally tried in #148828 was to fold the div/rem into a VP div/rem, but in that PR it was relying on nothing in the VPlan ever reading past EVL lanes.

What I think is safer is to emit the VP intrinsic when the recipe is initially being widened using a mask, that way we know the lanes are defined as poison, and then optimise the mask to EVL in optimizeMaskToEVL. I've opened up #154076 for this, what do you think?

@Mel-Chen
Contributor Author

> I'm not sure if this is the right approach, since it still leaves around a vmv.v.i to mask the divisor. What I originally tried in #148828 was to fold the div/rem into a VP div/rem, but in that PR it was relying on nothing in the VPlan ever reading past EVL lanes.
>
> What I think is safer is to emit the VP intrinsic when the recipe is initially being widened using a mask, that way we know the lanes are defined as poison, and then optimise the mask to EVL in optimizeMaskToEVL. I've opened up #154076 for this, what do you think?

This patch wasn’t intended to address the vp.div issue, actually. :D
I just noticed that we never performed this transformation, and in fact it affects correctness once the header mask is replaced with the EVL mask. It just hadn’t been caught until now.

@Mel-Chen
Contributor Author

Mel-Chen commented Sep 9, 2025

ping. Can we use this approach first to allow the header mask to be removed?

@lukel97 (Contributor) left a comment

I'm happy to have this as an incremental improvement, but this change won't be correct without #155394 landing first. The other VP recipe transforms also have the same incorrect behaviour, but I'd like to avoid making the problem worse.

@Mel-Chen Mel-Chen force-pushed the evl-get-new-select-cond branch from 96d4b98 to 3060d05 Compare September 11, 2025 08:34
@Mel-Chen Mel-Chen force-pushed the evl-get-new-select-cond branch from 3060d05 to 7a206e4 Compare November 11, 2025 10:08
@lukel97 (Contributor) left a comment

LGTM

@fhahn (Contributor) left a comment

> I'm happy to have this as an incremental improvement, but this change won't be correct without #155394 landing first. The other VP recipe transforms also have the same incorrect behaviour, but I'd like to avoid making the problem worse.

Just double-checking that this is still correct?

@lukel97
Contributor

lukel97 commented Nov 11, 2025

> I'm happy to have this as an incremental improvement, but this change won't be correct without #155394 landing first. The other VP recipe transforms also have the same incorrect behaviour, but I'd like to avoid making the problem worse.
>
> Just double-checking that this is still correct?

It's correct now, yes: this pattern won't match unless the mask contains the header mask, so we can substitute the header mask with EVL.
