[RISCV] Check only demanded VTYPE fields in needVSETVLIPHI #90168

lukel97 · 2024-04-26T05:48:29Z

In needVSETVLIPHI we know that the VLs will be the same if the AVL is the output VL of another vsetvli and they have the same VLMAX, so we just check that the VTYPEs are the same. But we don't need to check all the fields if they're not demanded.

This allows us to avoid a vsetvli in some cases where e.g. the VLMAX ratio is the same .

(We still need the VLMAXes to be the same to make sure the VLs are the same, but we could potentially relax this to allow a smaller VLMAX where there's no risk of truncation)

In needVSETVLIPHI we know that the VLs will be the same if the AVL is the output VL of another vsetvli and they have the same VLMAX, so we just check that the VTYPEs are the same. But we don't need to check all the fields if they're not demanded. This allows us to avoid a vsetvli in some cases where e.g. the VLMAX ratio is the same . (We still need the VLMAXes to be the same to make sure the VLs are the same, but we could potentially relax this to allow a smaller VLMAX where there's no risk of truncation)

llvmbot · 2024-04-26T05:48:58Z

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

In needVSETVLIPHI we know that the VLs will be the same if the AVL is the output VL of another vsetvli and they have the same VLMAX, so we just check that the VTYPEs are the same. But we don't need to check all the fields if they're not demanded.

This allows us to avoid a vsetvli in some cases where e.g. the VLMAX ratio is the same .

(We still need the VLMAXes to be the same to make sure the VLs are the same, but we could potentially relax this to allow a smaller VLMAX where there's no risk of truncation)

Full diff: https://github.com/llvm/llvm-project/pull/90168.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp (+14-8)
(modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll (-1)

diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index c40b9031543fe2..60210b1d6b92bc 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -808,8 +808,8 @@ class RISCVInsertVSETVLI : public MachineFunctionPass {
 private:
   bool needVSETVLI(const MachineInstr &MI, const VSETVLIInfo &Require,
                    const VSETVLIInfo &CurInfo) const;
-  bool needVSETVLIPHI(const VSETVLIInfo &Require,
-                      const MachineBasicBlock &MBB) const;
+  bool needVSETVLIPHI(const VSETVLIInfo &Require, const MachineBasicBlock &MBB,
+                      const DemandedFields &Used) const;
   void insertVSETVLI(MachineBasicBlock &MBB, MachineInstr &MI,
                      const VSETVLIInfo &Info, const VSETVLIInfo &PrevInfo);
   void insertVSETVLI(MachineBasicBlock &MBB,
@@ -1318,7 +1318,8 @@ void RISCVInsertVSETVLI::computeIncomingVLVTYPE(const MachineBasicBlock &MBB) {
 // be unneeded if the AVL is a phi node where all incoming values are VL
 // outputs from the last VSETVLI in their respective basic blocks.
 bool RISCVInsertVSETVLI::needVSETVLIPHI(const VSETVLIInfo &Require,
-                                        const MachineBasicBlock &MBB) const {
+                                        const MachineBasicBlock &MBB,
+                                        const DemandedFields &Used) const {
   if (DisableInsertVSETVLPHIOpt)
     return true;
 
@@ -1350,10 +1351,14 @@ bool RISCVInsertVSETVLI::needVSETVLIPHI(const VSETVLIInfo &Require,
     if (DefInfo != PBBExit)
       return true;
 
-    // Require has the same VL as PBBExit, so if the exit from the
-    // predecessor has the VTYPE we are looking for we might be able
-    // to avoid a VSETVLI.
-    if (PBBExit.isUnknown() || !PBBExit.hasSameVTYPE(Require))
+    // Make sure Require has the same VLMAX as the predecessor block,
+    // otherwise the VL might be different.
+    if (PBBExit.isUnknown() || !PBBExit.hasSameVLMAX(Require))
+      return true;
+
+    // If the exit from the predecessor has the VTYPE we are looking
+    // for we might be able to avoid a VSETVLI.
+    if (!PBBExit.hasCompatibleVTYPE(Used, Require))
       return true;
   }
 
@@ -1392,7 +1397,8 @@ void RISCVInsertVSETVLI::emitVSETVLIs(MachineBasicBlock &MBB) {
         // wouldn't be used and VL/VTYPE registers are correct.  Note that
         // we *do* need to model the state as if it changed as while the
         // register contents are unchanged, the abstract model can change.
-        if (!PrefixTransparent || needVSETVLIPHI(CurInfo, MBB))
+        if (!PrefixTransparent ||
+            needVSETVLIPHI(CurInfo, MBB, getDemanded(MI, MRI, ST)))
           insertVSETVLI(MBB, MI, CurInfo, PrevInfo);
         PrefixTransparent = false;
       }
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
index 4ff2fc7a5fff5d..f113f746dec297 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
@@ -494,7 +494,6 @@ define void @saxpy_vec_demanded_fields(i64 %n, float %a, ptr nocapture readonly
 ; CHECK-NEXT:    beqz a3, .LBB9_2
 ; CHECK-NEXT:  .LBB9_1: # %for.body
 ; CHECK-NEXT:    # =>This Inner Loop Header: Depth=1
-; CHECK-NEXT:    vsetvli zero, a3, e32, m8, ta, ma
 ; CHECK-NEXT:    vle32.v v8, (a1)
 ; CHECK-NEXT:    vle32.v v16, (a2)
 ; CHECK-NEXT:    slli a4, a3, 2

preames

This is unsound because the CurInfo state during rewriting will differ from the CurInfo computed during the forward data flow. The first instruction may not care, but later ones in the block (or following blocks!) may depend on the other fields.

There is a correct variant of this, but you end of with different states flowing through the block during emission, and the need to compensate. Not sure if it's worth the complexity?

lukel97 · 2024-05-22T06:54:25Z

This is unsound because the CurInfo state during rewriting will differ from the CurInfo computed during the forward data flow. The first instruction may not care, but later ones in the block (or following blocks!) may depend on the other fields.

There is a correct variant of this, but you end of with different states flowing through the block during emission, and the need to compensate. Not sure if it's worth the complexity?

Woops, agreeed I don't think its worth the complexity. The original motivation for this was to split out a change from #89089 needed to avoid a regression, but since RISCVInsertVSETVLI has changed plenty in the meantime I'll check if this is still actually needed first.

lukel97 · 2024-06-25T06:53:02Z

Closing as I think there should be a better way to avoid the saxpy_vec regression in #89089. I'm not sure what it is yet though

lukel97 requested review from BeMg, preames, topperc and wangpc-pp April 26, 2024 05:48

llvmbot added the backend:RISC-V label Apr 26, 2024

preames requested changes May 21, 2024

View reviewed changes

lukel97 closed this Jun 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RISCV] Check only demanded VTYPE fields in needVSETVLIPHI #90168

[RISCV] Check only demanded VTYPE fields in needVSETVLIPHI #90168

Uh oh!

lukel97 commented Apr 26, 2024

Uh oh!

llvmbot commented Apr 26, 2024

Uh oh!

preames left a comment

Uh oh!

lukel97 commented May 22, 2024

Uh oh!

lukel97 commented Jun 25, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[RISCV] Check only demanded VTYPE fields in needVSETVLIPHI #90168

[RISCV] Check only demanded VTYPE fields in needVSETVLIPHI #90168

Uh oh!

Conversation

lukel97 commented Apr 26, 2024

Uh oh!

llvmbot commented Apr 26, 2024

Uh oh!

preames left a comment

Choose a reason for hiding this comment

Uh oh!

lukel97 commented May 22, 2024

Uh oh!

lukel97 commented Jun 25, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants