Skip to content

Conversation

@lukel97
Copy link
Contributor

@lukel97 lukel97 commented Apr 26, 2024

In needVSETVLIPHI we know that the VLs will be the same if the AVL is the output VL of another vsetvli and they have the same VLMAX, so we just check that the VTYPEs are the same. But we don't need to check all the fields if they're not demanded.

This allows us to avoid a vsetvli in some cases where e.g. the VLMAX ratio is the same .

(We still need the VLMAXes to be the same to make sure the VLs are the same, but we could potentially relax this to allow a smaller VLMAX where there's no risk of truncation)

In needVSETVLIPHI we know that the VLs will be the same if the AVL is the output VL of another vsetvli and they have the same VLMAX, so we just check that the VTYPEs are the same. But we don't need to check all the fields if they're not demanded.

This allows us to avoid a vsetvli in some cases where e.g. the VLMAX ratio is the same .

(We still need the VLMAXes to be the same to make sure the VLs are the same, but we could potentially relax this to allow a smaller VLMAX where there's no risk of truncation)
@llvmbot
Copy link
Member

llvmbot commented Apr 26, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

In needVSETVLIPHI we know that the VLs will be the same if the AVL is the output VL of another vsetvli and they have the same VLMAX, so we just check that the VTYPEs are the same. But we don't need to check all the fields if they're not demanded.

This allows us to avoid a vsetvli in some cases where e.g. the VLMAX ratio is the same .

(We still need the VLMAXes to be the same to make sure the VLs are the same, but we could potentially relax this to allow a smaller VLMAX where there's no risk of truncation)


Full diff: https://github.com/llvm/llvm-project/pull/90168.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp (+14-8)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll (-1)
diff --git a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
index c40b9031543fe2..60210b1d6b92bc 100644
--- a/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
@@ -808,8 +808,8 @@ class RISCVInsertVSETVLI : public MachineFunctionPass {
 private:
   bool needVSETVLI(const MachineInstr &MI, const VSETVLIInfo &Require,
                    const VSETVLIInfo &CurInfo) const;
-  bool needVSETVLIPHI(const VSETVLIInfo &Require,
-                      const MachineBasicBlock &MBB) const;
+  bool needVSETVLIPHI(const VSETVLIInfo &Require, const MachineBasicBlock &MBB,
+                      const DemandedFields &Used) const;
   void insertVSETVLI(MachineBasicBlock &MBB, MachineInstr &MI,
                      const VSETVLIInfo &Info, const VSETVLIInfo &PrevInfo);
   void insertVSETVLI(MachineBasicBlock &MBB,
@@ -1318,7 +1318,8 @@ void RISCVInsertVSETVLI::computeIncomingVLVTYPE(const MachineBasicBlock &MBB) {
 // be unneeded if the AVL is a phi node where all incoming values are VL
 // outputs from the last VSETVLI in their respective basic blocks.
 bool RISCVInsertVSETVLI::needVSETVLIPHI(const VSETVLIInfo &Require,
-                                        const MachineBasicBlock &MBB) const {
+                                        const MachineBasicBlock &MBB,
+                                        const DemandedFields &Used) const {
   if (DisableInsertVSETVLPHIOpt)
     return true;
 
@@ -1350,10 +1351,14 @@ bool RISCVInsertVSETVLI::needVSETVLIPHI(const VSETVLIInfo &Require,
     if (DefInfo != PBBExit)
       return true;
 
-    // Require has the same VL as PBBExit, so if the exit from the
-    // predecessor has the VTYPE we are looking for we might be able
-    // to avoid a VSETVLI.
-    if (PBBExit.isUnknown() || !PBBExit.hasSameVTYPE(Require))
+    // Make sure Require has the same VLMAX as the predecessor block,
+    // otherwise the VL might be different.
+    if (PBBExit.isUnknown() || !PBBExit.hasSameVLMAX(Require))
+      return true;
+
+    // If the exit from the predecessor has the VTYPE we are looking
+    // for we might be able to avoid a VSETVLI.
+    if (!PBBExit.hasCompatibleVTYPE(Used, Require))
       return true;
   }
 
@@ -1392,7 +1397,8 @@ void RISCVInsertVSETVLI::emitVSETVLIs(MachineBasicBlock &MBB) {
         // wouldn't be used and VL/VTYPE registers are correct.  Note that
         // we *do* need to model the state as if it changed as while the
         // register contents are unchanged, the abstract model can change.
-        if (!PrefixTransparent || needVSETVLIPHI(CurInfo, MBB))
+        if (!PrefixTransparent ||
+            needVSETVLIPHI(CurInfo, MBB, getDemanded(MI, MRI, ST)))
           insertVSETVLI(MBB, MI, CurInfo, PrevInfo);
         PrefixTransparent = false;
       }
diff --git a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
index 4ff2fc7a5fff5d..f113f746dec297 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vsetvli-insert-crossbb.ll
@@ -494,7 +494,6 @@ define void @saxpy_vec_demanded_fields(i64 %n, float %a, ptr nocapture readonly
 ; CHECK-NEXT:    beqz a3, .LBB9_2
 ; CHECK-NEXT:  .LBB9_1: # %for.body
 ; CHECK-NEXT:    # =>This Inner Loop Header: Depth=1
-; CHECK-NEXT:    vsetvli zero, a3, e32, m8, ta, ma
 ; CHECK-NEXT:    vle32.v v8, (a1)
 ; CHECK-NEXT:    vle32.v v16, (a2)
 ; CHECK-NEXT:    slli a4, a3, 2

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unsound because the CurInfo state during rewriting will differ from the CurInfo computed during the forward data flow. The first instruction may not care, but later ones in the block (or following blocks!) may depend on the other fields.

There is a correct variant of this, but you end of with different states flowing through the block during emission, and the need to compensate. Not sure if it's worth the complexity?

@lukel97
Copy link
Contributor Author

lukel97 commented May 22, 2024

This is unsound because the CurInfo state during rewriting will differ from the CurInfo computed during the forward data flow. The first instruction may not care, but later ones in the block (or following blocks!) may depend on the other fields.

There is a correct variant of this, but you end of with different states flowing through the block during emission, and the need to compensate. Not sure if it's worth the complexity?

Woops, agreeed I don't think its worth the complexity. The original motivation for this was to split out a change from #89089 needed to avoid a regression, but since RISCVInsertVSETVLI has changed plenty in the meantime I'll check if this is still actually needed first.

@lukel97
Copy link
Contributor Author

lukel97 commented Jun 25, 2024

Closing as I think there should be a better way to avoid the saxpy_vec regression in #89089. I'm not sure what it is yet though

@lukel97 lukel97 closed this Jun 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants