Skip to content

Conversation

@broxigarchen
Copy link
Contributor

@broxigarchen broxigarchen commented Oct 6, 2025

This is a follow up patch from #161764

This

  1. revert an error in previous patch that the v_mov_b16_t16_e32 should still select idx 1 while only e64 should select idx 2 (this is cosmetic since this error won't be triggered, but still it's an error)
  2. revert the utility function added in previous patch
  3. added a src modifier check in isFoldCopyable check for v_mov_b16_t16_e64 (we never set the src mod so far, but it's a correct change)

@broxigarchen broxigarchen marked this pull request as ready for review October 6, 2025 15:00
@broxigarchen broxigarchen requested review from Sisyph and arsenm October 6, 2025 15:00
@llvmbot
Copy link
Member

llvmbot commented Oct 6, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Brox Chen (broxigarchen)

Changes

This is a follow up patch from #161764

This

  1. fix a error that the v_mov_b16_t16_e32 should still select idx 0
  2. remove the utility function added in previous patch
  3. added a src modifier check in isFoldCopyable check for v_mov_b16_t16_e64

Full diff: https://github.com/llvm/llvm-project/pull/162101.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIFoldOperands.cpp (+1-1)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+2-27)
  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (-1)
diff --git a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
index 90c828ba8dfab..5d34cdbbf20c5 100644
--- a/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+++ b/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
@@ -932,7 +932,7 @@ static MachineOperand *lookUpCopyChain(const SIInstrInfo &TII,
   for (MachineInstr *SubDef = MRI.getVRegDef(SrcReg);
        SubDef && TII.isFoldableCopy(*SubDef);
        SubDef = MRI.getVRegDef(Sub->getReg())) {
-    unsigned SrcIdx = TII.getFoldableCopySrcIdx(*SubDef);
+    const int SrcIdx = MovOp == AMDGPU::V_MOV_B16_t16_e64 ? 2 : 1;
     MachineOperand &SrcOp = SubDef->getOperand(SrcIdx);
 
     if (SrcOp.isImm())
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 46757cf5fe90c..b411648af2255 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -3411,7 +3411,6 @@ void SIInstrInfo::insertSelect(MachineBasicBlock &MBB,
 bool SIInstrInfo::isFoldableCopy(const MachineInstr &MI) {
   switch (MI.getOpcode()) {
   case AMDGPU::V_MOV_B16_t16_e32:
-  case AMDGPU::V_MOV_B16_t16_e64:
   case AMDGPU::V_MOV_B32_e32:
   case AMDGPU::V_MOV_B32_e64:
   case AMDGPU::V_MOV_B64_PSEUDO:
@@ -3428,34 +3427,10 @@ bool SIInstrInfo::isFoldableCopy(const MachineInstr &MI) {
   case AMDGPU::AV_MOV_B32_IMM_PSEUDO:
   case AMDGPU::AV_MOV_B64_IMM_PSEUDO:
     return true;
-  default:
-    return false;
-  }
-}
-
-unsigned SIInstrInfo::getFoldableCopySrcIdx(const MachineInstr &MI) {
-  switch (MI.getOpcode()) {
-  case AMDGPU::V_MOV_B16_t16_e32:
   case AMDGPU::V_MOV_B16_t16_e64:
-    return 2;
-  case AMDGPU::V_MOV_B32_e32:
-  case AMDGPU::V_MOV_B32_e64:
-  case AMDGPU::V_MOV_B64_PSEUDO:
-  case AMDGPU::V_MOV_B64_e32:
-  case AMDGPU::V_MOV_B64_e64:
-  case AMDGPU::S_MOV_B32:
-  case AMDGPU::S_MOV_B64:
-  case AMDGPU::S_MOV_B64_IMM_PSEUDO:
-  case AMDGPU::COPY:
-  case AMDGPU::WWM_COPY:
-  case AMDGPU::V_ACCVGPR_WRITE_B32_e64:
-  case AMDGPU::V_ACCVGPR_READ_B32_e64:
-  case AMDGPU::V_ACCVGPR_MOV_B32:
-  case AMDGPU::AV_MOV_B32_IMM_PSEUDO:
-  case AMDGPU::AV_MOV_B64_IMM_PSEUDO:
-    return 1;
+    return !TII->hasAnyModifiersSet(MI);
   default:
-    llvm_unreachable("MI is not a foldable copy");
+    return false;
   }
 }
 
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index cc59acf1ebd94..a21089f8e0fcc 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -417,7 +417,6 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
                                   const MachineInstr &MIb) const override;
 
   static bool isFoldableCopy(const MachineInstr &MI);
-  static unsigned getFoldableCopySrcIdx(const MachineInstr &MI);
 
   void removeModOperands(MachineInstr &MI) const;
 

@broxigarchen broxigarchen force-pushed the main-fix-true16-si-fold-1 branch 2 times, most recently from 1eb464f to 37ab639 Compare October 6, 2025 20:43
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test changes

@broxigarchen
Copy link
Contributor Author

broxigarchen commented Oct 7, 2025

Missing test changes

This patch is reverting some of the changes in #161764 so I think we don't need extra test here? The ir test added in the previous patch should cover it already

@broxigarchen
Copy link
Contributor Author

ping!

SubDef && TII.isFoldableCopy(*SubDef);
SubDef = MRI.getVRegDef(Sub->getReg())) {
unsigned SrcIdx = TII.getFoldableCopySrcIdx(*SubDef);
const int SrcIdx =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some unit-tests to make sure this is working as expected ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but according to point 1 in the description there was an error which is apparently not covered

Copy link
Contributor Author

@broxigarchen broxigarchen Oct 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick check, actually at this stage we always create a e64 inst, and actually cannot create a e32 inst since the dest reg of v_mov_b16_t16_e32 is not allocatable. Thus the point 1 in the description is more cosmetic

@broxigarchen broxigarchen requested a review from arsenm October 8, 2025 16:31
@@ -3990,8 +3966,8 @@ static bool getFoldableImm(Register Reg, const MachineRegisterInfo &MRI,
return false;
}

static bool getFoldableImm(const MachineOperand *MO, int64_t &Imm,
MachineInstr **DefMI = nullptr) {
bool SIInstrInfo::getFoldableImm(const MachineOperand *MO, int64_t &Imm,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can make hasAnyModifiers static instead

Copy link
Contributor Author

@broxigarchen broxigarchen Oct 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will leads to a chain of functions to be converted to static so I decided to do the opposite way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Matt, it's better to make more methods static if that works.

Copy link
Contributor Author

@broxigarchen broxigarchen Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Just updated to the other way of converting more static function

@broxigarchen broxigarchen force-pushed the main-fix-true16-si-fold-1 branch from 37ab639 to 63780c2 Compare October 14, 2025 18:38
@broxigarchen broxigarchen requested a review from arsenm October 14, 2025 18:42
@broxigarchen
Copy link
Contributor Author

ping!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants