@arsenm arsenm commented Aug 11, 2025

Handle a special case for copies from AGPR to VGPR on the MFMA inputs.
If the "input" is really a subregister def, we will not see the
usual copy to VGPR for src2, only the read of the subregister def.
Not sure if this pattern appears in practice.
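
As a rough illustration of the distinction the description relies on, here is a minimal standalone sketch. It does not use LLVM. `ToyOperand`, `ToyInstr`, `useOperandIndices`, `readingOperandIndices`, and `makeSubRegDefMFMA` are hypothetical names invented for this example, and `readsReg()` here is a simplified model of `MachineOperand::readsReg()` that leaves out internal reads. The point is only that a subregister def without an undef flag reads the register even though it is not a use operand, so a walk over use operands misses it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy stand-in for a machine operand; not the LLVM class.
struct ToyOperand {
  unsigned Reg;   // virtual register number
  bool IsDef;     // true for a def, false for a use
  bool HasSubReg; // operand names a subregister, e.g. %3.sub0_sub1
  bool IsUndef;   // undef flag: the other lanes are not read

  // Simplified model: a plain use reads the register, and so does a
  // subregister def without undef, because the lanes it does not
  // write are carried over from the previous value.
  bool readsReg() const { return !IsUndef && (!IsDef || HasSubReg); }
};

struct ToyInstr {
  std::string Name;
  std::vector<ToyOperand> Ops;
};

// Walk in the style of the old loop: only use operands of Reg are seen.
std::vector<std::size_t> useOperandIndices(const ToyInstr &MI, unsigned Reg) {
  std::vector<std::size_t> Out;
  for (std::size_t I = 0; I < MI.Ops.size(); ++I)
    if (MI.Ops[I].Reg == Reg && !MI.Ops[I].IsDef)
      Out.push_back(I);
  return Out;
}

// Walk in the style of the new loop: every operand of Reg that reads it.
std::vector<std::size_t> readingOperandIndices(const ToyInstr &MI,
                                               unsigned Reg) {
  std::vector<std::size_t> Out;
  for (std::size_t I = 0; I < MI.Ops.size(); ++I)
    if (MI.Ops[I].Reg == Reg && MI.Ops[I].readsReg())
      Out.push_back(I);
  return Out;
}

// Shape loosely modeled on the test: %3.sub0_sub1 = MFMA %1, %2, ...
// where %3 is the copy destination and appears only as a subregister def.
ToyInstr makeSubRegDefMFMA() {
  return ToyInstr{"MFMA",
                  {{3, /*IsDef=*/true, /*HasSubReg=*/true, /*IsUndef=*/false},
                   {1, /*IsDef=*/false, /*HasSubReg=*/false, /*IsUndef=*/false},
                   {2, /*IsDef=*/false, /*HasSubReg=*/false, /*IsUndef=*/false}}};
}
```

In this toy model, `useOperandIndices(makeSubRegDefMFMA(), 3)` is empty while `readingOperandIndices(makeSubRegDefMFMA(), 3)` contains operand 0, which is the kind of operand the change starts considering.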

llvmbot commented Aug 11, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Full diff: https://github.com/llvm/llvm-project/pull/153023.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp (+6-5)
  • (modified) llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-copy-from.mir (+2-2)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp b/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
index b71c70db5e6b3..4e0d64a20690e 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
@@ -375,13 +375,14 @@ bool AMDGPURewriteAGPRCopyMFMAImpl::tryFoldCopiesFromAGPR(
     Register CopyDstReg = UseMI.getOperand(0).getReg();
     if (!CopyDstReg.isVirtual())
       continue;
+    for (MachineOperand &CopyUseMO : MRI.reg_nodbg_operands(CopyDstReg)) {
+      if (!CopyUseMO.readsReg())
+        continue;
 
-    for (MachineInstr &CopyUseMI : MRI.use_instructions(CopyDstReg)) {
+      MachineInstr &CopyUseMI = *CopyUseMO.getParent();
       if (isRewriteCandidate(CopyUseMI)) {
-        const MachineOperand *Op =
-            CopyUseMI.findRegisterUseOperand(CopyDstReg, /*TRI=*/nullptr);
-        if (tryReassigningMFMAChain(CopyUseMI, Op->getOperandNo(),
-                                    VRM.getPhys(Op->getReg())))
+        if (tryReassigningMFMAChain(CopyUseMI, CopyUseMO.getOperandNo(),
+                                    VRM.getPhys(CopyUseMO.getReg())))
           MadeChange = true;
       }
     }
diff --git a/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-copy-from.mir b/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-copy-from.mir
index 632401b6128c5..17a72110767bb 100644
--- a/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-copy-from.mir
+++ b/llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-copy-from.mir
@@ -187,8 +187,8 @@ body:             |
     ; CHECK-NEXT: [[COPY1:%[0-9]+]]:av_64_align2 = COPY $vgpr0_vgpr1
     ; CHECK-NEXT: [[COPY2:%[0-9]+]]:av_64_align2 = COPY $vgpr2_vgpr3
     ; CHECK-NEXT: [[GLOBAL_LOAD_DWORDX4_:%[0-9]+]]:areg_128_align2 = GLOBAL_LOAD_DWORDX4 [[COPY]], 0, 0, implicit $exec :: (load (s128), addrspace 1)
-    ; CHECK-NEXT: [[COPY3:%[0-9]+]]:vreg_128_align2 = COPY [[GLOBAL_LOAD_DWORDX4_]]
-    ; CHECK-NEXT: [[COPY3:%[0-9]+]].sub0_sub1:vreg_128_align2 = V_MFMA_F64_4X4X4F64_vgprcd_e64 [[COPY1]], [[COPY2]], 0, 0, 0, 0, implicit $mode, implicit $exec
+    ; CHECK-NEXT: [[COPY3:%[0-9]+]]:areg_128_align2 = COPY [[GLOBAL_LOAD_DWORDX4_]]
+    ; CHECK-NEXT: [[COPY3:%[0-9]+]].sub0_sub1:areg_128_align2 = V_MFMA_F64_4X4X4F64_e64 [[COPY1]], [[COPY2]], 0, 0, 0, 0, implicit $mode, implicit $exec
     ; CHECK-NEXT: GLOBAL_STORE_DWORDX4 [[COPY]], [[COPY3]], 0, 0, implicit $exec :: (store (s128), addrspace 1)
     ; CHECK-NEXT: SI_RETURN
     %0:vreg_64_align2 = COPY $vgpr4_vgpr5

@perlfu perlfu left a comment

LGTM

@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-subreg-def-read-mfma-copy-from-agpr branch from 3bd59fe to c73ac5e Compare August 18, 2025 15:31
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-mfma-copy-from-agpr branch from 5d234cc to 002114a Compare August 18, 2025 15:31
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-subreg-def-read-mfma-copy-from-agpr branch from c73ac5e to 90d2381 Compare August 20, 2025 23:23
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-mfma-copy-from-agpr branch from 002114a to 87bc565 Compare August 20, 2025 23:23
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-subreg-def-read-mfma-copy-from-agpr branch from 90d2381 to 22d2495 Compare August 21, 2025 00:11
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-mfma-copy-from-agpr branch from 87bc565 to 8a87d16 Compare August 21, 2025 00:11
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-subreg-def-read-mfma-copy-from-agpr branch from 22d2495 to 8735fbf Compare August 21, 2025 00:42
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-mfma-copy-from-agpr branch 2 times, most recently from 2a2778f to f2932c5 Compare August 21, 2025 01:41
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-subreg-def-read-mfma-copy-from-agpr branch from 8735fbf to db5f240 Compare August 21, 2025 01:41
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-mfma-copy-from-agpr branch from f2932c5 to 5d8dc9b Compare August 21, 2025 13:43
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-subreg-def-read-mfma-copy-from-agpr branch 2 times, most recently from 579e971 to be46142 Compare August 28, 2025 04:15
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-mfma-copy-from-agpr branch from 5d8dc9b to 968135b Compare August 28, 2025 04:15
Base automatically changed from users/arsenm/amdgpu/handle-mfma-copy-from-agpr to main September 3, 2025 05:12
Previously we handled the inverse situation only.
Handle a special case for copies from AGPR to VGPR on the MFMA inputs.
If the "input" is really a subregister def, we will not see the
usual copy to VGPR for src2, only the read of the subregister def.
Not sure if this pattern appears in practice.
@arsenm arsenm force-pushed the users/arsenm/amdgpu/handle-subreg-def-read-mfma-copy-from-agpr branch from be46142 to d7f037a Compare September 3, 2025 06:40
@arsenm arsenm enabled auto-merge (squash) September 3, 2025 06:41
@arsenm arsenm disabled auto-merge September 3, 2025 07:21
@arsenm arsenm merged commit da8f692 into main Sep 3, 2025
8 of 9 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu/handle-subreg-def-read-mfma-copy-from-agpr branch September 3, 2025 07:21