-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[AMDGPU] Disable dpp src1 sgpr on gfx11 #164241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-backend-amdgpu Author: Paul Trojahn (ptrojahn) Changes#67461 enabled SGPRs as src1 by default for all opcodes with manual checks for targets where this is not supported. #155595 disabled this check for hasDPPSrc1SGPR, which resulted in SGPRs being used as operands for src1 on gfx11. This PR reenables this check and fixes the lit test. Full diff: https://github.com/llvm/llvm-project/pull/164241.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp b/llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
index 464cbec6c46bc..e394096b9fe09 100644
--- a/llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNDPPCombine.cpp
@@ -307,10 +307,12 @@ MachineInstr *GCNDPPCombine::createDPPInst(MachineInstr &OrigMI,
if (Src1) {
assert(AMDGPU::hasNamedOperand(DPPOp, AMDGPU::OpName::src1) &&
"dpp version of instruction missing src1");
- // If subtarget does not support SGPRs for src1 operand then the
- // requirements are the same as for src0. We check src0 instead because
- // pseudos are shared between subtargets and allow SGPR for src1 on all.
if (!ST->hasDPPSrc1SGPR()) {
+ if (Src1->isReg() && ST->getRegisterInfo()->isSGPRClass(
+ MRI->getRegClass(Src1->getReg()))) {
+ Fail = true;
+ break;
+ }
assert(TII->getOpSize(*DPPInst, Src0Idx) ==
TII->getOpSize(*DPPInst, NumOperands) &&
"Src0 and Src1 operands should have the same size");
diff --git a/llvm/test/CodeGen/AMDGPU/dpp_combine_gfx11.mir b/llvm/test/CodeGen/AMDGPU/dpp_combine_gfx11.mir
index 3725384e885ee..87ec1eabc2033 100644
--- a/llvm/test/CodeGen/AMDGPU/dpp_combine_gfx11.mir
+++ b/llvm/test/CodeGen/AMDGPU/dpp_combine_gfx11.mir
@@ -1,6 +1,6 @@
-# RUN: llc -mtriple=amdgcn -mcpu=gfx1100 -mattr=-real-true16 -run-pass=gcn-dpp-combine -verify-machineinstrs -o - %s | FileCheck %s -check-prefixes=GCN
-# RUN: llc -mtriple=amdgcn -mcpu=gfx1150 -mattr=-real-true16 -run-pass=gcn-dpp-combine -verify-machineinstrs -o - %s | FileCheck %s -check-prefixes=GCN
-# RUN: llc -mtriple=amdgcn -mcpu=gfx1200 -mattr=-real-true16 -run-pass=gcn-dpp-combine -verify-machineinstrs -o - %s | FileCheck %s -check-prefixes=GCN
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1100 -mattr=-real-true16 -run-pass=gcn-dpp-combine -verify-machineinstrs -o - %s | FileCheck %s -check-prefixes=GCN,GFX1100
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1150 -mattr=-real-true16 -run-pass=gcn-dpp-combine -verify-machineinstrs -o - %s | FileCheck %s -check-prefixes=GCN,GFX1150
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1200 -mattr=-real-true16 -run-pass=gcn-dpp-combine -verify-machineinstrs -o - %s | FileCheck %s -check-prefixes=GCN,GFX1150
---
@@ -38,8 +38,10 @@ body: |
# GCN-LABEL: name: vop3_sgpr_src1
# GCN: %6:vgpr_32 = V_MED3_F32_e64_dpp %4, 0, %0, 0, %1, 0, %2, 0, 0, 1, 15, 15, 1, implicit $mode, implicit $exec
-# GCN: %8:vgpr_32 = V_MED3_F32_e64_dpp %4, 0, %0, 0, %2, 0, %1, 0, 0, 1, 15, 15, 1, implicit $mode, implicit $exec
-# GCN: %10:vgpr_32 = V_MED3_F32_e64_dpp %4, 0, %0, 0, %2, 0, %3, 0, 0, 1, 15, 15, 1, implicit $mode, implicit $exec
+# GFX1100: %8:vgpr_32 = V_MED3_F32_e64 0, %7, 0, %2, 0, %1, 0, 0, implicit $mode, implicit $exec
+# GFX1150: %8:vgpr_32 = V_MED3_F32_e64_dpp %4, 0, %0, 0, %2, 0, %1, 0, 0, 1, 15, 15, 1, implicit $mode, implicit $exec
+# GFX1100: %10:vgpr_32 = V_MED3_F32_e64 0, %9, 0, %2, 0, %3, 0, 0, implicit $mode, implicit $exec
+# GFX1150: %10:vgpr_32 = V_MED3_F32_e64_dpp %4, 0, %0, 0, %2, 0, %3, 0, 0, 1, 15, 15, 1, implicit $mode, implicit $exec
# GCN: %12:vgpr_32 = V_MED3_F32_e64_dpp %4, 0, %0, 0, 42, 0, %2, 0, 0, 1, 15, 15, 1, implicit $mode, implicit $exec
# GCN: %14:vgpr_32 = V_MED3_F32_e64 0, %13, 0, 4242, 0, %2, 0, 0, implicit $mode, implicit $exec
name: vop3_sgpr_src1
|
llvm#67461 enabled SGPRs as src1 by default for all opcodes with manual checks for targets where this is not supported. llvm#155595 disabled this check for hasDPPSrc1SGPR, which resulted in SGPRs being used as operands on gfx11. This PR reenables this check and fixes the corresponding lit test.
5fbc114 to
33d12e6
Compare
arsenm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you include an IR test that hits this problem? I also don't see this case handled in isOperandLegal
| // If subtarget does not support SGPRs for src1 operand then the | ||
| // requirements are the same as for src0. We check src0 instead because | ||
| // pseudos are shared between subtargets and allow SGPR for src1 on all. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preserve this comment? Also, can we start using RegClassByHwMode to avoid this problem?
#67461 enabled SGPRs as src1 by default for all dpp opcodes with manual checks for targets where this is not supported. In that case, isOperandLegal checked if the second operand is legal as src0. #155595 disabled this check by removing the calls to isOperandLegal, which resulted in SGPRs being used as operands for src1 on gfx11. This PR reenables this check and fixes the lit test.