-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[AArch64] Fix zero-register copying with zero-cycle moves #154362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Fix incorrect super-register lookup when copying from $wzr on subtargets that lack zero-cycle zeroing but support 64-bit zero-cycle moves. When copying from $wzr, we used the wrong register class to lookup the super-register, causing $w0 = COPY $wzr to get expanded as $x0 = ORRXrr $xzr, undef $noreg, implicit $wzr, rather than the correct $x0 = ORRXrr $xzr, undef $xzr, implicit $wzr.
|
@llvm/pr-subscribers-backend-aarch64 Author: David Tellenbach (dtellenbach) ChangesFix incorrect super-register lookup when copying from $wzr on subtargets that lack zero-cycle zeroing but support 64-bit zero-cycle moves. When copying from $wzr, we used the wrong register class to lookup the super-register, causing $w0 = COPY $wzr to get expanded as $x0 = ORRXrr $xzr, undef $noreg, implicit $wzr, rather than the correct $x0 = ORRXrr $xzr, undef $xzr, implicit $wzr. Full diff: https://github.com/llvm/llvm-project/pull/154362.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index c987a2950f152..d15f90deba74e 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -5085,8 +5085,13 @@ void AArch64InstrInfo::copyPhysReg(MachineBasicBlock &MBB,
// Cyclone recognizes "ORR Xd, XZR, Xm" as a zero-cycle register move.
MCRegister DestRegX = TRI->getMatchingSuperReg(
DestReg, AArch64::sub_32, &AArch64::GPR64spRegClass);
- MCRegister SrcRegX = TRI->getMatchingSuperReg(
- SrcReg, AArch64::sub_32, &AArch64::GPR64spRegClass);
+ assert(DestRegX.isValid() && "Destination super-reg not valid");
+ MCRegister SrcRegX =
+ SrcReg == AArch64::WZR
+ ? AArch64::XZR
+ : TRI->getMatchingSuperReg(SrcReg, AArch64::sub_32,
+ &AArch64::GPR64spRegClass);
+ assert(SrcRegX.isValid() && "Source super-reg not valid");
// This instruction is reading and writing X registers. This may upset
// the register scavenger and machine verifier, so we need to indicate
// that we are reading an undefined value from SrcRegX, but a proper
diff --git a/llvm/test/CodeGen/AArch64/arm64-copy-phys-zero-reg.mir b/llvm/test/CodeGen/AArch64/arm64-copy-phys-zero-reg.mir
new file mode 100644
index 0000000000000..76b5b76130657
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/arm64-copy-phys-zero-reg.mir
@@ -0,0 +1,109 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -o - -mtriple=arm64-apple-ios -run-pass=postrapseudos -simplify-mir -verify-machineinstrs -mattr="-zcm-gpr32,-zcm-gpr64,-zcz" %s \
+# RUN: | FileCheck --check-prefix=CHECK-NO-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ %s
+# RUN: llc -o - -mtriple=arm64-apple-ios -run-pass=postrapseudos -simplify-mir -verify-machineinstrs -mattr="+zcm-gpr32,-zcm-gpr64,-zcz" %s \
+# RUN: | FileCheck --check-prefix=CHECK-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ %s
+# RUN: llc -o - -mtriple=arm64-apple-ios -run-pass=postrapseudos -simplify-mir -verify-machineinstrs -mattr="-zcm-gpr32,+zcm-gpr64,-zcz" %s \
+# RUN: | FileCheck --check-prefix=CHECK-NO-ZCM-GPR32-ZCM-GPR64-NO-ZCZ %s
+# RUN: llc -o - -mtriple=arm64-apple-ios -run-pass=postrapseudos -simplify-mir -verify-machineinstrs -mattr="+zcm-gpr32,+zcm-gpr64,-zcz" %s \
+# RUN: | FileCheck --check-prefix=CHECK-ZCM-GPR32-ZCM-GPR64-NO-ZCZ %s
+# RUN: llc -o - -mtriple=arm64-apple-ios -run-pass=postrapseudos -simplify-mir -verify-machineinstrs -mattr="-zcm-gpr32,-zcm-gpr64,+zcz" %s \
+# RUN: | FileCheck --check-prefix=CHECK-NO-ZCM-ZCZ %s
+# RUN: llc -o - -mtriple=arm64-apple-ios -run-pass=postrapseudos -simplify-mir -verify-machineinstrs -mattr="+zcm-gpr32,+zcm-gpr64,+zcz" %s \
+# RUN: | FileCheck --check-prefix=CHECK-ZCM-ZCZ %s
+
+--- |
+ define void @f0(i64 noundef %x) { ret void }
+ define void @f1(i64 noundef %x) { ret void }
+ define void @f2(i32 noundef) { ret void }
+...
+---
+name: f0
+liveins:
+ - { reg: '$x0' }
+body: |
+ bb.0:
+ liveins: $x0, $lr
+ ; CHECK-NO-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-LABEL: name: f0
+ ; CHECK-NO-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ: liveins: $x0, $lr
+ ; CHECK-NO-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: {{ $}}
+ ; CHECK-NO-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: $w0 = ORRWrr $wzr, $wzr
+ ; CHECK-NO-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ ;
+ ; CHECK-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-LABEL: name: f0
+ ; CHECK-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ: liveins: $x0, $lr
+ ; CHECK-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: {{ $}}
+ ; CHECK-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: $w0 = ORRWrr $wzr, $wzr
+ ; CHECK-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ ;
+ ; CHECK-NO-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-LABEL: name: f0
+ ; CHECK-NO-ZCM-GPR32-ZCM-GPR64-NO-ZCZ: liveins: $x0, $lr
+ ; CHECK-NO-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: {{ $}}
+ ; CHECK-NO-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: $x0 = ORRXrr $xzr, undef $xzr, implicit $wzr
+ ; CHECK-NO-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ ;
+ ; CHECK-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-LABEL: name: f0
+ ; CHECK-ZCM-GPR32-ZCM-GPR64-NO-ZCZ: liveins: $x0, $lr
+ ; CHECK-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: {{ $}}
+ ; CHECK-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: $w0 = ORRWrr $wzr, $wzr
+ ; CHECK-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ ;
+ ; CHECK-NO-ZCM-ZCZ-LABEL: name: f0
+ ; CHECK-NO-ZCM-ZCZ: liveins: $x0, $lr
+ ; CHECK-NO-ZCM-ZCZ-NEXT: {{ $}}
+ ; CHECK-NO-ZCM-ZCZ-NEXT: $w0 = MOVZWi 0, 0
+ ; CHECK-NO-ZCM-ZCZ-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ ;
+ ; CHECK-ZCM-ZCZ-LABEL: name: f0
+ ; CHECK-ZCM-ZCZ: liveins: $x0, $lr
+ ; CHECK-ZCM-ZCZ-NEXT: {{ $}}
+ ; CHECK-ZCM-ZCZ-NEXT: $w0 = MOVZWi 0, 0
+ ; CHECK-ZCM-ZCZ-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ $w0 = COPY $wzr
+ BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+...
+---
+name: f1
+liveins:
+ - { reg: '$x0' }
+body: |
+ bb.0:
+ liveins: $x0, $lr
+ ; CHECK-NO-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-LABEL: name: f1
+ ; CHECK-NO-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ: liveins: $x0, $lr
+ ; CHECK-NO-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: {{ $}}
+ ; CHECK-NO-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: $x0 = ORRXrr $xzr, $xzr
+ ; CHECK-NO-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ ;
+ ; CHECK-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-LABEL: name: f1
+ ; CHECK-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ: liveins: $x0, $lr
+ ; CHECK-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: {{ $}}
+ ; CHECK-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: $x0 = ORRXrr $xzr, $xzr
+ ; CHECK-ZCM-GPR32-NO-ZCM-GPR64-NO-ZCZ-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ ;
+ ; CHECK-NO-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-LABEL: name: f1
+ ; CHECK-NO-ZCM-GPR32-ZCM-GPR64-NO-ZCZ: liveins: $x0, $lr
+ ; CHECK-NO-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: {{ $}}
+ ; CHECK-NO-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: $x0 = ORRXrr $xzr, $xzr
+ ; CHECK-NO-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ ;
+ ; CHECK-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-LABEL: name: f1
+ ; CHECK-ZCM-GPR32-ZCM-GPR64-NO-ZCZ: liveins: $x0, $lr
+ ; CHECK-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: {{ $}}
+ ; CHECK-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: $x0 = ORRXrr $xzr, $xzr
+ ; CHECK-ZCM-GPR32-ZCM-GPR64-NO-ZCZ-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ ;
+ ; CHECK-NO-ZCM-ZCZ-LABEL: name: f1
+ ; CHECK-NO-ZCM-ZCZ: liveins: $x0, $lr
+ ; CHECK-NO-ZCM-ZCZ-NEXT: {{ $}}
+ ; CHECK-NO-ZCM-ZCZ-NEXT: $x0 = MOVZXi 0, 0
+ ; CHECK-NO-ZCM-ZCZ-NEXT: BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ ;
+ ; CHECK-ZCM-ZCZ-LABEL: name: f1
+ ; CHECK-ZCM-ZCZ: liveins: $x0, $lr
+ ; CHECK-ZCM-ZCZ-NEXT: {{ $}}
+ ; CHECK-ZCM-ZCZ-NEXT: $x0 = MOVZXi 0, 0
+ ; CHECK-ZCM-ZCZ-NEXT:BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+ $x0 = COPY $xzr
+ BL @f2, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit $w0, implicit-def $sp, implicit-def $w0
+...
|
davemgreen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The undef seems odd and it could avoid the implicit, but SGTM as a fix.
tomershafir
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise LGTM
| @@ -0,0 +1,109 @@ | |||
| # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5 | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't the test miss 2 cases +zcm-gpr32,-zcm-gpr64,+zcz and -zcm-gpr32,+zcm-gpr64,+zcz?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but that's not exercising any other control-flow which because of +zcz.
| SrcReg, AArch64::sub_32, &AArch64::GPR64spRegClass); | ||
| assert(DestRegX.isValid() && "Destination super-reg not valid"); | ||
| MCRegister SrcRegX = | ||
| SrcReg == AArch64::WZR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add if (SrcReg == AArch64::WZR) and have 2 ways to build the MI to avoid undef and implicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would need to check if that's really okay or causes any problems. There is a comment by Tim Northover about using undef to help register allocation so we shouldn't blindly remove it.
Following Dave's reasoning here that this is good enough to fix the real issue.
Fix incorrect super-register lookup when copying from $wzr on subtargets
that lack zero-cycle zeroing but support 64-bit zero-cycle moves.
When copying from $wzr, we used the wrong register class to lookup the
super-register, causing
$w0 = COPY $wzr
to get expanded as
$x0 = ORRXrr $xzr, undef $noreg, implicit $wzr,
rather than the correct
$x0 = ORRXrr $xzr, undef $xzr, implicit $wzr.
(cherry-pick b51486f88a7f90719a1e744c828738e717ba5ffc)
Fix incorrect super-register lookup when copying from $wzr on subtargets that lack zero-cycle zeroing but support 64-bit zero-cycle moves.
When copying from $wzr, we used the wrong register class to lookup the super-register, causing
to get expanded as
rather than the correct