[AArch64] Removed redundant FMOV instruction for truncstores of f64/f32 via bitcast to i64/i32/i8. #149997
Thank you for submitting a Pull Request (PR) to the LLVM Project!
This PR will be automatically labeled and the relevant teams will be notified.
If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository, in which case you can instead tag reviewers by name in a comment.
If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment "Ping". The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.
If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.
@llvm/pr-subscribers-backend-aarch64
Author: Amina Chabane (Amichaxx)
Changes
Previously, storing the low bits of a double, which was bitcast to i64 and truncated to i32 or i16, would emit a redundant FMOV. This patch introduces new TableGen patterns to avoid the unnecessary FMOV.
Tests added: bitcast_truncstore.ll
Full diff: https://github.com/llvm/llvm-project/pull/149997.diff
2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 0cb7b02d84a6e..aa635b188da70 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -4649,6 +4649,14 @@ let Predicates = [IsLE] in {
(STRQui FPR128:$Rt, GPR64sp:$Rn, uimm12s16:$offset)>;
}
+// truncstorei32 of f64 bitcasted to i64
+def : Pat<(truncstorei32 (i64 (bitconvert (f64 FPR64:$Rt))), (am_indexed32 GPR64sp:$Rn, uimm12s4:$offset)),
+ (STRSui (EXTRACT_SUBREG FPR64:$Rt, ssub), GPR64sp:$Rn, uimm12s4:$offset)>;
+
+// truncstorei16 of f64 bitcasted to i64
+def : Pat<(truncstorei16 (i64 (bitconvert (f64 FPR64:$Rt))), (am_indexed16 GPR64sp:$Rn, uimm12s2:$offset)),
+ (STRHui (f16 (EXTRACT_SUBREG FPR64:$Rt, hsub)), GPR64sp:$Rn, uimm12s2:$offset)>;
+
// truncstore i64
def : Pat<(truncstorei32 GPR64:$Rt,
(am_indexed32 GPR64sp:$Rn, uimm12s4:$offset)),
diff --git a/llvm/test/CodeGen/AArch64/bitcast_truncstore.ll b/llvm/test/CodeGen/AArch64/bitcast_truncstore.ll
new file mode 100644
index 0000000000000..8e0d0c2158090
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/bitcast_truncstore.ll
@@ -0,0 +1,26 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=aarch64-linux-gnu -o - %s | FileCheck %s
+
+define void @_Z10store_i64_from_f64Pjd(ptr %n, double noundef %x){
+; CHECK-LABEL: _Z10store_i64_from_f64Pjd:
+; CHECK: // %bb.0: // %entry
+; CHECK-NEXT: str s0, [x0]
+; CHECK-NEXT: ret
+entry:
+ %0 = bitcast double %x to i64
+ %conv = trunc i64 %0 to i32
+ store i32 %conv, ptr %n, align 4
+ ret void
+}
+
+define void @_Z9store_i16Ptd(ptr %n, double noundef %x) {
+; CHECK-LABEL: _Z9store_i16Ptd:
+; CHECK: // %bb.0: // %entry
+; CHECK-NEXT: str h0, [x0]
+; CHECK-NEXT: ret
+entry:
+ %0 = bitcast double %x to i64
+ %conv = trunc i64 %0 to i16
+ store i16 %conv, ptr %n, align 2
+ ret void
+}
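For readers less familiar with the IR in the two tests, the following is a minimal C++ sketch of source code that expresses the same operation: reinterpret a double as a 64-bit integer, truncate it, and store the result. The function names and the `bits_of` helper are hypothetical and are not taken from the patch, and the exact IR a compiler produces from this source may differ from the test file. The sketch assumes an 8-byte IEEE 754 `double`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// Hypothetical helper (not from the patch): the C++ counterpart of
// `bitcast double %x to i64`.
static std::uint64_t bits_of(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}

// Counterpart of the first test: trunc i64 to i32, then store i32.
void store_low32(std::uint32_t *n, double x) {
    *n = static_cast<std::uint32_t>(bits_of(x));
}

// Counterpart of the second test: trunc i64 to i16, then store i16.
void store_low16(std::uint16_t *n, double x) {
    *n = static_cast<std::uint16_t>(bits_of(x));
}

// 0.1 is 0x3FB999999999999A as an IEEE 754 double, so the low 32 bits
// are 0x9999999A and the low 16 bits are 0x999A.
void self_check() {
    std::uint32_t w = 0;
    std::uint16_t h = 0;
    store_low32(&w, 0.1);
    store_low16(&h, 0.1);
    assert(w == 0x9999999Au);
    assert(h == 0x999Au);
}
```

The truncation keeps only the low bits of the value, which are the bits held in the `s0` (32-bit) or `h0` (16-bit) view of `d0`. That is why, per the CHECK lines above, the store can be issued directly from the FP register without first moving the value to a general-purpose register.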
Can we add patterns and tests for the other types, similar to #146920?
Force-pushed from f0b3edf to aec8bb3
… (f64/f32 → i32/i16/i8X)
Force-pushed from aec8bb3 to 97f61b4
CarolineConcatto
left a comment
Thank you Amina,
The patch looks good.
Can you align the stores before merging the patch, please?
@davemgreen Hi, just wondering if the changes look okay to you? Thanks.
davemgreen
left a comment
Yep, LGTM. Do you want me to hit submit?
Yes please :)
@Amichaxx Congratulations on having your first Pull Request (PR) merged into the LLVM Project!
Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.
Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.
How to do this, and the rest of the post-merge process, is covered in detail here.
If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.
If you don't get any reports, no action is required from you. Your changes are working as expected, well done!