Skip to content

Conversation

@MacDue
Copy link
Member

@MacDue MacDue commented Sep 16, 2024

On some targets, an FP libcall with argument types such as long double
will be lowered to pass arguments indirectly via pointers. When this is
the case we should not mark the libcall with "int" TBAA as it may lead
to incorrect optimizations.

Currently, this can be seen for long doubles on x86_64-w64-mingw32. The
load x86_fp80 after the call is (incorrectly) marked with "int" TBAA
(overwriting the previous metadata for "long double").

Nothing seems to break due to this currently as the metadata is being
incorrectly placed on the load and not the call. But if the metadata
is moved to the call (which this patch ensures), LLVM will optimize out
the setup for the arguments.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Sep 16, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 16, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Benjamin Maxwell (MacDue)

Changes

On some targets, an FP libcall with argument types such as long double
will be lowered to pass arguments indirectly via pointers. When this is
the case we should not mark the libcall with "int" TBAA as it may lead
to incorrect optimizations.

Currently, this can be seen for long doubles on x86_64-w64-mingw32. The
load x86_fp80 after the call is (incorrectly) marked with "int" TBAA
(overwriting the previous metadata for "long double").

Nothing seems to break due to this currently as the metadata is being
incorrectly placed on the load and not the call. But if the metadata
is moved to the call (which this patch ensures), LLVM will optimize out
the setup for the arguments.


Full diff: https://github.com/llvm/llvm-project/pull/108853.diff

4 Files Affected:

  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+19-5)
  • (modified) clang/lib/CodeGen/CGExpr.cpp (+5-1)
  • (modified) clang/lib/CodeGen/CodeGenFunction.h (+2-1)
  • (added) clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c (+36)
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 27abeba92999b3..5730e7867a648f 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -690,23 +690,37 @@ static RValue emitLibraryCall(CodeGenFunction &CGF, const FunctionDecl *FD,
                               const CallExpr *E, llvm::Constant *calleeValue) {
   CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, E);
   CGCallee callee = CGCallee::forDirect(calleeValue, GlobalDecl(FD));
+  llvm::CallBase *callOrInvoke = nullptr;
+  CGFunctionInfo const *FnInfo = nullptr;
   RValue Call =
-      CGF.EmitCall(E->getCallee()->getType(), callee, E, ReturnValueSlot());
+      CGF.EmitCall(E->getCallee()->getType(), callee, E, ReturnValueSlot(),
+                   /*Chain=*/nullptr, &callOrInvoke, &FnInfo);
 
   if (unsigned BuiltinID = FD->getBuiltinID()) {
     // Check whether a FP math builtin function, such as BI__builtin_expf
     ASTContext &Context = CGF.getContext();
     bool ConstWithoutErrnoAndExceptions =
         Context.BuiltinInfo.isConstWithoutErrnoAndExceptions(BuiltinID);
+
+    // Before annotating this libcall with "int" TBAA metadata check all
+    // arguments are passed directly. On some targets, types such as "long
+    // double" are passed indirectly via a pointer, and annotating the call with
+    // "int" TBAA metadata will lead to set up for those arguments being
+    // incorrectly optimized out.
+    bool AllArgumentsPassedDirectly =
+        llvm::all_of(FnInfo->arguments(), [&](auto const &ArgInfo) {
+          return ArgInfo.info.isDirect() || ArgInfo.info.isExtend();
+        });
+
     // Restrict to target with errno, for example, MacOS doesn't set errno.
     // TODO: Support builtin function with complex type returned, eg: cacosh
-    if (ConstWithoutErrnoAndExceptions && CGF.CGM.getLangOpts().MathErrno &&
-        !CGF.Builder.getIsFPConstrained() && Call.isScalar()) {
+    if (AllArgumentsPassedDirectly && ConstWithoutErrnoAndExceptions &&
+        CGF.CGM.getLangOpts().MathErrno && !CGF.Builder.getIsFPConstrained() &&
+        Call.isScalar()) {
       // Emit "int" TBAA metadata on FP math libcalls.
       clang::QualType IntTy = Context.IntTy;
       TBAAAccessInfo TBAAInfo = CGF.CGM.getTBAAAccessInfo(IntTy);
-      Instruction *Inst = cast<llvm::Instruction>(Call.getScalarVal());
-      CGF.CGM.DecorateInstructionWithTBAA(Inst, TBAAInfo);
+      CGF.CGM.DecorateInstructionWithTBAA(callOrInvoke, TBAAInfo);
     }
   }
   return Call;
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 35b5daaf6d4b55..9166db4c74128c 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -5932,7 +5932,8 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType,
                                  const CGCallee &OrigCallee, const CallExpr *E,
                                  ReturnValueSlot ReturnValue,
                                  llvm::Value *Chain,
-                                 llvm::CallBase **CallOrInvoke) {
+                                 llvm::CallBase **CallOrInvoke,
+                                 CGFunctionInfo const **ResolvedFnInfo) {
   // Get the actual function type. The callee type will always be a pointer to
   // function type or a block pointer type.
   assert(CalleeType->isFunctionPointerType() &&
@@ -6111,6 +6112,9 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType,
   const CGFunctionInfo &FnInfo = CGM.getTypes().arrangeFreeFunctionCall(
       Args, FnType, /*ChainCall=*/Chain);
 
+  if (ResolvedFnInfo)
+    *ResolvedFnInfo = &FnInfo;
+
   // C99 6.5.2.2p6:
   //   If the expression that denotes the called function has a type
   //   that does not include a prototype, [the default argument
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index 9f868c98458996..1755dab3e3e4e8 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -4396,7 +4396,8 @@ class CodeGenFunction : public CodeGenTypeCache {
   }
   RValue EmitCall(QualType FnType, const CGCallee &Callee, const CallExpr *E,
                   ReturnValueSlot ReturnValue, llvm::Value *Chain = nullptr,
-                  llvm::CallBase **CallOrInvoke = nullptr);
+                  llvm::CallBase **CallOrInvoke = nullptr,
+                  CGFunctionInfo const **ResolvedFnInfo = nullptr);
 
   // If a Call or Invoke instruction was emitted for this CallExpr, this method
   // writes the pointer to `CallOrInvoke` if it's not null.
diff --git a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
new file mode 100644
index 00000000000000..56c6b3ec00bc7e
--- /dev/null
+++ b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
@@ -0,0 +1,36 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple=x86_64-w64-mingw32 -fmath-errno -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK
+
+long double powl(long double a, long double b);
+
+// Negative test: powl is a floating-point math function that is
+// ConstWithoutErrnoAndExceptions, however, for this target long doubles are
+// passed indirectly via a pointer. Annotating the call with "int" TBAA metadata
+// will cause the setup for the BYVAL arguments to be incorrectly optimized out.
+
+// CHECK-LABEL: define dso_local void @test_powl(
+// CHECK-SAME: ptr dead_on_unwind noalias nocapture writable writeonly sret(x86_fp80) align 16 [[AGG_RESULT:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[TMP:%.*]] = alloca x86_fp80, align 16
+// CHECK-NEXT:    [[BYVAL_TEMP:%.*]] = alloca x86_fp80, align 16
+// CHECK-NEXT:    [[BYVAL_TEMP1:%.*]] = alloca x86_fp80, align 16
+// CHECK-NEXT:    call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3:[0-9]+]]
+// CHECK-NEXT:    store x86_fp80 0xK40008000000000000000, ptr [[BYVAL_TEMP]], align 16, !tbaa [[TBAA3:![0-9]+]]
+// CHECK-NEXT:    call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
+// CHECK-NEXT:    store x86_fp80 0xK40008000000000000000, ptr [[BYVAL_TEMP1]], align 16, !tbaa [[TBAA3]]
+// CHECK-NEXT:    call void @powl(ptr dead_on_unwind nonnull writable sret(x86_fp80) align 16 [[TMP]], ptr noundef nonnull [[BYVAL_TEMP]], ptr noundef nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
+// CHECK-NEXT:    [[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA3]]
+// CHECK-NEXT:    call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3]]
+// CHECK-NEXT:    call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
+// CHECK-NEXT:    store x86_fp80 [[TMP0]], ptr [[AGG_RESULT]], align 16, !tbaa [[TBAA3]]
+// CHECK-NEXT:    ret void
+//
+long double test_powl() {
+   return powl(2.0L, 2.0L); // Don't emit TBAA metadata
+}
+//.
+// CHECK: [[TBAA3]] = !{[[META4:![0-9]+]], [[META4]], i64 0}
+// CHECK: [[META4]] = !{!"long double", [[META5:![0-9]+]], i64 0}
+// CHECK: [[META5]] = !{!"omnipotent char", [[META6:![0-9]+]], i64 0}
+// CHECK: [[META6]] = !{!"Simple C/C++ TBAA"}
+//.

@MacDue MacDue requested review from mstorsjo and vfdff September 16, 2024 16:35
@MacDue
Copy link
Member Author

MacDue commented Sep 16, 2024

Note the first commit (6db9f6d) shows a correctness issue with what's currently upstream as the load x86_fp80 is incorrectly marked with "int" TBAA metadata (overwriting its metadata for long double).

@mstorsjo
Copy link
Member

Thanks for the fix! Unfortunately I'm way out of my depth when it comes to actually reviewing this change, so hopefully you can find someone else to check/review it - but I appreciate you finding the root cause so quickly!

@efriedma-quic
Copy link
Collaborator

How does this interact with #107598?

@MacDue
Copy link
Member Author

MacDue commented Sep 16, 2024

How does this interact with #107598?

I think it's solving the same problem, but a different way (looking at the final LLVM function type rather than checking the ABI information). Though this also changes things to ensure when TBAA data is set, it's always set on the call.

@zahiraam
Copy link
Contributor

How does this interact with #107598?

Though this also changes things to ensure when TBAA data is set, it's always set on the call.

Wasn't already doing that? (setting the TBAA on the call?).

@MacDue
Copy link
Member Author

MacDue commented Sep 17, 2024

How does this interact with #107598?

Though this also changes things to ensure when TBAA data is set, it's always set on the call.

Wasn't already doing that? (setting the TBAA on the call?).

It was setting it on cast<llvm::Instruction>(Call.getScalarVal()) not necessarily the call (which you can get via an output on EmitCall()). At least in this case that meant it was putting the TBAA metadata on the load x86_fp80 after the call. I'm not sure if there's other cases where something similar could happen.

@zahiraam
Copy link
Contributor

How does this interact with #107598?

Though this also changes things to ensure when TBAA data is set, it's always set on the call.

Wasn't already doing that? (setting the TBAA on the call?).

It was setting it on cast<llvm::Instruction>(Call.getScalarVal()) not necessarily the call (which you can get via an output on EmitCall()). At least in this case that meant it was putting the TBAA metadata on the load x86_fp80 after the call. I'm not sure if there's other cases where something similar could happen.

How does this interact with #107598?

Though this also changes things to ensure when TBAA data is set, it's always set on the call.

Wasn't already doing that? (setting the TBAA on the call?).

It was setting it on cast<llvm::Instruction>(Call.getScalarVal()) not necessarily the call (which you can get via an output on EmitCall()). At least in this case that meant it was putting the TBAA metadata on the load x86_fp80 after the call. I'm not sure if there's other cases where something similar could happen.

Without this patch and without (#107598) the function pow doesn't generate int TBAA info on the call, but it does on a call to cargl with -triple aarch64-unknown-unknown.

# | %call = tail call fp128 @cargl([2 x fp128] noundef alignstack(16) undef) #1, !tbaa !2

@MacDue
Copy link
Member Author

MacDue commented Sep 17, 2024

How does this interact with #107598?

Though this also changes things to ensure when TBAA data is set, it's always set on the call.

Wasn't already doing that? (setting the TBAA on the call?).

It was setting it on cast<llvm::Instruction>(Call.getScalarVal()) not necessarily the call (which you can get via an output on EmitCall()). At least in this case that meant it was putting the TBAA metadata on the load x86_fp80 after the call. I'm not sure if there's other cases where something similar could happen.

How does this interact with #107598?

Though this also changes things to ensure when TBAA data is set, it's always set on the call.

Wasn't already doing that? (setting the TBAA on the call?).

It was setting it on cast<llvm::Instruction>(Call.getScalarVal()) not necessarily the call (which you can get via an output on EmitCall()). At least in this case that meant it was putting the TBAA metadata on the load x86_fp80 after the call. I'm not sure if there's other cases where something similar could happen.

Without this patch and without (#107598) the function pow doesn't generate int TBAA info on the call, but it does on a call to cargl with -triple aarch64-unknown-unknown.

# | %call = tail call fp128 @cargl([2 x fp128] noundef alignstack(16) undef) #1, !tbaa !2

I'm not sure I follow, but the issue I spotted was "int" TBAA metadata was being set on the load following the pow call, but it has the same root cause as the cargl issue.

See the diff from the first commit:

diff --git a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
index dd013dcc8b3ca8..56c6b3ec00bc7e 100644
--- a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
+++ b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
@@ -19,7 +19,7 @@ long double powl(long double a, long double b);
 // CHECK-NEXT:    call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
 // CHECK-NEXT:    store x86_fp80 0xK40008000000000000000, ptr [[BYVAL_TEMP1]], align 16, !tbaa [[TBAA3]]
 // CHECK-NEXT:    call void @powl(ptr dead_on_unwind nonnull writable sret(x86_fp80) align 16 [[TMP]], ptr noundef nonnull [[BYVAL_TEMP]], ptr noundef nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
-// CHECK-NEXT:    [[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA7:![0-9]+]]
+// CHECK-NEXT:    [[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA3]]
 // CHECK-NEXT:    call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3]]
 // CHECK-NEXT:    call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
 // CHECK-NEXT:    store x86_fp80 [[TMP0]], ptr [[AGG_RESULT]], align 16, !tbaa [[TBAA3]]
@@ -33,6 +33,4 @@ long double test_powl() {
 // CHECK: [[META4]] = !{!"long double", [[META5:![0-9]+]], i64 0}
 // CHECK: [[META5]] = !{!"omnipotent char", [[META6:![0-9]+]], i64 0}
 // CHECK: [[META6]] = !{!"Simple C/C++ TBAA"}
-// CHECK: [[TBAA7]] = !{[[META8:![0-9]+]], [[META8]], i64 0}
-// CHECK: [[META8]] = !{!"int", [[META5]], i64 0}
 //.

but it does on a call to cargl with -triple aarch64-unknown-unknown.
> # | %call = tail call fp128 @cargl([2 x fp128] noundef alignstack(16) undef) #1, !tbaa !2`

That looks okay though? It's not passing or returning values via pointers, so it should be safe to set the "int" TBAA (which indicates the only pointer it could read is errno).

@zahiraam
Copy link
Contributor

How does this interact with #107598?

Though this also changes things to ensure when TBAA data is set, it's always set on the call.

Wasn't already doing that? (setting the TBAA on the call?).

It was setting it on cast<llvm::Instruction>(Call.getScalarVal()) not necessarily the call (which you can get via an output on EmitCall()). At least in this case that meant it was putting the TBAA metadata on the load x86_fp80 after the call. I'm not sure if there's other cases where something similar could happen.

How does this interact with #107598?

Though this also changes things to ensure when TBAA data is set, it's always set on the call.

Wasn't already doing that? (setting the TBAA on the call?).

It was setting it on cast<llvm::Instruction>(Call.getScalarVal()) not necessarily the call (which you can get via an output on EmitCall()). At least in this case that meant it was putting the TBAA metadata on the load x86_fp80 after the call. I'm not sure if there's other cases where something similar could happen.

Without this patch and without (#107598) the function pow doesn't generate int TBAA info on the call, but it does on a call to cargl with -triple aarch64-unknown-unknown.
# | %call = tail call fp128 @cargl([2 x fp128] noundef alignstack(16) undef) #1, !tbaa !2

I'm not sure I follow, but the issue I spotted was "int" TBAA metadata was being set on the load following the pow call, but it has the same root cause as the cargl issue.

See the diff from the first commit:

diff --git a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
index dd013dcc8b3ca8..56c6b3ec00bc7e 100644
--- a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
+++ b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
@@ -19,7 +19,7 @@ long double powl(long double a, long double b);
 // CHECK-NEXT:    call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
 // CHECK-NEXT:    store x86_fp80 0xK40008000000000000000, ptr [[BYVAL_TEMP1]], align 16, !tbaa [[TBAA3]]
 // CHECK-NEXT:    call void @powl(ptr dead_on_unwind nonnull writable sret(x86_fp80) align 16 [[TMP]], ptr noundef nonnull [[BYVAL_TEMP]], ptr noundef nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
-// CHECK-NEXT:    [[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA7:![0-9]+]]
+// CHECK-NEXT:    [[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA3]]
 // CHECK-NEXT:    call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3]]
 // CHECK-NEXT:    call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
 // CHECK-NEXT:    store x86_fp80 [[TMP0]], ptr [[AGG_RESULT]], align 16, !tbaa [[TBAA3]]
@@ -33,6 +33,4 @@ long double test_powl() {
 // CHECK: [[META4]] = !{!"long double", [[META5:![0-9]+]], i64 0}
 // CHECK: [[META5]] = !{!"omnipotent char", [[META6:![0-9]+]], i64 0}
 // CHECK: [[META6]] = !{!"Simple C/C++ TBAA"}
-// CHECK: [[TBAA7]] = !{[[META8:![0-9]+]], [[META8]], i64 0}
-// CHECK: [[META8]] = !{!"int", [[META5]], i64 0}
 //.

but it does on a call to cargl with -triple aarch64-unknown-unknown.
> # | %call = tail call fp128 @cargl([2 x fp128] noundef alignstack(16) undef) #1, !tbaa !2`

That looks okay though? It's not passing or returning values via pointers, so it should be safe to set the "int" TBAA (which indicates the only pointer it could read is errno).

Not really! int TBAA in in our downstream compiler is interpreted as describing the function arguments (they are not int) and the load/store of the argument before the library call are begin eliminated which result in unexpected behavior.
I am not objecting to anything; I am just wondering what difference it makes to have the TBAA attached to the call instead of how it is now.

@MacDue
Copy link
Member Author

MacDue commented Sep 17, 2024

Not really! int TBAA in in our downstream compiler is interpreted as describing the function arguments (they are not int) and the load/store of the argument before the library call are begin eliminated which result in unexpected behavior.
I am not objecting to anything; I am just wondering what difference it makes to have the TBAA attached to the call instead of how it is now.

I believe the intention of both #96025 and #100302 was to set the metadata on the call. The case I mentioned where it was attached to the load after the call just seemed like an unintended consequence of attaching the metadata to the result value.

It sounds like both those changes break your downstream compiler?

My understanding is "int" TBAA metadata is set on floating-point math calls (that don't take pointers) to indicate the only memory they could read/write to is errno. I think this was to fix some unnecessary float value reloads around calls.

@MacDue
Copy link
Member Author

MacDue commented Sep 17, 2024

Not really! int TBAA in in our downstream compiler is interpreted as describing the function arguments (they are not int) and the load/store of the argument before the library call are begin eliminated which result in unexpected behavior.

Oh wait, then you say "not really" you're only referring to the case where the arguments are passed indirectly (i.e via pointers) not the case where they're passed via floating-point registers.

This patch is updating the check to look at the ABI information and prevents setting the "int" TBAA metadata on calls that pass values via pointers.

@zahiraam
Copy link
Contributor

Not really! int TBAA in in our downstream compiler is interpreted as describing the function arguments (they are not int) and the load/store of the argument before the library call are begin eliminated which result in unexpected behavior.

Oh wait, then you say "not really" you're only referring to the case where the arguments are passed indirectly (i.e via pointers) not the case where they're passed via floating-point registers.

Yes exactly! Have you tried running the test case in patch #107598 to see what IR it generates?

@MacDue
Copy link
Member Author

MacDue commented Sep 17, 2024

Yes exactly! Have you tried running the test case in patch #107598 to see what IR it generates?

I've updated my test (based on yours), and I think you can now see the TBAA "int" metadata is only set on the call when the arguments are not being passed via pointers.

@MacDue MacDue force-pushed the fix_tbaa_2 branch 2 times, most recently from c52d6c9 to f5d293c Compare September 18, 2024 12:05
@zahiraam
Copy link
Contributor

OK. This LGTM. I guess we can adopt this PR instead of #107598 unless other reviewers have more to say.

Copy link
Contributor

@vfdff vfdff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for help fix the issue

@MacDue
Copy link
Member Author

MacDue commented Sep 24, 2024

I'll land this later today if there's no further comments 🙂

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but like I mentioned on #107598, it would be good if there was a test that requires the argument check, and the return check isn't sufficient

On some targets, an FP libcall with argument types such as long double
will be lowered to pass arguments indirectly via pointers. When this is
the case we should not mark the libcall with "int" TBAA as it may lead
to incorrect optimizations.

Currently, this can be seen for long doubles on x86_64-w64-mingw32. The
`load x86_fp80` after the call is (incorrectly) marked with "int" TBAA
(overwriting the previous metadata for "long double").

Nothing seems to break due to this currently as the metadata is being
incorrectly placed on the load and not the call. But if the metadata
is moved to the call (which this patch ensures), LLVM will optimize out
the setup for the arguments.
@MacDue
Copy link
Member Author

MacDue commented Sep 24, 2024

LGTM, but like I mentioned on #107598, it would be good if there was a test that requires the argument check, and the return check isn't sufficient

I've added a test case for int ilogbl(long double a); (which tests this in the MINGW32 case 👍

@MacDue MacDue merged commit 53907ed into llvm:main Sep 25, 2024
@MacDue MacDue deleted the fix_tbaa_2 branch September 25, 2024 08:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants