Skip to content

[clang] Compute accurate begin location for CallExpr with explicit object parameter #117841

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 6, 2024

Conversation

HighCommander4
Copy link
Collaborator

The explicit object parameter is written before the callee expression, so the begin location should come from the explicit object parameter.

Fixes #116335

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Nov 27, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 27, 2024

@llvm/pr-subscribers-clang-tools-extra

@llvm/pr-subscribers-clang

Author: Nathan Ridge (HighCommander4)

Changes

The explicit object parameter is written before the callee expression, so the begin location should come from the explicit object parameter.

Fixes #116335


Full diff: https://github.com/llvm/llvm-project/pull/117841.diff

3 Files Affected:

  • (modified) clang/lib/AST/Expr.cpp (+7)
  • (modified) clang/lib/Sema/SemaOverload.cpp (+1-1)
  • (modified) clang/test/AST/ast-dump-cxx2b-deducing-this.cpp (+2-2)
diff --git a/clang/lib/AST/Expr.cpp b/clang/lib/AST/Expr.cpp
index a4fb4d5a1f2ec4..8b10a2609c2ab0 100644
--- a/clang/lib/AST/Expr.cpp
+++ b/clang/lib/AST/Expr.cpp
@@ -1639,6 +1639,13 @@ SourceLocation CallExpr::getBeginLoc() const {
   if (const auto *OCE = dyn_cast<CXXOperatorCallExpr>(this))
     return OCE->getBeginLoc();
 
+  if (const CXXMethodDecl *Method =
+          dyn_cast_or_null<const CXXMethodDecl>(getCalleeDecl());
+      Method && Method->isExplicitObjectMemberFunction()) {
+    assert(getNumArgs() > 0 && getArg(0));
+    return getArg(0)->getBeginLoc();
+  }
+
   SourceLocation begin = getCallee()->getBeginLoc();
   if (begin.isInvalid() && getNumArgs() > 0 && getArg(0))
     begin = getArg(0)->getBeginLoc();
diff --git a/clang/lib/Sema/SemaOverload.cpp b/clang/lib/Sema/SemaOverload.cpp
index 4c9e37bd286dee..e4bf9aa521224b 100644
--- a/clang/lib/Sema/SemaOverload.cpp
+++ b/clang/lib/Sema/SemaOverload.cpp
@@ -15565,7 +15565,7 @@ ExprResult Sema::BuildCallToMemberFunction(Scope *S, Expr *MemExprE,
     // Build the actual expression node.
     ExprResult FnExpr =
         CreateFunctionRefExpr(*this, Method, FoundDecl, MemExpr,
-                              HadMultipleCandidates, MemExpr->getBeginLoc());
+                              HadMultipleCandidates, MemExpr->getExprLoc());
     if (FnExpr.isInvalid())
       return ExprError();
 
diff --git a/clang/test/AST/ast-dump-cxx2b-deducing-this.cpp b/clang/test/AST/ast-dump-cxx2b-deducing-this.cpp
index 04cff07376885a..1b385e0fc33319 100644
--- a/clang/test/AST/ast-dump-cxx2b-deducing-this.cpp
+++ b/clang/test/AST/ast-dump-cxx2b-deducing-this.cpp
@@ -9,7 +9,7 @@ int main() {
   S s;
   int x = s.f();
   // CHECK: CallExpr 0x{{[^ ]*}} <col:11, col:15> 'int
-  // CHECK-NEXT: |-ImplicitCastExpr 0x{{[^ ]*}} <col:11> 'int (*)(S &)' <FunctionToPointerDecay>
-  // CHECK-NEXT: | `-DeclRefExpr 0x{{[^ ]*}} <col:11> 'int (S &)' lvalue CXXMethod 0x{{[^ ]*}} 'f' 'int (S &)'
+  // CHECK-NEXT: |-ImplicitCastExpr 0x{{[^ ]*}} <col:13> 'int (*)(S &)' <FunctionToPointerDecay>
+  // CHECK-NEXT: | `-DeclRefExpr 0x{{[^ ]*}} <col:13> 'int (S &)' lvalue CXXMethod 0x{{[^ ]*}} 'f' 'int (S &)'
 }
 }

@@ -1639,6 +1639,13 @@ SourceLocation CallExpr::getBeginLoc() const {
if (const auto *OCE = dyn_cast<CXXOperatorCallExpr>(this))
return OCE->getBeginLoc();

if (const CXXMethodDecl *Method =
dyn_cast_or_null<const CXXMethodDecl>(getCalleeDecl());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
dyn_cast_or_null<const CXXMethodDecl>(getCalleeDecl());
dyn_cast_if_present<const CXXMethodDecl>(getCalleeDecl());

Copy link
Contributor

@zyn0217 zyn0217 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this also needs a selection tree test to demonstrate the original case works with the current approach.

Besides, If we can't highlight the loci differences (compared to the behavior before the last PR) using simple -ast-dump flags, then perhaps we should reconsider the necessity of clang/test/AST/ast-dump-cxx2b-deducing-this.cpp - removing it might be a good option.

That said, I was still thinking if there's any possibility for us to find a better AST model for such use cases, rather than relying on special-casing it in CallExpr::getBeginLoc

@zyn0217 zyn0217 requested a review from cor3ntin November 27, 2024 07:23
@HighCommander4
Copy link
Collaborator Author

That said, I was still thinking if there's any possibility for us to find a better AST model for such use cases, rather than relying on special-casing it in CallExpr::getBeginLoc

A more general implementation of CallExpr::getBeginLoc that occurs to me is: return the earlier of the callee's begin loc and the first argument's (if present) begin loc.

@@ -1639,6 +1639,13 @@ SourceLocation CallExpr::getBeginLoc() const {
if (const auto *OCE = dyn_cast<CXXOperatorCallExpr>(this))
return OCE->getBeginLoc();

if (const CXXMethodDecl *Method =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (const CXXMethodDecl *Method =
if (const auto *Method =

@HighCommander4
Copy link
Collaborator Author

Updated to add a clangd test that fails with the previous fix approach.

A more general implementation of CallExpr::getBeginLoc that occurs to me is: return the earlier of the callee's begin loc and the first argument's (if present) begin loc.

I did not do this because accurately determining which of two SourceLocations is earlier requires calling SourceManager::isBeforeInTranslationUnit(), and a SourceManager does not seem to be available in CallExpr::getBeginLoc().

Copy link

github-actions bot commented Nov 29, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

…ject parameter

The explicit object parameter is written before the callee expression,
so the begin location should come from the explicit object parameter.
@HighCommander4 HighCommander4 force-pushed the llvm-issue-116335-review branch from 367a0cc to 3e5e55f Compare November 29, 2024 05:09
@HighCommander4
Copy link
Collaborator Author

(Removed extraneous formatting changes)

Copy link
Contributor

@zyn0217 zyn0217 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, while I haven’t come up with a better solution to model this situation, I think this looks good for now.

Please give @cor3ntin some time before merging, in case there are some better approaches that I might have overlooked.

@@ -1639,11 +1639,19 @@ SourceLocation CallExpr::getBeginLoc() const {
if (const auto *OCE = dyn_cast<CXXOperatorCallExpr>(this))
return OCE->getBeginLoc();

if (const auto *Method =
dyn_cast_if_present<const CXXMethodDecl>(getCalleeDecl());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I presume the const in <const CXXMethodDecl> isn't necessary, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you could do something like

unsigned Offset = 0;
if(isExplicitObjectMemberFunction())
    Offset = 1;

To avoid some duplication

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not quite following this suggestion. What am I supposed to do with Offset?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh I think there is some opportunities to merge the slightly different if statements (for the explicit and not explicit cases) but it would be a minor improvement so I'll will approve that

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh I think there is some opportunities to merge the slightly different if statements (for the explicit and not explicit cases) but it would be a minor improvement so I'll will approve that

Ah, I see.

I considered doing something like this:

bool UseFirstArgLoc = false;
if (/* is explicit object */)
  UseFirstArgLoc = true;

SourceLocation begin = getCallee()->getBeginLoc();
if ((begin.isInvalid() || UseFirstArgLoc) && getNumArgs() > 0 && getArg(0))
  begin = getArg(0)->getBeginLoc();
return begin;

But this needlessly evaluates getCallee()->getBeginLoc() in the explicit case, so I opted not to do this.

@HighCommander4 HighCommander4 merged commit 9ccde12 into llvm:main Dec 6, 2024
8 checks passed
@nikic
Copy link
Contributor

nikic commented Dec 6, 2024

@HighCommander4
Copy link
Collaborator Author

It looks like this causes a significant compile-time regression: https://llvm-compile-time-tracker.com/compare.php?from=2b855dd97092e2178ac5c470a804a17ec440d7e5&to=9ccde12f5eeb91152900082a2ae839e2a9702b31&stat=instructions:u (Maybe most clearly seen during clang bootstrap, where this adds 0.5% to many compilations: https://llvm-compile-time-tracker.com/compare_clang.php?from=2b855dd97092e2178ac5c470a804a17ec440d7e5&to=9ccde12f5eeb91152900082a2ae839e2a9702b31&stat=instructions%3Au&sortBy=interestingness)

Is that expected?

I think it's at least conceivable, as the patch adds an extra branch to CallExpr::getBeginLoc(), which is likely to be called pretty frequently.

The branch is not taken most of the time (it's only taken for calls to functions with an explicit object argument, which is a C++23 feature), so perhaps annotating the branch as "unlikely" is sufficient to avoid the performance regression?

@HighCommander4
Copy link
Collaborator Author

The branch is not taken most of the time (it's only taken for calls to functions with an explicit object argument, which is a C++23 feature), so perhaps annotating the branch as "unlikely" is sufficient to avoid the performance regression?

Ah, no, that's not sufficient because it takes some work (a call to getCalleeDecl()) to compute the branch condition.

We could consider revising the implementation approach to optimize this better; for example, add a "uses explicit object argument" bit to CallExprBits?

@tbaederr
Copy link
Contributor

tbaederr commented Dec 8, 2024

I didn't check if any tests fail but here's a version where CallExpr saves its BeginLoc explicitly: http://llvm-compile-time-tracker.com/compare.php?from=416e4cd332c7421b187844ac9aaf6fe28b575a7d&to=0b6e36fe460409aa59958b79766b4f64a31c97e6&stat=instructions:u

@HighCommander4
Copy link
Collaborator Author

I didn't check if any tests fail but here's a version where CallExpr saves its BeginLoc explicitly: http://llvm-compile-time-tracker.com/compare.php?from=416e4cd332c7421b187844ac9aaf6fe28b575a7d&to=0b6e36fe460409aa59958b79766b4f64a31c97e6&stat=instructions:u

Yeah, this is another alternative, though I wonder if the memory tradeoff it makes (increasing the size of CallExpr by 4 bytes) is worth it.

@shafik
Copy link
Collaborator

shafik commented Mar 7, 2025

This PR was flagged as the cause of this crash: #130272

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category clang-tools-extra clangd
Projects
Status: Needs Triage
Development

Successfully merging this pull request may close these issues.

[clangd] Incorrect navigation caused by deducing this
9 participants