Skip to content

[clang] Fix pointer comparisons between pointers to constexpr-unknown #147663

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 15, 2025

Conversation

efriedma-quic
Copy link
Collaborator

A constexpr-unknown reference can be equal to an arbitrary value, except values allocated during constant evaluation. Fix the handling.

The standard is unclear exactly which pointer comparisons count as "unknown" in this context; for example, in some cases we could use alignment to prove two constexpr-unknown references are not equal. I decided to ignore all the cases involving variables not allocated during constant evaluation.

While looking at this, I also spotted that there might be issues with lifetimes, but I didn't try to address it.

A constexpr-unknown reference can be equal to an arbitrary value, except
values allocated during constant evaluation. Fix the handling.

The standard is unclear exactly which pointer comparisons count as
"unknown" in this context; for example, in some cases we could use
alignment to prove two constexpr-unknown references are not equal. I
decided to ignore all the cases involving variables not allocated during
constant evaluation.

While looking at this, I also spotted that there might be issues with
lifetimes, but I didn't try to address it.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jul 9, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 9, 2025

@llvm/pr-subscribers-clang

Author: Eli Friedman (efriedma-quic)

Changes

A constexpr-unknown reference can be equal to an arbitrary value, except values allocated during constant evaluation. Fix the handling.

The standard is unclear exactly which pointer comparisons count as "unknown" in this context; for example, in some cases we could use alignment to prove two constexpr-unknown references are not equal. I decided to ignore all the cases involving variables not allocated during constant evaluation.

While looking at this, I also spotted that there might be issues with lifetimes, but I didn't try to address it.


Full diff: https://github.com/llvm/llvm-project/pull/147663.diff

2 Files Affected:

  • (modified) clang/lib/AST/ExprConstant.cpp (+17-6)
  • (modified) clang/test/SemaCXX/constant-expression-p2280r4.cpp (+23-8)
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 60c658a8d8f99..6ade7e6ec8a6a 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -14477,12 +14477,6 @@ EvaluateComparisonBinaryOperator(EvalInfo &Info, const BinaryOperator *E,
     if (!EvaluatePointer(E->getRHS(), RHSValue, Info) || !LHSOK)
       return false;
 
-    // If we have Unknown pointers we should fail if they are not global values.
-    if (!(IsGlobalLValue(LHSValue.getLValueBase()) &&
-          IsGlobalLValue(RHSValue.getLValueBase())) &&
-        (LHSValue.AllowConstexprUnknown || RHSValue.AllowConstexprUnknown))
-      return false;
-
     // Reject differing bases from the normal codepath; we special-case
     // comparisons to null.
     if (!HasSameBase(LHSValue, RHSValue)) {
@@ -14544,6 +14538,23 @@ EvaluateComparisonBinaryOperator(EvalInfo &Info, const BinaryOperator *E,
           (LHSValue.Base && isZeroSized(RHSValue)))
         return DiagComparison(
             diag::note_constexpr_pointer_comparison_zero_sized);
+      // A constexpr-unknown reference can be equal to any other lvalue, except
+      // for variables allocated during constant evaluation. (The "lifetime
+      // [...] includes the entire constant evaluation", so it has to be
+      // distinct from anything allocated during constant evaluation.)
+      //
+      // Theoretically we could handle other cases, but the standard doesn't say
+      // what other cases we need to handle; it just says an "equality
+      // operator where the result is unspecified" isn't a constant expression.
+      auto AllocatedDuringEval = [](LValue &Value) {
+        return Value.Base.is<DynamicAllocLValue>() ||
+               Value.getLValueCallIndex();
+      };
+      if ((LHSValue.AllowConstexprUnknown && !AllocatedDuringEval(RHSValue)) ||
+          (RHSValue.AllowConstexprUnknown && !AllocatedDuringEval(LHSValue)))
+        return DiagComparison(
+            diag::note_constexpr_pointer_comparison_unspecified);
+      // FIXME: Verify both variables are live.
       return Success(CmpResult::Unequal, E);
     }
 
diff --git a/clang/test/SemaCXX/constant-expression-p2280r4.cpp b/clang/test/SemaCXX/constant-expression-p2280r4.cpp
index dffb386f530f4..640ac18aad738 100644
--- a/clang/test/SemaCXX/constant-expression-p2280r4.cpp
+++ b/clang/test/SemaCXX/constant-expression-p2280r4.cpp
@@ -319,7 +319,7 @@ namespace casting {
 }
 
 namespace pointer_comparisons {
-  extern int &extern_n; // interpreter-note 2 {{declared here}}
+  extern int &extern_n; // interpreter-note 4 {{declared here}}
   extern int &extern_n2;
   constexpr int f1(bool b, int& n) {
     if (b) {
@@ -330,14 +330,29 @@ namespace pointer_comparisons {
   // FIXME: interpreter incorrectly rejects; both sides are the same constexpr-unknown value.
   static_assert(f1(false, extern_n)); // interpreter-error {{static assertion expression is not an integral constant expression}} \
                                       // interpreter-note {{initializer of 'extern_n' is unknown}}
-  // FIXME: We should diagnose this: we don't know if the references bind
-  // to the same object.
-  static_assert(&extern_n != &extern_n2); // interpreter-error {{static assertion expression is not an integral constant expression}} \
+  static_assert(&extern_n != &extern_n2); // expected-error {{static assertion expression is not an integral constant expression}} \
+                                          // nointerpreter-note {{comparison between pointers to unrelated objects '&extern_n' and '&extern_n2' has unspecified value}} \
                                           // interpreter-note {{initializer of 'extern_n' is unknown}}
   void f2(const int &n) {
-    // FIXME: We should not diagnose this: the two objects provably have
-    // different addresses because the lifetime of "n" extends across
-    // the initialization.
-    constexpr int x = &x == &n; // nointerpreter-error {{must be initialized by a constant expression}}
+    // We can prove these two aren't equal, but for now we don't try.
+    constexpr int x = &x == &n; // nointerpreter-error {{must be initialized by a constant expression}} \
+                                // nointerpreter-note {{comparison between pointers to unrelated objects '&x' and '&n' has unspecified value}}
+    // Distinct variables are not equal, even if they're local variables.
+    constexpr int y = &x == &y;
+    static_assert(!y);
   }
+  constexpr int f3() {
+    int x;
+    return &x == &extern_n; // interpreter-note {{initializer of 'extern_n' is unknown}}
+  }
+  static_assert(!f3()); // interpreter-error {{static assertion expression is not an integral constant expression}} \
+                        // interpreter-note {{in call to 'f3()'}}
+  constexpr int f4() {
+    int *p = new int;
+    bool b = p == &extern_n; // interpreter-note {{initializer of 'extern_n' is unknown}}
+    delete p;
+    return b;
+  }
+  static_assert(!f4()); // interpreter-error {{static assertion expression is not an integral constant expression}} \
+                        // interpreter-note {{in call to 'f4()'}}
 }

//
// Theoretically we could handle other cases, but the standard doesn't say
// what other cases we need to handle; it just says an "equality
// operator where the result is unspecified" isn't a constant expression.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we know two pointers point into same object, pointer equality is trivial. When we know two pointers point into different objects, we can follow the rules you outline.

The problem is if we don't know whether the two pointers point into different objects. Then weird things start happening, and you have to start digging into other sections of the standard. Like, consider:

void f() {
  void* x = malloc(4);
  int* ip = new(x)ip;
  int & ir = *ip;
  float *fp = new(x)fp;
  float &fr = *fp;
  static_assert(&ir != &fr)
}

If you take the rules literally, the static_assert is valid, and succeeds. "the reference is treated as binding to an unspecified object of the referenced type whose lifetime and that of all subobjects includes the entire constant evaluation"... and a given object can only have one type, so the two objects must be distinct, so the pointers can't be equal.

It would help if the standard said that an equality comparison is not a core constant expression if it contains "an equality comparison where one of the operands points into a constexpr-unknown object".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there are "constexpr-unknown objects" in the standard -- in this case we instead get an "unspecified object":

For such a reference that is not usable in constant expressions, the reference is treated as binding to an unspecified object of the referenced type whose lifetime and that of all subobjects includes the entire constant evaluation and whose dynamic type is constexpr-unknown.

And I think, because the object is unspecified, that means that it's implicitly unspecified whether it's the same object as another object -- or more specifically, whether pointers to the two objects represent the same address, and hence implicitly unspecified whether the pointers compare equal. (In this case, they could both be member subobjects of the same union, for example.) I agree it'd be nice if that were more explicit, but I think the choice of wording here ("unspecified object") was intended to trigger the "comparison whose result is unspecified" rule.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The text says the lifetime of the object includes the entire constant evaluation. And if you have a live object of type int, and a live object of type float, they can't overlap. (If you have an int and a float in a union, they can't both be live at the same time.)

Maybe the intent is that the unspecified object doesn't actually have to be a live object, just refer to allocated storage? I guess that would eliminate the weirdest edge cases/inconsistencies. And I guess that would mean a constexpr-unknown object can overlap with an object allocated during constant evaluation.

We would have to check alignment, though; if we can prove the low bits of two pointers are not equal, the pointers themselves can't be equal:

struct alignas(int) F { char x[4]; };
extern F &e1, &e2;
static_assert(&e1.x[1] != &e2.x[2]);

And... I can't think of anything else we need to check off the top of my head, but there might be other edge cases I'm not thinking of. It would be nicer if the standard just allowed us to reject everything involving equality of constexpr-unknown.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CWG had a discussion of this relatively recently, starting here: http://lists.isocpp.org/core/2025/05/18115.php (apologies to those without access). The current informal consensus there is to treat any comparison involving a pointer to an object with constexpr-unknown dynamic type as non-constant. There was a suggestion towards the end of the discussion to allow more cases for situations where we can prove the pointers are definitely unequal, and if we have concrete suggestions of cases we think should be valid I could certainly convey them.

I would note that it's not correct to say that the address of an object whose lifetime started in the current evaluation must compare unequal to the address of an unspecified object:

extern struct X &ref;
struct X {
  constexpr X() : b(this == &ref) {}
  bool b;
};
X x;
constexpr X &ref = x;

(We currently seem to miscompile this example in C++23 mode onwards.) The best outcome here is presumably for the initialization of x to be non-constant, so that at runtime we can initialize x.b to true.

I think we can say that an object whose lifetime is known to end in the current evaluation (eg, a local variable of a constexpr function or a non-lifetime-extended temporary) has a different address from an unknown object. But it's probably not OK from a forward-looking perspective to extend that to compile-time heap allocations, since people are working on allowing such heap allocations to remain alive into the program execution in various ways.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another interesting case is for comparisons involving pointers that we symbolically know to be equal (or more broadly, know to be based on the same unknown object) despite being unknown. The current CWG direction seems to suggest that we would reject even those comparisons, but that seems too strict to me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I removed the checks for objects from the current evaluation, and my comment about the standard.

I'd like to get this in because it's fixing a clear miscompile of pointer comparisons; we can worry about the finer details in a followup once the CWG decides what to do.

Copy link
Collaborator

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, though please wait a day or two in case @shafik or others have comments.

@efriedma-quic efriedma-quic merged commit 20c8e3c into llvm:main Jul 15, 2025
10 checks passed
@efriedma-quic efriedma-quic added this to the LLVM 21.x Release milestone Jul 15, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status Jul 15, 2025
@efriedma-quic
Copy link
Collaborator Author

/cherry-pick 20c8e3c

1 similar comment
@efriedma-quic
Copy link
Collaborator Author

/cherry-pick 20c8e3c

@llvmbot
Copy link
Member

llvmbot commented Jul 15, 2025

/pull-request #148907

@llvmbot llvmbot moved this from Needs Triage to Done in LLVM Release Status Jul 15, 2025
tru pushed a commit to llvmbot/llvm-project that referenced this pull request Jul 17, 2025
…llvm#147663)

A constexpr-unknown reference can be equal to an arbitrary value, except
values allocated during constant evaluation. Fix the handling.

The standard is unclear exactly which pointer comparisons count as
"unknown" in this context; for example, in some cases we could use
alignment to prove two constexpr-unknown references are not equal. I
decided to ignore all the cases involving variables not allocated during
constant evaluation.

While looking at this, I also spotted that there might be issues with
lifetimes, but I didn't try to address it.

(cherry picked from commit 20c8e3c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
Development

Successfully merging this pull request may close these issues.

4 participants