Skip to content

Conversation

marco-antognini-sonarsource
Copy link
Contributor

Reverts #115917 and its follow up #116840.
Fixes #153782 and introduces regression tests.
Reopens #114270.

…#116840)"

This reverts commit 9b2ec87.

Resolved conflicts in:
	clang/lib/StaticAnalyzer/Core/RegionStore.cpp
	clang/test/Analysis/ctor-trivial-copy.cpp
	clang/test/Analysis/explain-svals.cpp
…s (2/4) (llvm#115917)"

This reverts commit 4610e5c.

Resolved conflicts in:
	clang/lib/StaticAnalyzer/Core/RegionStore.cpp
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Oct 14, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 14, 2025

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Marco Borgeaud (marco-antognini-sonarsource)

Changes

Reverts #115917 and its follow up #116840.
Fixes #153782 and introduces regression tests.
Reopens #114270.


Full diff: https://github.com/llvm/llvm-project/pull/163461.diff

10 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Core/RegionStore.cpp (-48)
  • (modified) clang/test/Analysis/NewDelete-checker-test.cpp (+76-4)
  • (modified) clang/test/Analysis/ctor-trivial-copy.cpp (+47-15)
  • (modified) clang/test/Analysis/explain-svals.cpp (+1-1)
  • (modified) clang/test/Analysis/iterator-modeling.cpp (-2)
  • (modified) clang/test/Analysis/stl-algorithm-modeling-aggressive-std-find-modeling.cpp (-10)
  • (modified) clang/test/Analysis/stl-algorithm-modeling.cpp (-10)
  • (modified) clang/test/Analysis/store-dump-orders.cpp (+1-1)
  • (modified) clang/test/Analysis/taint-generic.cpp (+5-1)
  • (modified) clang/test/Analysis/template-param-objects.cpp (+1-1)
diff --git a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
index af0ef52334bd7..5221a3248e46e 100644
--- a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
+++ b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
@@ -714,11 +714,6 @@ class RegionStoreManager : public StoreManager {
     return getBinding(getRegionBindings(S), L, T);
   }
 
-  std::optional<SVal> getUniqueDefaultBinding(RegionBindingsConstRef B,
-                                              const TypedValueRegion *R) const;
-  std::optional<SVal>
-  getUniqueDefaultBinding(nonloc::LazyCompoundVal LCV) const;
-
   std::optional<SVal> getDefaultBinding(Store S, const MemRegion *R) override {
     RegionBindingsRef B = getRegionBindings(S);
     // Default bindings are always applied over a base region so look up the
@@ -2465,11 +2460,6 @@ SVal RegionStoreManager::getBindingForStruct(RegionBindingsConstRef B,
   // behavior doesn't depend on the struct layout.
   // This way even an empty struct can carry taint, no matter if creduce drops
   // the last field member or not.
-
-  // Try to avoid creating a LCV if it would anyways just refer to a single
-  // default binding.
-  if (std::optional<SVal> Val = getUniqueDefaultBinding(B, R))
-    return *Val;
   return createLazyBinding(B, R);
 }
 
@@ -2757,50 +2747,12 @@ RegionStoreManager::bindVector(LimitedRegionBindingsConstRef B,
   return NewB;
 }
 
-std::optional<SVal>
-RegionStoreManager::getUniqueDefaultBinding(RegionBindingsConstRef B,
-                                            const TypedValueRegion *R) const {
-  if (R != R->getBaseRegion())
-    return std::nullopt;
-
-  const auto *Cluster = B.lookup(R);
-  if (!Cluster || !llvm::hasSingleElement(*Cluster))
-    return std::nullopt;
-
-  const auto [Key, Value] = *Cluster->begin();
-  return Key.isDirect() ? std::optional<SVal>{} : Value;
-}
-
-std::optional<SVal>
-RegionStoreManager::getUniqueDefaultBinding(nonloc::LazyCompoundVal LCV) const {
-  auto B = getRegionBindings(LCV.getStore());
-  return getUniqueDefaultBinding(B, LCV.getRegion());
-}
-
 std::optional<LimitedRegionBindingsRef> RegionStoreManager::tryBindSmallStruct(
     LimitedRegionBindingsConstRef B, const TypedValueRegion *R,
     const RecordDecl *RD, nonloc::LazyCompoundVal LCV) {
   if (B.hasExhaustedBindingLimit())
     return B.withValuesEscaped(LCV);
 
-  // If we try to copy a Conjured value representing the value of the whole
-  // struct, don't try to element-wise copy each field.
-  // That would unnecessarily bind Derived symbols slicing off the subregion for
-  // the field from the whole Conjured symbol.
-  //
-  //   struct Window { int width; int height; };
-  //   Window getWindow(); <-- opaque fn.
-  //   Window w = getWindow(); <-- conjures a new Window.
-  //   Window w2 = w; <-- trivial copy "w", calling "tryBindSmallStruct"
-  //
-  // We should not end up with a new Store for "w2" like this:
-  //   Direct [ 0..31]: Derived{Conj{}, w.width}
-  //   Direct [32..63]: Derived{Conj{}, w.height}
-  // Instead, we should just bind that Conjured value instead.
-  if (std::optional<SVal> Val = getUniqueDefaultBinding(LCV)) {
-    return B.addBinding(BindingKey::Make(R, BindingKey::Default), Val.value());
-  }
-
   FieldVector Fields;
 
   if (const CXXRecordDecl *Class = dyn_cast<CXXRecordDecl>(RD))
diff --git a/clang/test/Analysis/NewDelete-checker-test.cpp b/clang/test/Analysis/NewDelete-checker-test.cpp
index c417b9c2ac97e..2738ecc56ba7a 100644
--- a/clang/test/Analysis/NewDelete-checker-test.cpp
+++ b/clang/test/Analysis/NewDelete-checker-test.cpp
@@ -3,13 +3,13 @@
 // RUN:   -analyzer-checker=core \
 // RUN:   -analyzer-checker=cplusplus.NewDelete
 //
-// RUN: %clang_analyze_cc1 -DLEAKS -std=c++11 -fblocks %s \
+// RUN: %clang_analyze_cc1 -std=c++11 -fblocks %s \
 // RUN:   -verify=expected,newdelete,leak \
 // RUN:   -analyzer-checker=core \
 // RUN:   -analyzer-checker=cplusplus.NewDelete \
 // RUN:   -analyzer-checker=cplusplus.NewDeleteLeaks
 //
-// RUN: %clang_analyze_cc1 -std=c++11 -fblocks -verify %s \
+// RUN: %clang_analyze_cc1 -std=c++11 -fblocks %s \
 // RUN:   -verify=expected,leak \
 // RUN:   -analyzer-checker=core \
 // RUN:   -analyzer-checker=cplusplus.NewDeleteLeaks
@@ -19,13 +19,13 @@
 // RUN:   -analyzer-checker=core \
 // RUN:   -analyzer-checker=cplusplus.NewDelete
 //
-// RUN: %clang_analyze_cc1 -DLEAKS -std=c++17 -fblocks %s \
+// RUN: %clang_analyze_cc1 -std=c++17 -fblocks %s \
 // RUN:   -verify=expected,newdelete,leak \
 // RUN:   -analyzer-checker=core \
 // RUN:   -analyzer-checker=cplusplus.NewDelete \
 // RUN:   -analyzer-checker=cplusplus.NewDeleteLeaks
 //
-// RUN: %clang_analyze_cc1 -std=c++17 -fblocks -verify %s \
+// RUN: %clang_analyze_cc1 -std=c++17 -fblocks %s \
 // RUN:   -verify=expected,leak,inspection \
 // RUN:   -analyzer-checker=core \
 // RUN:   -analyzer-checker=cplusplus.NewDeleteLeaks \
@@ -503,3 +503,75 @@ namespace optional_union {
     custom_union_t a;
   } // leak-warning{{Potential leak of memory pointed to by 'a.present.q'}}
 }
+
+namespace gh153782 {
+
+// Ensure we do not regress on the following use case.
+
+namespace mutually_exclusive_test_case_1 {
+struct StorageWrapper {
+  // Imagine those two call a reset() function (among other things)
+  ~StorageWrapper() { delete parts; }
+  StorageWrapper(StorageWrapper const&) = default;
+
+  // Mind that there is no assignment here -- this is the bug we would like to find.
+  void operator=(StorageWrapper&&) { delete parts; } // newdelete-warning{{Attempt to release already released memory}}
+
+  // Not provided, typically would do `parts = new long`.
+  StorageWrapper();
+
+  long* parts;
+};
+
+void test_non_trivial_struct_assignment() {
+  StorageWrapper* object = new StorageWrapper[]{StorageWrapper()};
+  object[0] = StorageWrapper(); // This assignment leads to the double-free.
+}
+} // mutually_exclusive_test_case_1
+
+namespace mutually_exclusive_test_case_2 {
+struct StorageWrapper {
+  // Imagine those two call a reset() function (among other things)
+  ~StorageWrapper() { delete parts; }
+  StorageWrapper(StorageWrapper const&) = default;
+
+  // Mind that there is no assignment here -- this is the bug we would like to find.
+  void operator=(StorageWrapper&&) { delete parts; }
+
+  // Not provided, typically would do `parts = new long`.
+  StorageWrapper();
+
+  long* parts;
+};
+
+void test_non_trivial_struct_assignment() {
+  StorageWrapper* object = new StorageWrapper[]{StorageWrapper()};
+  // object[0] = StorageWrapper(); // Remove the source of double free to make the potential leak appear.
+} // leak-warning{{Potential leak of memory pointed to by 'object'}}
+} // mutually_exclusive_test_case_2
+
+namespace mutually_exclusive_test_case_3 {
+struct StorageWrapper {
+  // Imagine those two call a reset() function (among other things)
+  ~StorageWrapper() { delete parts; }
+  StorageWrapper(StorageWrapper const&) = default;
+
+  // Mind that there is no assignment here -- this is the bug we would like to find.
+  void operator=(StorageWrapper&&) { delete parts; } // newdelete-warning{{Attempt to release already released memory}}
+
+  // Not provided, typically would do `parts = new long`.
+  StorageWrapper();
+
+  long* parts;
+};
+
+struct TestDoubleFreeWithInitializerList {
+  StorageWrapper* Object;
+  TestDoubleFreeWithInitializerList()
+  : Object(new StorageWrapper[]{StorageWrapper()}) {
+    Object[0] = StorageWrapper(); // This assignment leads to the double-free.
+  }
+};
+} // mutually_exclusive_test_case_3
+
+} // namespace gh153782
diff --git a/clang/test/Analysis/ctor-trivial-copy.cpp b/clang/test/Analysis/ctor-trivial-copy.cpp
index 940ff9ba3ed9c..44990fc631d6d 100644
--- a/clang/test/Analysis/ctor-trivial-copy.cpp
+++ b/clang/test/Analysis/ctor-trivial-copy.cpp
@@ -5,8 +5,6 @@
 void clang_analyzer_printState();
 template <typename T> void clang_analyzer_dump_lref(T& param);
 template <typename T> void clang_analyzer_dump_val(T param);
-template <typename T> void clang_analyzer_denote(T param, const char *name);
-template <typename T> void clang_analyzer_express(T param);
 template <typename T> T conjure();
 template <typename... Ts> void nop(const Ts &... args) {}
 
@@ -42,10 +40,16 @@ void test_assign_return() {
 namespace trivial_struct_copy {
 
 void _01_empty_structs() {
-  clang_analyzer_dump_val(conjure<empty>()); // expected-warning {{conj_$}}
+  clang_analyzer_dump_val(conjure<empty>()); // expected-warning {{lazyCompoundVal}}
   empty Empty = conjure<empty>();
   empty Empty2 = Empty;
   empty Empty3 = Empty2;
+  // All of these should refer to the exact same LCV, because all of
+  // these trivial copies refer to the original conjured value.
+  // There were Unknown before:
+  clang_analyzer_dump_val(Empty);  // expected-warning {{lazyCompoundVal}}
+  clang_analyzer_dump_val(Empty2); // expected-warning {{lazyCompoundVal}}
+  clang_analyzer_dump_val(Empty3); // expected-warning {{lazyCompoundVal}}
 
   // We only have binding for the original Empty object, because copying empty
   // objects is a no-op in the performTrivialCopy. This is fine, because empty
@@ -67,20 +71,18 @@ void _01_empty_structs() {
 }
 
 void _02_structs_with_members() {
-  clang_analyzer_dump_val(conjure<aggr>()); // expected-warning {{conj_$}}
+  clang_analyzer_dump_val(conjure<aggr>()); // expected-warning {{lazyCompoundVal}}
   aggr Aggr = conjure<aggr>();
   aggr Aggr2 = Aggr;
   aggr Aggr3 = Aggr2;
-  // All of these should refer to the exact same symbol, because all of
+  // All of these should refer to the exact same LCV, because all of
   // these trivial copies refer to the original conjured value.
-  clang_analyzer_denote(Aggr, "$Aggr");
-  clang_analyzer_express(Aggr);  // expected-warning {{$Aggr}}
-  clang_analyzer_express(Aggr2); // expected-warning {{$Aggr}}
-  clang_analyzer_express(Aggr3); // expected-warning {{$Aggr}}
-
-  // We should have the same Conjured symbol for "Aggr", "Aggr2" and "Aggr3".
-  // We used to have Derived symbols for the individual fields that were
-  // copied as part of copying the whole struct.
+  clang_analyzer_dump_val(Aggr);  // expected-warning {{lazyCompoundVal}}
+  clang_analyzer_dump_val(Aggr2); // expected-warning {{lazyCompoundVal}}
+  clang_analyzer_dump_val(Aggr3); // expected-warning {{lazyCompoundVal}}
+
+  // We have fields in the struct we copy, thus we also have the entries for the copies
+  // (and for all of their fields).
   clang_analyzer_printState();
   // CHECK:       "store": { "pointer": "0x{{[0-9a-f]+}}", "items": [
   // CHECK-NEXT:    { "cluster": "GlobalInternalSpaceRegion", "pointer": "0x{{[0-9a-f]+}}", "items": [
@@ -93,10 +95,12 @@ void _02_structs_with_members() {
   // CHECK-NEXT:      { "kind": "Default", "offset": 0, "value": "[[AGGR_CONJ:conj_\$[0-9]+{int, LC[0-9]+, S[0-9]+, #[0-9]+}]]" }
   // CHECK-NEXT:    ]},
   // CHECK-NEXT:    { "cluster": "Aggr2", "pointer": "0x{{[0-9a-f]+}}", "items": [
-  // CHECK-NEXT:      { "kind": "Default", "offset": 0, "value": "[[AGGR_CONJ]]" }
+  // CHECK-NEXT:      { "kind": "Direct", "offset": 0, "value": "derived_${{[0-9]+}}{[[AGGR_CONJ]],Aggr.x}" },
+  // CHECK-NEXT:      { "kind": "Direct", "offset": 32, "value": "derived_${{[0-9]+}}{[[AGGR_CONJ]],Aggr.y}" }
   // CHECK-NEXT:    ]},
   // CHECK-NEXT:    { "cluster": "Aggr3", "pointer": "0x{{[0-9a-f]+}}", "items": [
-  // CHECK-NEXT:      { "kind": "Default", "offset": 0, "value": "[[AGGR_CONJ]]" }
+  // CHECK-NEXT:      { "kind": "Direct", "offset": 0, "value": "derived_${{[0-9]+}}{[[AGGR_CONJ]],Aggr.x}" },
+  // CHECK-NEXT:      { "kind": "Direct", "offset": 32, "value": "derived_${{[0-9]+}}{[[AGGR_CONJ]],Aggr.y}" }
   // CHECK-NEXT:    ]}
   // CHECK-NEXT:  ]},
 
@@ -113,3 +117,31 @@ void entrypoint() {
 }
 
 } // namespace trivial_struct_copy
+
+namespace gh153782 {
+
+// Ensure we do not regress on the following use cases.
+// The assumption made on a field in `setPtr` should apply to the returned copy in `func`.
+struct Status { int error; };
+Status getError();
+
+Status setPtr(int **outptr, int* ptr) {
+  Status e = getError();
+  if (e.error != 0) return e; // When assuming the error field is non-zero,
+  *outptr = ptr;              // this is not executed
+  return e;
+}
+
+int func() {
+  int *ptr = nullptr;
+  int x = 42;
+  if (setPtr(&ptr, &x).error == 0) {
+    // The assumption made in get() SHOULD match the assumption about
+    // the returned value, hence the engine SHOULD NOT assume ptr is null.
+    clang_analyzer_dump_val(ptr); // expected-warning {{&x}}
+    return *ptr;
+  }
+  return 0;
+}
+
+} // namespace gh153782
diff --git a/clang/test/Analysis/explain-svals.cpp b/clang/test/Analysis/explain-svals.cpp
index dfc650223c9e7..9474aa7c7dbb1 100644
--- a/clang/test/Analysis/explain-svals.cpp
+++ b/clang/test/Analysis/explain-svals.cpp
@@ -99,7 +99,7 @@ class C {
 } // end of anonymous namespace
 
 void test_6() {
-  clang_analyzer_explain(conjure_S()); // expected-warning-re{{{{^symbol of type 'int' conjured at CFG element 'conjure_S\(\) \(CXXRecordTypedCall, \+0\)'$}}}}
+  clang_analyzer_explain(conjure_S()); // expected-warning-re{{{{^lazily frozen compound value of 1st parameter of function 'clang_analyzer_explain\(\)'$}}}}
   clang_analyzer_explain(conjure_S().z); // expected-warning-re{{{{^value derived from \(symbol of type 'int' conjured at CFG element 'conjure_S\(\) \(CXXRecordTypedCall, \)'\) for field 'z' of temporary object constructed at statement 'conjure_S\(\)'$}}}}
 }
 
diff --git a/clang/test/Analysis/iterator-modeling.cpp b/clang/test/Analysis/iterator-modeling.cpp
index 78882da4431fd..f1538839d06c8 100644
--- a/clang/test/Analysis/iterator-modeling.cpp
+++ b/clang/test/Analysis/iterator-modeling.cpp
@@ -2035,7 +2035,6 @@ void print_state(std::vector<int> &V) {
   // CHECK:      "checker_messages": [
   // CHECK:   { "checker": "alpha.cplusplus.IteratorModeling", "messages": [
   // CHECK-NEXT:     "Iterator Positions :",
-  // CHECK-NEXT:     "conj_$[[#]]{int, LC[[#]], S[[#]], #[[#]]} : Valid ; Container == SymRegion{reg_$[[#]]<std::vector<int> & V>} ; Offset == conj_$[[#]]{long, LC[[#]], S[[#]], #[[#]]}",
   // CHECK-NEXT:     "i0 : Valid ; Container == SymRegion{reg_$[[#]]<std::vector<int> & V>} ; Offset == conj_$[[#]]{long, LC[[#]], S[[#]], #[[#]]}"
   // CHECK-NEXT:   ]}
 
@@ -2046,7 +2045,6 @@ void print_state(std::vector<int> &V) {
   // CHECK:      "checker_messages": [
   // CHECK:   { "checker": "alpha.cplusplus.IteratorModeling", "messages": [
   // CHECK-NEXT:     "Iterator Positions :",
-  // CHECK-NEXT:     "conj_$[[#]]{int, LC[[#]], S[[#]], #[[#]]} : Valid ; Container == SymRegion{reg_$[[#]]<std::vector<int> & V>} ; Offset == conj_$[[#]]{long, LC[[#]], S[[#]], #[[#]]}",
   // CHECK-NEXT:     "i1 : Valid ; Container == SymRegion{reg_$[[#]]<std::vector<int> & V>} ; Offset == conj_$[[#]]{long, LC[[#]], S[[#]], #[[#]]}"
   // CHECK-NEXT:   ]}
 
diff --git a/clang/test/Analysis/stl-algorithm-modeling-aggressive-std-find-modeling.cpp b/clang/test/Analysis/stl-algorithm-modeling-aggressive-std-find-modeling.cpp
index 191af95cd2b9c..98301cf7274fc 100644
--- a/clang/test/Analysis/stl-algorithm-modeling-aggressive-std-find-modeling.cpp
+++ b/clang/test/Analysis/stl-algorithm-modeling-aggressive-std-find-modeling.cpp
@@ -4,16 +4,6 @@
 // RUN:  -analyzer-config alpha.cplusplus.STLAlgorithmModeling:AggressiveStdFindModeling=true\
 // RUN:  -verify
 
-// STLAlgorithmModeling and DebugIteratorModeling are probably bugged because
-// these tests started failing after we just directly copy the symbol
-// representing the value of a variable instead of creating a LazyCompoundVal
-// of that single conjured value.
-// In theory, it shouldn't matter if we eagerly copy the value that we would
-// "load" from the LCV once requested or just directly binding the backing symbol.
-// Yet, these tests fail, so there is likely messed up how/what the checker
-// metadata is associated with.
-// XFAIL: *
-
 #include "Inputs/system-header-simulator-cxx.h"
 
 void clang_analyzer_eval(bool);
diff --git a/clang/test/Analysis/stl-algorithm-modeling.cpp b/clang/test/Analysis/stl-algorithm-modeling.cpp
index f7029c79b0942..5549c24a8c220 100644
--- a/clang/test/Analysis/stl-algorithm-modeling.cpp
+++ b/clang/test/Analysis/stl-algorithm-modeling.cpp
@@ -3,16 +3,6 @@
 // RUN:  -analyzer-config aggressive-binary-operation-simplification=true\
 // RUN:  -verify
 
-// STLAlgorithmModeling and DebugIteratorModeling are probably bugged because
-// these tests started failing after we just directly copy the symbol
-// representing the value of a variable instead of creating a LazyCompoundVal
-// of that single conjured value.
-// In theory, it shouldn't matter if we eagerly copy the value that we would
-// "load" from the LCV once requested or just directly binding the backing symbol.
-// Yet, these tests fail, so there is likely messed up how/what the checker
-// metadata is associated with.
-// XFAIL: *
-
 #include "Inputs/system-header-simulator-cxx.h"
 
 void clang_analyzer_eval(bool);
diff --git a/clang/test/Analysis/store-dump-orders.cpp b/clang/test/Analysis/store-dump-orders.cpp
index dbe93f1c5183a..d99f581f00fe1 100644
--- a/clang/test/Analysis/store-dump-orders.cpp
+++ b/clang/test/Analysis/store-dump-orders.cpp
@@ -41,7 +41,7 @@ void test_output(int n) {
   // CHECK-NEXT:      { "kind": "Default", "offset": 0, "value": "conj_$
   // CHECK-NEXT:    ]},
   // CHECK-NEXT:    { "cluster": "objfirst", "pointer": "0x{{[0-9a-f]+}}", "items": [
-  // CHECK-NEXT:      { "kind": "Default", "offset": 0, "value": "conj_$
+  // CHECK-NEXT:      { "kind": "Default", "offset": 0, "value": "lazyCompoundVal
   // CHECK-NEXT:      { "kind": "Direct", "offset": 320, "value": "1 S32b" },
   // CHECK-NEXT:      { "kind": "Direct", "offset": 352, "value": "2 S32b" },
   // CHECK-NEXT:      { "kind": "Direct", "offset": 384, "value": "3 S32b" }
diff --git a/clang/test/Analysis/taint-generic.cpp b/clang/test/Analysis/taint-generic.cpp
index fc7c37300d3fc..4b8d9ab68ff84 100644
--- a/clang/test/Analysis/taint-generic.cpp
+++ b/clang/test/Analysis/taint-generic.cpp
@@ -158,7 +158,11 @@ void top() {
   clang_analyzer_isTainted(E); // expected-warning {{NO}}
 
   Aggr A = mySource1<Aggr>();
-  clang_analyzer_isTainted(A);      // expected-warning {{YES}}
+  // FIXME Ideally, both A and A.data should be tainted. However, the
+  //       implementation used by e5ac9145ba29 ([analyzer][taint] Recognize
+  //       tainted LazyCompoundVals (4/4) (#115919), 2024-11-15) led to FPs and
+  //       FNs in various scenarios and had to be reverted to fix #153782.
+  clang_analyzer_isTainted(A);      // expected-warning {{NO}}
   clang_analyzer_isTainted(A.data); // expected-warning {{YES}}
 }
 } // namespace gh114270
diff --git a/clang/test/Analysis/template-param-objects.cpp b/clang/test/Analysis/template-param-objects.cpp
index b065f8756d4d8..dde95fa62cb65 100644
--- a/clang/test/Analysis/template-param-objects.cpp
+++ b/clang/test/Analysis/template-param-objects.cpp
@@ -11,7 +11,7 @@ bool operator ==(Box lhs, Box rhs) {
   return lhs.value == rhs.value;
 }
 template <Box V> void dumps() {
-  clang_analyzer_dump(V);        // expected-warning {{Unknown}}
+  clang_analyzer_dump(V);        // expected-warning {{lazyCompoundVal}}
   clang_analyzer_dump(&V);       // expected-warning {{Unknown}}
   clang_analyzer_dump(V.value);  // expected-warning {{Unknown}} FIXME: It should be '6 S32b'.
   clang_analyzer_dump(&V.value); // expected-warning {{Unknown}}

@mantognini mantognini requested a review from steakhal October 14, 2025 21:54
@steakhal
Copy link
Contributor

The revert itself looks correct. I hate the fact that we need to revert these.

The problem essentially boils down to the following:
We have some object, and we refer to a field of it.
Then we copy the object and refer to the field of the copy; and reading from that field should result in the same value as if the read happened on the original field.
This assumption broke with the patches that now gets reverted.

For example:

namespace gh153782 {
struct Status { int error; };
Status getError();

Status setPtr(int **outptr, int* ptr) {
  Status e = getError();
  if (e.error != 0) return e; // When assuming the error field is non-zero,
  *outptr = ptr;              // this is not executed
  return e;
}

int func() {
  int *ptr = nullptr;
  int x = 42;
  if (setPtr(&ptr, &x).error == 0) {
    clang_analyzer_dump_val(ptr); // expected-warning {{&x}}
    return *ptr;
  }
  return 0;
}

} // namespace gh153782

In this example, withing setPtr we constrain e.error, then we return e and copy it into the temporary at the callsite.
So the setPtr(&ptr, &x).error should return the same symbol that was previously constrained.

Why is it different then?
The value of e was default binding of conjured symbol from getError(), so e.error was a slice of that one, represented by SymbolDerived{conj, e.error}.
The return value of setPtr() is a temporary, thus represented by CXXTempObjectRegion. That is the region the call return should copy into in the return e; statement. Previously, this was represented by a LazyCompoundVal{e}. With the LCV optimizations, this was simplified into conj directly.

Consequently, when later the error field was accessed on the temporary, then SymbolDerived{conj, temp_object{}.error} was created Notice that the MemRegion is different!, so the Profile() of these will be different, so we have different derived symbols...

In theory, both e.error and temp_object{}.error should boil down into the same begin offset and extent (in other words bit sequence) of the same conjured symbol. So these should mean the same thing!

Changing SymbolDerived by replacing the MemRegion with an offset-extent pair, would be challenging.

How did this work with LCVs before?
It seems like in clang-19, the LCV{e} was desugared into SymbolDerived{conj, e.error}. I suppose it was possible to do because LCVs were handled specially, so when loading from the temp_object it didn't create a SymbolDerived{conj, temp_object.error}, but SymbolDerived{conj, e.error} instead because it knew that the LCV it was traversing was for e.
After the LCV optimization, we no longer have the origin region e, so this custom handling of LCVs didn't kick in, so we end up with a brand new SymbolDerived. I have not checked this theory, so treat it with a pinch of salt.

I'm pretty sure we could fix this, but I do not have the time right now. And to be honest, I don't think I'd invest here even in the future. So for now, let's just do the revert :|


Some extra notes about the other regression tests on how they are similar to the case before:

namespace mutually_exclusive_test_case_3 {
struct StorageWrapper {
  // Imagine those two call a reset() function (among other things)
  ~StorageWrapper() { delete parts; }
  StorageWrapper(StorageWrapper const&) = default;

  // Mind that there is no assignment here -- this is the bug we would like to find.
  void operator=(StorageWrapper&&) { delete parts; } // newdelete-warning{{Attempt to release already released memory}}

  // Not provided, typically would do `parts = new long`.
  StorageWrapper();

  long* parts;
};

struct TestDoubleFreeWithInitializerList {
  StorageWrapper* Object;
  TestDoubleFreeWithInitializerList()
  : Object(new StorageWrapper[]{StorageWrapper()}) {
    Object[0] = StorageWrapper(); // This assignment leads to the double-free.
  }
};
} // mutually_exclusive_test_case_3

} // namespace gh153782

In the new StorageWrapper[]{StorageWrapper()} we create a temporary that we copy into the heap allocated array, then the temporary dies and its dtor deletes its parts pointer. Consequently, the copy on the heap will have a dangling pointer.
Then Object[0] = StorageWrapper(); is just a fancy way of spelling delete Object[0].parts, which would be the second time that the same parts pointer is deleted, leading to the double free.
The point is, that we end up with different symbols for the parts subobject at the two delete expressions:
derived_$8 {conj_$7,temp_object{struct StorageWrapper, S2121}.parts} and
derived_$12{conj_$7,Element{HeapSymRegion{conj_$3},0 S64b,struct StorageWrapper}.parts}
The idea is the same.

@steakhal steakhal requested review from NagyDonat and haoNoQ October 15, 2025 13:09
@steakhal
Copy link
Contributor

@haoNoQ I'd be interested in your opinion. I tried to sum up my investigation, but feel free to ask questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:static analyzer clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[analyzer] False Positives due to conflicting assumptions in callee and caller about returned value's field

3 participants