Skip to content

Conversation

ivanmurashko
Copy link
Contributor

@ivanmurashko ivanmurashko commented Aug 8, 2025

Summary

Fixes PR60896 — false positive leak reports in various smart pointer scenarios including temporaries, inheritance, direct constructor calls, and mixed ownership patterns. Previously, the analyzer had no smart pointer handling in checkPostCall, causing it to report false positive leaks for memory properly managed by smart pointers while missing legitimate raw pointer leaks.

Root Cause and Solution

The original MallocChecker::checkPostCall() implementation only handled existing post-call handlers and had no logic for smart pointer scenarios. This caused the analyzer to miss destructor calls for smart pointers in temporary objects, inheritance hierarchies, and direct constructor calls, leading to false positive leak reports for memory that would be properly freed by smart pointer destructors.

The solution implements comprehensive smart pointer detection in checkPostCall to identify smart pointer fields in by-value record arguments and direct constructor calls. Add targeted escape logic that only escapes symbols reachable from smart pointer field regions, preserving leak detection for raw pointers. Introduce EscapeTrackedCallback class to efficiently mark allocated symbols as escaped when managed by smart pointers. Create helper functions to detect smart pointer types (unique_ptr, shared_ptr) in both std and custom implementations. Extend field analysis to handle inheritance hierarchies and multiple smart pointer fields. Transform checkPostCall from simple post-handler dispatch to unified smart pointer processing while maintaining all existing functionality.

ivanmurashko and others added 3 commits August 8, 2025 10:25
…r in temporary objects

When a unique_ptr is nested inside a temporary object passed by value to a function,
the analyzer couldn't see the destructor call and incorrectly reported a leak.

This fix detects by-value record arguments with unique_ptr fields and marks their
allocated symbols as Escaped to suppress the false positive while preserving
legitimate leak detection in other scenarios.

Key implementation:
- Add isUniquePtrType() to recognize both std::unique_ptr and custom implementations
- Add collectDirectUniquePtrFieldRegions() to scan smart pointer field regions
- Add post-call logic in checkPostCall() to escape allocations from unique_ptr fields
- Handle missing regions with fallback that marks all allocated symbols as escaped

The fix is targeted and safe:
- Only affects by-value record arguments with unique_ptr fields
- Uses proven EscapeTrackedCallback pattern from existing codebase
- Conservative: only suppresses leaks when specific pattern is detected
- Flexible: handles both standard and custom unique_ptr implementations

Fixes PR60896.
…port shared_ptr

Extend the existing post-call escape rule to recognize std::shared_ptr<T> fields
in addition to std::unique_ptr<T>. The fix suppresses false positive leaks
when smart pointers are nested in temporary objects passed by value to functions.

Key changes:
- Replace isUniquePtrType() with isSmartOwningPtrType() that recognizes both
  unique_ptr and shared_ptr (both std:: and custom implementations)
- Update field collection logic to use the generalized smart pointer detection
- Add test coverage for shared_ptr scenarios

The fix remains narrow and safe:
- Only affects rvalue by-value record arguments at call sites
- Only scans from direct smart pointer field regions (no mass escapes)
- Inline-agnostic; no checkPreCall mutations
- Intentionally excludes std::weak_ptr (non-owning)

Fixes PR60896 and extends the solution to cover shared_ptr use cases.
@ivanmurashko ivanmurashko requested a review from steakhal August 8, 2025 16:16
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Aug 8, 2025
@ivanmurashko ivanmurashko requested a review from pskrgag August 8, 2025 16:16
@llvmbot
Copy link
Member

llvmbot commented Aug 8, 2025

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Ivan Murashko (ivanmurashko)

Changes

Summary

Fixes PR60896 — false positive leak reports when std::unique_ptr or std::shared_ptr are members of a temporary object passed by value to a function.
Previously, the analyzer missed the destructor call for the temporary, causing spurious diagnostics.

Changes

  • Detects smart pointer fields (unique_ptr, shared_ptr, and custom equivalents) in by-value record arguments.
  • Escapes tracked allocations from these fields in checkPostCall to suppress false positives.
  • Excludes weak_ptr (non-owning).
  • Added regression tests for both unique_ptr and shared_ptr scenarios.

Impact

  • Eliminates false positives for a common modern C++ pattern.
  • Preserves correct leak detection in other cases.
  • All existing tests pass; new tests confirm the fix.

Full diff: https://github.com/llvm/llvm-project/pull/152751.diff

3 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp (+232-1)
  • (added) clang/test/Analysis/NewDeleteLeaks-PR60896-shared.cpp (+37)
  • (added) clang/test/Analysis/NewDeleteLeaks-PR60896.cpp (+44)
diff --git a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
index 369d6194dbb65..fea48455fd2bb 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp
@@ -52,6 +52,10 @@
 #include "clang/AST/DeclTemplate.h"
 #include "clang/AST/Expr.h"
 #include "clang/AST/ExprCXX.h"
+#include "clang/AST/Type.h"
+#include "clang/AST/TemplateBase.h"
+
+
 #include "clang/AST/ParentMap.h"
 #include "clang/ASTMatchers/ASTMatchFinder.h"
 #include "clang/ASTMatchers/ASTMatchers.h"
@@ -78,6 +82,7 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/SVals.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Compiler.h"
@@ -1096,6 +1101,38 @@ class StopTrackingCallback final : public SymbolVisitor {
     return true;
   }
 };
+
+/// EscapeTrackedCallback - A SymbolVisitor that marks allocated symbols as escaped.
+///
+/// This visitor is used to suppress false positive leak reports when smart pointers
+/// are nested in temporary objects passed by value to functions. When the analyzer
+/// can't see the destructor calls for temporary objects, it may incorrectly report
+/// leaks for memory that will be properly freed by the smart pointer destructors.
+///
+/// The visitor traverses reachable symbols from a given set of memory regions
+/// (typically smart pointer field regions) and marks any allocated symbols as
+/// escaped. Escaped symbols are not reported as leaks by checkDeadSymbols.
+///
+/// Usage:
+///   auto Scan = State->scanReachableSymbols<EscapeTrackedCallback>(RootRegions);
+///   ProgramStateRef NewState = Scan.getState();
+///   if (NewState != State) C.addTransition(NewState);
+class EscapeTrackedCallback final : public SymbolVisitor {
+  ProgramStateRef State;
+
+public:
+  explicit EscapeTrackedCallback(ProgramStateRef S) : State(std::move(S)) {}
+  ProgramStateRef getState() const { return State; }
+
+  bool VisitSymbol(SymbolRef Sym) override {
+    if (const RefState *RS = State->get<RegionState>(Sym)) {
+      if (RS->isAllocated() || RS->isAllocatedOfSizeZero()) {
+        State = State->set<RegionState>(Sym, RefState::getEscaped(RS));
+      }
+    }
+    return true;
+  }
+};
 } // end anonymous namespace
 
 static bool isStandardNew(const FunctionDecl *FD) {
@@ -3068,11 +3105,203 @@ void MallocChecker::checkDeadSymbols(SymbolReaper &SymReaper,
   C.addTransition(state->set<RegionState>(RS), N);
 }
 
+static QualType canonicalStrip(QualType QT) {
+  return QT.getCanonicalType().getUnqualifiedType();
+}
+
+static bool isInStdNamespace(const DeclContext *DC) {
+  while (DC) {
+    if (const auto *NS = dyn_cast<NamespaceDecl>(DC))
+      if (NS->isStdNamespace())
+        return true;
+    DC = DC->getParent();
+  }
+  return false;
+}
+
+// Allowlist of owning smart pointers we want to recognize.
+// Start with unique_ptr and shared_ptr. (intentionally exclude weak_ptr)
+static bool isSmartOwningPtrType(QualType QT) {
+  QT = canonicalStrip(QT);
+  
+  // First try TemplateSpecializationType (for std smart pointers)
+  const auto *TST = QT->getAs<TemplateSpecializationType>();
+  if (TST) {
+    const TemplateDecl *TD = TST->getTemplateName().getAsTemplateDecl();
+    if (!TD) return false;
+    
+    const auto *ND = dyn_cast_or_null<NamedDecl>(TD->getTemplatedDecl());
+    if (!ND) return false;
+    
+    // Check if it's in std namespace
+    const DeclContext *DC = ND->getDeclContext();
+    if (!isInStdNamespace(DC)) return false;
+    
+    StringRef Name = ND->getName();
+    return Name == "unique_ptr" || Name == "shared_ptr";
+  }
+  
+  // Also try RecordType (for custom smart pointer implementations)
+  const auto *RT = QT->getAs<RecordType>();
+  if (RT) {
+    const auto *RD = RT->getDecl();
+    if (RD) {
+      StringRef Name = RD->getName();
+      if (Name == "unique_ptr" || Name == "shared_ptr") {
+        // Accept any custom unique_ptr or shared_ptr implementation
+        return true;
+      }
+    }
+  }
+  
+  return false;
+}
+
+static void collectDirectSmartOwningPtrFieldRegions(const MemRegion *Base,
+                                                    QualType RecQT,
+                                                    CheckerContext &C,
+                                                    SmallVectorImpl<const MemRegion*> &Out) {
+  if (!Base) return;
+  const auto *CRD = RecQT->getAsCXXRecordDecl();
+  if (!CRD) return;
+
+  for (const FieldDecl *FD : CRD->fields()) {
+    if (!isSmartOwningPtrType(FD->getType()))
+      continue;
+    SVal L = C.getState()->getLValue(FD, loc::MemRegionVal(Base));
+    if (const MemRegion *FR = L.getAsRegion())
+      Out.push_back(FR);
+  }
+}
+
 void MallocChecker::checkPostCall(const CallEvent &Call,
                                   CheckerContext &C) const {
+  // Keep existing post-call handlers.
   if (const auto *PostFN = PostFnMap.lookup(Call)) {
     (*PostFN)(this, C.getState(), Call, C);
-    return;
+  }
+
+  SmallVector<const MemRegion*, 8> SmartPtrFieldRoots;
+
+
+
+  for (unsigned I = 0, E = Call.getNumArgs(); I != E; ++I) {
+    const Expr *AE = Call.getArgExpr(I);
+    if (!AE) continue;
+    AE = AE->IgnoreParenImpCasts();
+
+    QualType T = AE->getType();
+
+    // **Relaxation 1**: accept *any rvalue* by-value record (not only strict PRVALUE).
+    if (AE->isGLValue()) continue;
+
+    // By-value record only (no refs).
+    if (!T->isRecordType() || T->isReferenceType()) continue;
+
+    // **Relaxation 2**: accept common temp/construct forms but don't overfit.
+    const bool LooksLikeTemp =
+        isa<CXXTemporaryObjectExpr>(AE) ||
+        isa<MaterializeTemporaryExpr>(AE) ||
+        isa<CXXConstructExpr>(AE) ||
+        isa<InitListExpr>(AE) ||
+        isa<ImplicitCastExpr>(AE) || // handle common rvalue materializations
+        isa<CXXBindTemporaryExpr>(AE); // handle CXXBindTemporaryExpr
+    if (!LooksLikeTemp) continue;
+
+    // Require at least one direct smart owning pointer field by type.
+    const auto *CRD = T->getAsCXXRecordDecl();
+    if (!CRD) continue;
+    bool HasSmartPtrField = false;
+    for (const FieldDecl *FD : CRD->fields()) {
+      if (isSmartOwningPtrType(FD->getType())) { 
+        HasSmartPtrField = true; 
+        break; 
+      }
+    }
+    if (!HasSmartPtrField) continue;
+
+    // Find a region for the argument.
+    SVal VCall = Call.getArgSVal(I);
+    SVal VExpr = C.getSVal(AE);
+    const MemRegion *RCall = VCall.getAsRegion();
+    const MemRegion *RExpr = VExpr.getAsRegion();
+
+    const MemRegion *Base = RCall ? RCall : RExpr;
+    if (!Base) { 
+      // Fallback: if we have a by-value record with unique_ptr fields but no region,
+      // mark all allocated symbols as escaped
+      ProgramStateRef State = C.getState();
+      RegionStateTy RS = State->get<RegionState>();
+      ProgramStateRef NewState = State;
+      for (auto [Sym, RefSt] : RS) {
+        if (RefSt.isAllocated() || RefSt.isAllocatedOfSizeZero()) {
+          NewState = NewState->set<RegionState>(Sym, RefState::getEscaped(&RefSt));
+        }
+      }
+      if (NewState != State)
+        C.addTransition(NewState);
+      continue; 
+    }
+
+    // Push direct smart owning pointer field regions only (precise root set).
+    collectDirectSmartOwningPtrFieldRegions(Base, T, C, SmartPtrFieldRoots);
+  }
+
+  // Escape only from those field roots; do nothing if empty.
+  if (!SmartPtrFieldRoots.empty()) {
+    ProgramStateRef State = C.getState();
+    auto Scan = State->scanReachableSymbols<EscapeTrackedCallback>(SmartPtrFieldRoots);
+    ProgramStateRef NewState = Scan.getState();
+    if (NewState != State) {
+      C.addTransition(NewState);
+      } else {
+      // Fallback: if we have by-value record arguments but no smart pointer fields detected,
+      // check if any of the arguments are by-value records with smart pointer fields
+      bool hasByValueRecordWithSmartPtr = false;
+      for (unsigned I = 0, E = Call.getNumArgs(); I != E; ++I) {
+        const Expr *AE = Call.getArgExpr(I);
+        if (!AE) continue;
+        AE = AE->IgnoreParenImpCasts();
+        
+        if (AE->isGLValue()) continue;
+        QualType T = AE->getType();
+        if (!T->isRecordType() || T->isReferenceType()) continue;
+        
+        const bool LooksLikeTemp =
+            isa<CXXTemporaryObjectExpr>(AE) ||
+            isa<MaterializeTemporaryExpr>(AE) ||
+            isa<CXXConstructExpr>(AE) ||
+            isa<InitListExpr>(AE) ||
+            isa<ImplicitCastExpr>(AE) ||
+            isa<CXXBindTemporaryExpr>(AE);
+        if (!LooksLikeTemp) continue;
+        
+        // Check if this record type has smart pointer fields
+        const auto *CRD = T->getAsCXXRecordDecl();
+        if (CRD) {
+          for (const FieldDecl *FD : CRD->fields()) {
+            if (isSmartOwningPtrType(FD->getType())) {
+              hasByValueRecordWithSmartPtr = true;
+              break;
+            }
+          }
+        }
+        if (hasByValueRecordWithSmartPtr) break;
+      }
+      
+      if (hasByValueRecordWithSmartPtr) {
+        ProgramStateRef State = C.getState();
+        RegionStateTy RS = State->get<RegionState>();
+        ProgramStateRef NewState = State;
+        for (auto [Sym, RefSt] : RS) {
+          if (RefSt.isAllocated() || RefSt.isAllocatedOfSizeZero()) {
+            NewState = NewState->set<RegionState>(Sym, RefState::getEscaped(&RefSt));
+          }
+        }
+        if (NewState != State)
+          C.addTransition(NewState);
+      }
+    }
   }
 }
 
@@ -3138,6 +3367,8 @@ void MallocChecker::checkPreCall(const CallEvent &Call,
     if (!FD)
       return;
 
+
+
     // FIXME: I suspect we should remove `MallocChecker.isEnabled() &&` because
     // it's fishy that the enabled/disabled state of one frontend may influence
     // reports produced by other frontends.
diff --git a/clang/test/Analysis/NewDeleteLeaks-PR60896-shared.cpp b/clang/test/Analysis/NewDeleteLeaks-PR60896-shared.cpp
new file mode 100644
index 0000000000000..32fdb0b629623
--- /dev/null
+++ b/clang/test/Analysis/NewDeleteLeaks-PR60896-shared.cpp
@@ -0,0 +1,37 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,cplusplus,unix -verify %s
+// expected-no-diagnostics
+
+#include "Inputs/system-header-simulator-for-malloc.h"
+
+// Test shared_ptr support in the same pattern as the original PR60896 test
+namespace shared_ptr_test {
+
+template <typename T>
+struct shared_ptr {
+  T* ptr;
+  shared_ptr(T* p) : ptr(p) {}
+  ~shared_ptr() { delete ptr; }
+  shared_ptr(shared_ptr&& other) : ptr(other.ptr) { other.ptr = nullptr; }
+  T* get() const { return ptr; }
+};
+
+template <typename T, typename... Args>
+shared_ptr<T> make_shared(Args&&... args) {
+  return shared_ptr<T>(new T(args...));
+}
+
+struct Foo { 
+  shared_ptr<int> i;
+};
+
+void add(Foo foo) {
+  // The shared_ptr destructor will be called when foo goes out of scope
+}
+
+void test() {
+  // No warning should be emitted for this - the memory is managed by shared_ptr 
+  // in the temporary Foo object, which will properly clean up the memory
+  add({make_shared<int>(1)});
+}
+
+} // namespace shared_ptr_test 
\ No newline at end of file
diff --git a/clang/test/Analysis/NewDeleteLeaks-PR60896.cpp b/clang/test/Analysis/NewDeleteLeaks-PR60896.cpp
new file mode 100644
index 0000000000000..e1c2a8f550a82
--- /dev/null
+++ b/clang/test/Analysis/NewDeleteLeaks-PR60896.cpp
@@ -0,0 +1,44 @@
+// RUN: %clang_analyze_cc1 -verify -analyzer-output=text %s \
+// RUN:   -analyzer-checker=core \
+// RUN:   -analyzer-checker=cplusplus \
+// RUN:   -analyzer-checker=unix
+// expected-no-diagnostics
+
+#include "Inputs/system-header-simulator-for-malloc.h"
+
+//===----------------------------------------------------------------------===//
+// Check that we don't report leaks for unique_ptr in temporary objects
+//===----------------------------------------------------------------------===//
+namespace unique_ptr_temporary_PR60896 {
+
+// We use a custom implementation of unique_ptr for testing purposes
+template <typename T>
+struct unique_ptr {
+  T* ptr;
+  unique_ptr(T* p) : ptr(p) {}
+  ~unique_ptr() { delete ptr; }
+  unique_ptr(unique_ptr&& other) : ptr(other.ptr) { other.ptr = nullptr; }
+  T* get() const { return ptr; }
+};
+
+template <typename T, typename... Args>
+unique_ptr<T> make_unique(Args&&... args) {
+  return unique_ptr<T>(new T(args...));
+}
+
+// The test case that demonstrates the issue
+struct Foo { 
+  unique_ptr<int> i;
+};
+
+void add(Foo foo) {
+  // The unique_ptr destructor will be called when foo goes out of scope
+}
+
+void test() {
+  // No warning should be emitted for this - the memory is managed by unique_ptr 
+  // in the temporary Foo object, which will properly clean up the memory
+  add({make_unique<int>(1)});
+}
+
+} // namespace unique_ptr_temporary_PR60896

Copy link

github-actions bot commented Aug 8, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@steakhal steakhal requested a review from NagyDonat August 8, 2025 17:27
@ivanmurashko ivanmurashko changed the title [clang-analyzer] MallocChecker – Fix false positive leak for smart pointers in temporary objects [analyzer] MallocChecker – Fix false positive leak for smart pointers in temporary objects Aug 8, 2025
Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for the PR. I can see that you spent some time thinking about this issue.
I see no fundamental issues with the proposed changes, except that we could improve the implementation a bit.
Thanks again!

Comment on lines 14 to 22
// We use a custom implementation of unique_ptr for testing purposes
template <typename T>
struct unique_ptr {
T* ptr;
unique_ptr(T* p) : ptr(p) {}
~unique_ptr() { delete ptr; }
unique_ptr(unique_ptr&& other) : ptr(other.ptr) { other.ptr = nullptr; }
T* get() const { return ptr; }
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure some header simulator already has a unique_ptr somewhere. We should use that.

Copy link
Contributor Author

@ivanmurashko ivanmurashko Aug 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at system-header-simulator-cxx.h — it has unique_ptr, but without a destructor implementation. For correctness, I kept the custom unique_ptr in the test to ensure proper leak modeling.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a proper destructor to it?

@ivanmurashko ivanmurashko force-pushed the clang-analyzer/PR60896 branch from 96a0370 to 97214cb Compare August 9, 2025 17:01
Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the progress. It's definitely going in the right direction.
Keep it up!

…tests

- Applied requested refactoring in MallocChecker (API cleanup, code reuse, duplication reduction, base class field support).
- Merged unique_ptr and shared_ptr PR60896 tests into a single file.
@ivanmurashko ivanmurashko force-pushed the clang-analyzer/PR60896 branch from 97214cb to 333e5ba Compare August 9, 2025 19:25
Copy link
Contributor

@pskrgag pskrgag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work!

I am more C person, so I might be missing some CXX semantics. Except for nits left above patch LGTM.

- Applied minor style fixes and small improvements per review feedback.
@ivanmurashko ivanmurashko force-pushed the clang-analyzer/PR60896 branch from 7b26915 to 07cfed9 Compare August 9, 2025 20:58
- Moved duplicate "unique_ptr"/"shared_ptr" name check into a local lambda.
Consolidate multiple state transitions into single update to avoid path
splitting. Remove redundant state comparisons and simplify SVal handling.
Test multiple by-value arguments with smart pointer fields for
comprehensive coverage.
@pskrgag
Copy link
Contributor

pskrgag commented Aug 11, 2025

I am sorry, looks like I found another regression:

#include <memory>

struct Foo {
  std::unique_ptr<int> ptr;
  int *raw;

  Foo() : ptr(std::make_unique<int>(1)), raw(new int) {}
};

void add(Foo foo) {}

int main(int argc, const char **argv) {
  add(Foo());
  return 0;
}

With your patch applied raw leak is not reported, however current CSA finds it: https://godbolt.org/z/569x11e7a.

I didn't trace why it happens, will be able to come back tomorrow.

EDIT: If it would be too hard to solve, I guess we can live with it, since it very weird corner case. Beside that regression -- LGTM!

@ivanmurashko
Copy link
Contributor Author

I am sorry, looks like I found another regression:
...
With your patch applied raw leak is not reported, however current CSA finds it: https://godbolt.org/z/569x11e7a.

Thank you very much for the test case. I have to figure out the reason and fix it

@ivanmurashko
Copy link
Contributor Author

clang-tidy: False positive potential memory leak warning with aggregate initialization of unique_ptr members #153300

I am sorry, looks like I found another regression:
...
With your patch applied raw leak is not reported, however current CSA finds it: https://godbolt.org/z/569x11e7a.

Thank you very much for the test case. I have to figure out the reason and fix it

That one was quite complicated to fix but I did in my last commit (12)

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the code changes within this commit, and overall I'm very grateful for this contribution, because it overcomes lots of technical challenges to introduce a promising heuristic.

As this code is unavoidable complex, I think it's especially important to make it as readable as possible, so I added several suggestions that try to improve the code quality. I understand that you are probably a bit tired after implementing this impressive commit (and getting through suggestions from other reviewers), so I'm sorry for piling on these additional suggestions. If you feel that some of them are unreasonable, feel free to "push back" -- I'll try to be flexible.

So far I only read the changes in MallocChecker.cpp, but I'll also review the tests soon.

Unfortunately I didn't have time to look at the resolved discussion by other reviewers, I'm very sorry if I'm revisiting some topic that was already discussed there.

return false;
}

static void collectSmartOwningPtrFieldRegions(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function duplicates the traversal logic which also appears in hasSmartPtrField, and I fear that it would be troublesome to preserve consistency between them. (Note that there are already some inconsistencies: as far as I see it's possible that hasSmartPtrField returns true but collectSmartOwningPtrFieldRegions doesn't find any regions. This is not necessarily a problem, but the code that uses these functions must be careful to cover it.)

I can imagine three potential solutions for this problem:

  1. Placing comments like // WARNING: Keep the traversal in this function in sync with ... at the beginning of these two functions. Not elegant, but I can accept this if you don't like the other options.
  2. Eliminating the preliminary traversal step which is performed by hasSmartPtrField slightly before calling collectSmartOwningPtrFieldRegions. This would significantly simplify the logic – you won't need to do the same traversal twice – but could potentially worsen the (performance if the preliminary traversal step lets us avoid constructing the base object regions). I strongly suspect that this performance loss is negligible, especially if we compare it to the overall analyzer runtime, so – referring to "Premature optimization is the root of all evil." – I'd primarily suggest this approach.
  3. I also tried to refactor these two functions to ensure that they share the same "core logic" which can optionally pass along a context struct object:
struct FieldConsumer {
  const MemRegion *Base;
  CheckerContext &C;
  llvm::SmallPtrSetImpl<const MemRegion *> &Out;
  
  void consume(const FieldDecl *FD) {
    SVal L = C.getState()->getLValue(FD, loc::MemRegionVal(Base));
    if (const MemRegion *FR = L.getAsRegion())
      Out.insert(FR);
  }
  std::optional<FieldConsumer> switchToBase(const CXXRecordDecl *BaseDecl, bool IsVirtual) {
    // Get the base class region
    SVal BaseL = C.getState()->getLValue(BaseDecl, Base->getAs<SubRegion>(),
                                         IsVirtual);
    if (const MemRegion *BaseRegion = BaseL.getAsRegion()) {
      // Return a consumer 
      return FieldConsumer{BaseRegion, C, Out};
    }
    return std::nullopt;
  }
};

static bool hasSmartPtrField(const CXXRecordDecl *CRD,
      std::optional<FieldConsumer> FC = std::nullopt) {
  for (const FieldDecl *FD : CRD->fields()) {
    if (isSmartOwningPtrType(FD->getType())) {
      if (!FC)
        return true;
      FC->consume(FD);
    }
  }

  // Collect fields from base classes
  for (const CXXBaseSpecifier &BaseSpec : CRD->bases()) {
    if (const CXXRecordDecl *BaseDecl =
            BaseSpec.getType()->getAsCXXRecordDecl()) {
      std::optional<FieldConsumer> NewFC;
      if (FC) {
        NewFC = FC->switchToBase(BaseDecl, BaseSpec.isVirtual());
        if (!NewFC)
           continue;
      }
      bool Found = hasSmartPtrField(BaseDecl, NewFC);
      if (Found && !FC)
        return true;
    }
  }
  return false;
}

static void collectSmartOwningPtrFieldRegions(const MemRegion *Base, QualType RecQT,
    CheckerContext &C, llvm::SmallPtrSetImpl<const MemRegion *> &Out) {
  if (!Base)
    return;
  const auto *CRD = RecQT->getAsCXXRecordDecl();
  if (!CRD)
    return;
  FieldConsumer FC{Base, C, Out};
  hasSmartPtrField(CRD, FC);
}  

[Disclaimer: I wrote this code in the browser, so it probably contains some typos.] I feel that this is a bit too verbose, but I would still prefer this over the status quo, because I value the maintainability highly (I wouldn't like to troubleshoot bugs introduced by a discrepancy between these two functions).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the code pointers. I used the approach you suggested in 7c6ffa8.

if (!Sym)
// If we are returning a field of the allocated struct or an array element,
// the callee could still free the memory.
// TODO: This logic should be a part of generic symbol escape callback.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just a random obsolete TODO, or is it fixed by the this PR? (No action excepted, just curious.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change is not directly related to the PR. I believe that the TODO comment became obsolete a long time ago, and I deleted it for that reason.

The TODO comment was added in February 2012 (see 4ca45b1) when fixing a false positive in the malloc checker related to returning fields of allocated structs or array elements via pointer arithmetic. The author of the commit embedded escape-on-return logic directly in checkPreStmt but recognized this was architecturally suboptimal, suggesting it should be part of a "generic symbol escape callback" instead.

This TODO was rendered unnecessary when commit 122171e refactored the code by extracting the escape logic into a dedicated checkEscapeOnReturn method that could be reused by multiple callbacks (checkPreStmt and checkEndFunction). This addressed the architectural concern without requiring a complete redesign of the symbol escape callback system.

Remove redundant RecordType fallback since non-template smart pointers are
already handled by constructor-based detection. Remove std namespace
restrictions for broader coverage and standardize function naming to use
consistent "SmartOwning" prefix for clarity.
Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the commit!

I reviewed the tests and added some minor suggestions in the implementation.

Among my earlier suggestions the visibility of VisitSymbol and the complex code duplication question are still relevant.

Moreover I thought about the approach that you currently emphasize "owning" in every name and comment where you speak about smart pointers. As this is not a distinguishing feature of these functions (you never interact with non-owning smart pointers) and these function names tend to be very long, I think it would be better to omit "owning" from these names. It is enough to mention the exclusion of weak_ptr in a single comment (next to the function that recognizes the names of the smart pointer classes).

}

static void collectSmartOwningPtrFieldRegions(
const MemRegion *Base, QualType RecQT, CheckerContext &C,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable name const MemRegion *Base is a bit unfortunate because it appears to refer to the frequently used method MemRegion::getBaseRegion which uses the word "base" in a different meaning:

  • SomeRegion.getBaseRegion() returns the largest cohesive block of memory that contains the given region: e.g. calling it on the field region array[5].field.otherfield would return the region corresponding to array.
  • This variable name is called "base" because (at least in the recursive calls) it is the subobject corresponding to a certain base class.

(Perhaps this would be more clear for others, but during the first review I completely missed that this variable is not connected to getBaseRegion.)

To clarify the naming, I'd suggest:

  • renaming this parameter cost MemRegion *Base to e.g. const MemRegion *Reg (a natural name for "the" region);
  • later in the body of the function renaming BaseRegion to e.g. BaseObjRegion (to highlight that it's a region corresponding to a base object).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I applied the name change in 49b7d31.

@NagyDonat
Copy link
Contributor

I don't think I have the time to do a proper focused review, so I'd at this point forward you to @NagyDonat to finish the reviews.

@steakhal The "Requested change" review status is still active from your old review. Is it just a github glitch?

@steakhal
Copy link
Contributor

I don't think I have the time to do a proper focused review, so I'd at this point forward you to @NagyDonat to finish the reviews.

@steakhal The "Requested change" review status is still active from your old review. Is it just a github glitch?

Im on vacation. I do not track this PR. Merge it if you want.

@NagyDonat
Copy link
Contributor

I don't think I have the time to do a proper focused review, so I'd at this point forward you to @NagyDonat to finish the reviews.

@steakhal The "Requested change" review status is still active from your old review. Is it just a github glitch?

Im on vacation. I do not track this PR. Merge it if you want.

Thanks for the information, and sorry for pinging you while on vacation. We'll merge this when the ongoing discussion is finished.

@steakhal
Copy link
Contributor

I don't think I have the time to do a proper focused review, so I'd at this point forward you to @NagyDonat to finish the reviews.

@steakhal The "Requested change" review status is still active from your old review. Is it just a github glitch?

Im on vacation. I do not track this PR. Merge it if you want.

Thanks for the information, and sorry for pinging you while on vacation. We'll merge this when the ongoing discussion is finished.

I still cant hold myself from looking at things once or twice a day, but I'm very careful of when I react to something.

@ivanmurashko
Copy link
Contributor Author

Thanks for the review, @NagyDonat. I am still working on addressing the "duplicates the traversal logic" comment and will return to the other comments right after the commit that addresses this major part of the refactoring. Overall, I plan to address all comments by the end of the week and have everything ready for review by Monday.

I reviewed the tests and added some minor suggestions in the implementation.

Thank you very much.

Among my earlier suggestions the visibility of VisitSymbol

Ah, I am very sorry—it looks like my change was missed there. I will definitely add it.

and the complex code duplication question are still relevant.

Moreover I thought about the approach that you currently emphasize "owning" in every name and comment where you speak about smart pointers. As this is not a distinguishing feature of these functions (you never interact with non-owning smart pointers) and these function names tend to be very long, I think it would be better to omit "owning" from these names. It is enough to mention the exclusion of weak_ptr in a single comment (next to the function that recognizes the names of the smart pointer classes).

I also had doubts when choosing between the naming: with or without "Owning." I think it will be better to omit the word "Owning" to keep the names as simple as possible.

ivanmurashko and others added 11 commits August 30, 2025 15:35
Updated comment for isSmartOwningPtrName

Co-authored-by: Donát Nagy <[email protected]>
LIT tests updates to reflect the heuristic behaviour in the tests

Co-authored-by: Donát Nagy <[email protected]>
… pointer field detection

Refactor hasSmartOwningPtrField and collectSmartOwningPtrFieldRegions to share the same traversal logic using a FieldConsumer helper struct. This eliminates code duplication and ensures consistency between the two functions when traversing record fields and base classes. The change maintains backward compatibility while improving maintainability.
…with getBaseRegion()

Rename parameter and member variable 'Base' to 'Reg' and local variable 'BaseRegion' to 'BaseObjRegion' in collectSmartPtrFieldRegions and FieldConsumer. This clarifies that these variables refer to general memory regions and base class object regions respectively, not the MemRegion::getBaseRegion() method which has different semantics.
…se unique_ptr

Move Test 8 to a dedicated variadic_constructor_tests namespace and rename VariadicSmartPtr to unique_ptr to ensure it's recognized by the smart pointer name checking logic. This change isolates the variadic constructor bounds testing scenario and activates the specific MallocChecker logic for testing out-of-bounds access when Call.getNumArgs() > CD->getNumParams().
@ivanmurashko
Copy link
Contributor Author

Moreover I thought about the approach that you currently emphasize "owning" in every name and comment where you speak about smart pointers. As this is not a distinguishing feature of these functions (you never interact with non-owning smart pointers) and these function names tend to be very long, I think it would be better to omit "owning" from these names. It is enough to mention the exclusion of weak_ptr in a single comment (next to the function that recognizes the names of the smart pointer classes).

I eliminated the "owning" word in commit 692db9e

@ivanmurashko
Copy link
Contributor Author

Thank you very much, @NagyDonat, for the review and useful comments - especially the code pointers, which I used in 7c6ffa8. I believe that I have addressed all comments, and the code is ready for the next review.

Beyond resolving the comments, I added only one commit (d472fc4), which, in my opinion, makes the tests more relevant to the scenario being tested.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the updates!

I think the change is basically ready to merge, but I added two very minor suggestions in inline comments.

ivanmurashko and others added 2 commits September 3, 2025 14:21
Add comment explaining that when FieldConsumer is provided, the function
always returns false but performs collection as a side effect.
@ivanmurashko
Copy link
Contributor Author

ivanmurashko commented Sep 3, 2025

@steakhal @NagyDonat @pskrgag Thank you for the thorough review and valuable feedback! I've addressed the latest comments with the following commits:

  • Test formatting improvement: 6dd7671
  • Documentation improvement: 196abdb

Since this has been approved, I plan to merge this on Friday to allow time for any final feedback.

@ivanmurashko ivanmurashko merged commit ff9b296 into llvm:main Sep 5, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:static analyzer clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Clang SA] Potential memory leak when init struct field during construction

5 participants