Skip to content

Conversation

steakhal
Copy link
Contributor

@steakhal steakhal commented Nov 9, 2024

Dump the memory space clusters before the other clusters, in alphabetical order. Then default bindings over direct bindings, and if any has symbolic offset, then those should come before the ones with concrete offsets.
In theory, we should either have a symbolic offset OR concrete offsets, but never both at the same time.

Needed for #114835

Dump the memory space clusters before the other clusters, in
alphabetical order. Then default bindings over direct bindings, and if
any has symbolic offset, then those should come before the ones with
concrete offsets.
In theory, we should either have a symbolic offset OR concrete offsets,
but never both at the same time.
@steakhal steakhal requested a review from NagyDonat November 9, 2024 18:31
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Nov 9, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 9, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Balazs Benics (steakhal)

Changes

Dump the memory space clusters before the other clusters, in alphabetical order. Then default bindings over direct bindings, and if any has symbolic offset, then those should come before the ones with concrete offsets.
In theory, we should either have a symbolic offset OR concrete offsets, but never both at the same time.

Needed for #114835


Full diff: https://github.com/llvm/llvm-project/pull/115615.diff

1 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Core/RegionStore.cpp (+73-13)
diff --git a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
index 674099dd7e1f0f..6bad9a93a30169 100644
--- a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
+++ b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
@@ -67,9 +67,10 @@ class BindingKey {
             isa<ObjCIvarRegion, CXXDerivedObjectRegion>(r)) &&
            "Not a base");
   }
-public:
 
+public:
   bool isDirect() const { return P.getInt() & Direct; }
+  bool isDefault() const { return !isDirect(); }
   bool hasSymbolicOffset() const { return P.getInt() & Symbolic; }
 
   const MemRegion *getRegion() const { return P.getPointer(); }
@@ -232,27 +233,86 @@ class RegionBindingsRef : public llvm::ImmutableMapRef<const MemRegion *,
 
   void printJson(raw_ostream &Out, const char *NL = "\n",
                  unsigned int Space = 0, bool IsDot = false) const {
-    for (iterator I = begin(), E = end(); I != E; ++I) {
-      // TODO: We might need a .printJson for I.getKey() as well.
+    using namespace llvm;
+    DenseMap<const MemRegion *, std::string> StringifyCache;
+    auto ToString = [&StringifyCache](const MemRegion *R) {
+      auto [Place, Inserted] = StringifyCache.try_emplace(R);
+      if (!Inserted)
+        return Place->second;
+      std::string Res;
+      raw_string_ostream OS(Res);
+      OS << R;
+      Place->second = Res;
+      return Res;
+    };
+
+    using Cluster =
+        std::pair<const MemRegion *, ImmutableMap<BindingKey, SVal>>;
+    using Binding = std::pair<BindingKey, SVal>;
+
+    const auto MemSpaceBeforeRegionName = [&ToString](const Cluster *L,
+                                                      const Cluster *R) {
+      if (isa<MemSpaceRegion>(L->first) && !isa<MemSpaceRegion>(R->first))
+        return true;
+      if (!isa<MemSpaceRegion>(L->first) && isa<MemSpaceRegion>(R->first))
+        return false;
+      return ToString(L->first) < ToString(R->first);
+    };
+
+    const auto SymbolicBeforeOffset = [&ToString](const BindingKey &L,
+                                                  const BindingKey &R) {
+      if (L.hasSymbolicOffset() && !R.hasSymbolicOffset())
+        return true;
+      if (!L.hasSymbolicOffset() && R.hasSymbolicOffset())
+        return false;
+      if (L.hasSymbolicOffset() && R.hasSymbolicOffset())
+        return ToString(L.getRegion()) < ToString(R.getRegion());
+      return L.getOffset() < R.getOffset();
+    };
+
+    const auto DefaultBindingBeforeDirectBindings =
+        [&SymbolicBeforeOffset](const Binding *LPtr, const Binding *RPtr) {
+          const BindingKey &L = LPtr->first;
+          const BindingKey &R = RPtr->first;
+          if (L.isDefault() && !R.isDefault())
+            return true;
+          if (!L.isDefault() && R.isDefault())
+            return false;
+          assert(L.isDefault() == R.isDefault());
+          return SymbolicBeforeOffset(L, R);
+        };
+
+    const auto AddrOf = [](const auto &Item) { return &Item; };
+
+    std::vector<const Cluster *> SortedClusters;
+    SortedClusters.reserve(std::distance(begin(), end()));
+    append_range(SortedClusters, map_range(*this, AddrOf));
+    llvm::sort(SortedClusters, MemSpaceBeforeRegionName);
+
+    for (auto [Idx, C] : llvm::enumerate(SortedClusters)) {
+      const auto &[BaseRegion, Bindings] = *C;
       Indent(Out, Space, IsDot)
-          << "{ \"cluster\": \"" << I.getKey() << "\", \"pointer\": \""
-          << (const void *)I.getKey() << "\", \"items\": [" << NL;
+          << "{ \"cluster\": \"" << BaseRegion << "\", \"pointer\": \""
+          << (const void *)BaseRegion << "\", \"items\": [" << NL;
+
+      std::vector<const Binding *> SortedBindings;
+      SortedBindings.reserve(std::distance(Bindings.begin(), Bindings.end()));
+      append_range(SortedBindings, map_range(Bindings, AddrOf));
+      llvm::sort(SortedBindings, DefaultBindingBeforeDirectBindings);
 
       ++Space;
-      const ClusterBindings &CB = I.getData();
-      for (ClusterBindings::iterator CI = CB.begin(), CE = CB.end(); CI != CE;
-           ++CI) {
-        Indent(Out, Space, IsDot) << "{ " << CI.getKey() << ", \"value\": ";
-        CI.getData().printJson(Out, /*AddQuotes=*/true);
+      for (auto [Idx, B] : llvm::enumerate(SortedBindings)) {
+        const auto &[Key, Value] = *B;
+        Indent(Out, Space, IsDot) << "{ " << Key << ", \"value\": ";
+        Value.printJson(Out, /*AddQuotes=*/true);
         Out << " }";
-        if (std::next(CI) != CE)
+        if (Idx != SortedBindings.size() - 1)
           Out << ',';
         Out << NL;
       }
-
       --Space;
       Indent(Out, Space, IsDot) << "]}";
-      if (std::next(I) != E)
+      if (Idx != SortedClusters.size() - 1)
         Out << ',';
       Out << NL;
     }

void printJson(raw_ostream &Out, const char *NL = "\n",
unsigned int Space = 0, bool IsDot = false) const {
for (iterator I = begin(), E = end(); I != E; ++I) {
// TODO: We might need a .printJson for I.getKey() as well.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dropped this TODO. It's not relevant. The key is always a base memory region, which obeys the json rules.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, feel free to add these NFC improvements.

My only comment is a suggestion to define the comparison logic with a higher-level approach -- but the existing code is also OK if you prefer that.

@steakhal steakhal merged commit e05d91b into llvm:main Nov 12, 2024
5 of 7 checks passed
@steakhal steakhal deleted the bb/region-store-deterministic-dumps branch November 12, 2024 13:50
steakhal added a commit that referenced this pull request Nov 12, 2024
steakhal added a commit that referenced this pull request Nov 12, 2024
Reverts #115615

There are two problems with this PR:
1) If any of the dumps contains a store with a symbolic binding, we
crash.
2) The memory space clusters come last among the clusters, which is not
what I intended.

I'm reverting because of the crash.
steakhal added a commit that referenced this pull request Nov 12, 2024
This is reapplies #115615 without using tuples. The eager call of
`getRegion()` and `getOffset()` could cause crashes when the Store had
symbolic bindings.

Here I'm fixing the crash by lazily calling those getters.

Also, the tuple version poorly sorted the Clusters. The memory spaces
should have come before the regular clusters.
Now, that is also fixed here, demonstrated by the test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:static analyzer clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants