Skip to content

Conversation

@Michael137
Copy link
Member

Part of #149827

Allows us to use the mangling substitution facilities in CPlusPlusLanguage but also SymbolFileDWARF.

Added tests now that they're "public".

Part of llvm#149827

Allows us to use the mangling substitution facilities in
CPlusPlusLanguage but also SymbolFileDWARF.

Added tests now that they're "public".
@llvmbot
Copy link
Member

llvmbot commented Aug 26, 2025

@llvm/pr-subscribers-lldb

Author: Michael Buch (Michael137)

Changes

Part of #149827

Allows us to use the mangling substitution facilities in CPlusPlusLanguage but also SymbolFileDWARF.

Added tests now that they're "public".


Full diff: https://github.com/llvm/llvm-project/pull/155483.diff

4 Files Affected:

  • (modified) lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp (+188-134)
  • (modified) lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.h (+13)
  • (modified) lldb/unittests/Language/CPlusPlus/CMakeLists.txt (+1)
  • (modified) lldb/unittests/Language/CPlusPlus/CPlusPlusLanguageTest.cpp (+158)
diff --git a/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp b/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
index b4207439f5285..c39b529f7305a 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
@@ -604,126 +604,6 @@ bool CPlusPlusLanguage::ExtractContextAndIdentifier(
   return false;
 }
 
-namespace {
-class NodeAllocator {
-  llvm::BumpPtrAllocator Alloc;
-
-public:
-  void reset() { Alloc.Reset(); }
-
-  template <typename T, typename... Args> T *makeNode(Args &&...args) {
-    return new (Alloc.Allocate(sizeof(T), alignof(T)))
-        T(std::forward<Args>(args)...);
-  }
-
-  void *allocateNodeArray(size_t sz) {
-    return Alloc.Allocate(sizeof(llvm::itanium_demangle::Node *) * sz,
-                          alignof(llvm::itanium_demangle::Node *));
-  }
-};
-
-template <typename Derived>
-class ManglingSubstitutor
-    : public llvm::itanium_demangle::AbstractManglingParser<Derived,
-                                                            NodeAllocator> {
-  using Base =
-      llvm::itanium_demangle::AbstractManglingParser<Derived, NodeAllocator>;
-
-public:
-  ManglingSubstitutor() : Base(nullptr, nullptr) {}
-
-  template <typename... Ts>
-  ConstString substitute(llvm::StringRef Mangled, Ts &&...Vals) {
-    this->getDerived().reset(Mangled, std::forward<Ts>(Vals)...);
-    return substituteImpl(Mangled);
-  }
-
-protected:
-  void reset(llvm::StringRef Mangled) {
-    Base::reset(Mangled.begin(), Mangled.end());
-    Written = Mangled.begin();
-    Result.clear();
-    Substituted = false;
-  }
-
-  ConstString substituteImpl(llvm::StringRef Mangled) {
-    Log *log = GetLog(LLDBLog::Language);
-    if (this->parse() == nullptr) {
-      LLDB_LOG(log, "Failed to substitute mangling in {0}", Mangled);
-      return ConstString();
-    }
-    if (!Substituted)
-      return ConstString();
-
-    // Append any trailing unmodified input.
-    appendUnchangedInput();
-    LLDB_LOG(log, "Substituted mangling {0} -> {1}", Mangled, Result);
-    return ConstString(Result);
-  }
-
-  void trySubstitute(llvm::StringRef From, llvm::StringRef To) {
-    if (!llvm::StringRef(currentParserPos(), this->numLeft()).starts_with(From))
-      return;
-
-    // We found a match. Append unmodified input up to this point.
-    appendUnchangedInput();
-
-    // And then perform the replacement.
-    Result += To;
-    Written += From.size();
-    Substituted = true;
-  }
-
-private:
-  /// Input character until which we have constructed the respective output
-  /// already.
-  const char *Written = "";
-
-  llvm::SmallString<128> Result;
-
-  /// Whether we have performed any substitutions.
-  bool Substituted = false;
-
-  const char *currentParserPos() const { return this->First; }
-
-  void appendUnchangedInput() {
-    Result +=
-        llvm::StringRef(Written, std::distance(Written, currentParserPos()));
-    Written = currentParserPos();
-  }
-};
-
-/// Given a mangled function `Mangled`, replace all the primitive function type
-/// arguments of `Search` with type `Replace`.
-class TypeSubstitutor : public ManglingSubstitutor<TypeSubstitutor> {
-  llvm::StringRef Search;
-  llvm::StringRef Replace;
-
-public:
-  void reset(llvm::StringRef Mangled, llvm::StringRef Search,
-             llvm::StringRef Replace) {
-    ManglingSubstitutor::reset(Mangled);
-    this->Search = Search;
-    this->Replace = Replace;
-  }
-
-  llvm::itanium_demangle::Node *parseType() {
-    trySubstitute(Search, Replace);
-    return ManglingSubstitutor::parseType();
-  }
-};
-
-class CtorDtorSubstitutor : public ManglingSubstitutor<CtorDtorSubstitutor> {
-public:
-  llvm::itanium_demangle::Node *
-  parseCtorDtorName(llvm::itanium_demangle::Node *&SoFar, NameState *State) {
-    trySubstitute("C1", "C2");
-    trySubstitute("D1", "D2");
-    return ManglingSubstitutor::parseCtorDtorName(SoFar, State);
-  }
-};
-} // namespace
-
 std::vector<ConstString> CPlusPlusLanguage::GenerateAlternateFunctionManglings(
     const ConstString mangled_name) const {
   std::vector<ConstString> alternates;
@@ -751,29 +631,49 @@ std::vector<ConstString> CPlusPlusLanguage::GenerateAlternateFunctionManglings(
     alternates.push_back(ConstString(fixed_scratch));
   }
 
-  TypeSubstitutor TS;
+  auto *log = GetLog(LLDBLog::Language);
+
   // `char` is implementation defined as either `signed` or `unsigned`.  As a
   // result a char parameter has 3 possible manglings: 'c'-char, 'a'-signed
   // char, 'h'-unsigned char.  If we're looking for symbols with a signed char
   // parameter, try finding matches which have the general case 'c'.
-  if (ConstString char_fixup =
-          TS.substitute(mangled_name.GetStringRef(), "a", "c"))
-    alternates.push_back(char_fixup);
+  if (auto char_fixup_or_err =
+          SubstituteType_ItaniumMangle(mangled_name.GetStringRef(), "a", "c")) {
+    // LLDB_LOG(log, "Substituted mangling {0} -> {1}", Mangled, Result);
+    if (*char_fixup_or_err)
+      alternates.push_back(*char_fixup_or_err);
+  } else
+    LLDB_LOG_ERROR(log, char_fixup_or_err.takeError(),
+                   "Failed to substitute 'char' type mangling: {0}");
 
   // long long parameter mangling 'x', may actually just be a long 'l' argument
-  if (ConstString long_fixup =
-          TS.substitute(mangled_name.GetStringRef(), "x", "l"))
-    alternates.push_back(long_fixup);
+  if (auto long_fixup_or_err =
+          SubstituteType_ItaniumMangle(mangled_name.GetStringRef(), "x", "l")) {
+    if (*long_fixup_or_err)
+      alternates.push_back(*long_fixup_or_err);
+  } else
+    LLDB_LOG_ERROR(log, long_fixup_or_err.takeError(),
+                   "Failed to substitute 'long long' type mangling: {0}");
 
   // unsigned long long parameter mangling 'y', may actually just be unsigned
   // long 'm' argument
-  if (ConstString ulong_fixup =
-          TS.substitute(mangled_name.GetStringRef(), "y", "m"))
-    alternates.push_back(ulong_fixup);
-
-  if (ConstString ctor_fixup =
-          CtorDtorSubstitutor().substitute(mangled_name.GetStringRef()))
-    alternates.push_back(ctor_fixup);
+  if (auto ulong_fixup_or_err =
+          SubstituteType_ItaniumMangle(mangled_name.GetStringRef(), "y", "m")) {
+    if (*ulong_fixup_or_err)
+      alternates.push_back(*ulong_fixup_or_err);
+  } else
+    LLDB_LOG_ERROR(
+        log, ulong_fixup_or_err.takeError(),
+        "Failed to substitute 'unsigned long long' type mangling: {0}");
+
+  if (auto ctor_fixup_or_err = SubstituteStructorAliases_ItaniumMangle(
+          mangled_name.GetStringRef())) {
+    if (*ctor_fixup_or_err) {
+      alternates.push_back(*ctor_fixup_or_err);
+    }
+  } else
+    LLDB_LOG_ERROR(log, ctor_fixup_or_err.takeError(),
+                   "Failed to substitute structor alias manglings: {0}");
 
   return alternates;
 }
@@ -2442,6 +2342,160 @@ bool CPlusPlusLanguage::HandleFrameFormatVariable(
   }
 }
 
+namespace {
+class NodeAllocator {
+  llvm::BumpPtrAllocator Alloc;
+
+public:
+  void reset() { Alloc.Reset(); }
+
+  template <typename T, typename... Args> T *makeNode(Args &&...args) {
+    return new (Alloc.Allocate(sizeof(T), alignof(T)))
+        T(std::forward<Args>(args)...);
+  }
+
+  void *allocateNodeArray(size_t sz) {
+    return Alloc.Allocate(sizeof(llvm::itanium_demangle::Node *) * sz,
+                          alignof(llvm::itanium_demangle::Node *));
+  }
+};
+
+template <typename Derived>
+class ManglingSubstitutor
+    : public llvm::itanium_demangle::AbstractManglingParser<Derived,
+                                                            NodeAllocator> {
+  using Base =
+      llvm::itanium_demangle::AbstractManglingParser<Derived, NodeAllocator>;
+
+public:
+  ManglingSubstitutor() : Base(nullptr, nullptr) {}
+
+  template <typename... Ts>
+  llvm::Expected<ConstString> substitute(llvm::StringRef Mangled,
+                                         Ts &&...Vals) {
+    this->getDerived().reset(Mangled, std::forward<Ts>(Vals)...);
+    return substituteImpl(Mangled);
+  }
+
+protected:
+  void reset(llvm::StringRef Mangled) {
+    Base::reset(Mangled.begin(), Mangled.end());
+    Written = Mangled.begin();
+    Result.clear();
+    Substituted = false;
+  }
+
+  llvm::Expected<ConstString> substituteImpl(llvm::StringRef Mangled) {
+    if (this->parse() == nullptr)
+      return llvm::createStringError(
+          llvm::formatv("Failed to substitute mangling in '{0}'", Mangled));
+
+    if (!Substituted)
+      return ConstString();
+
+    // Append any trailing unmodified input.
+    appendUnchangedInput();
+    return ConstString(Result);
+  }
+
+  void trySubstitute(llvm::StringRef From, llvm::StringRef To) {
+    if (!llvm::StringRef(currentParserPos(), this->numLeft()).starts_with(From))
+      return;
+
+    // We found a match. Append unmodified input up to this point.
+    appendUnchangedInput();
+
+    // And then perform the replacement.
+    Result += To;
+    Written += From.size();
+    Substituted = true;
+  }
+
+private:
+  /// Input character until which we have constructed the respective output
+  /// already.
+  const char *Written = "";
+
+  llvm::SmallString<128> Result;
+
+  /// Whether we have performed any substitutions.
+  bool Substituted = false;
+
+  const char *currentParserPos() const { return this->First; }
+
+  void appendUnchangedInput() {
+    Result +=
+        llvm::StringRef(Written, std::distance(Written, currentParserPos()));
+    Written = currentParserPos();
+  }
+};
+
+/// Given a mangled function `Mangled`, replace all the primitive function type
+/// arguments of `Search` with type `Replace`.
+class TypeSubstitutor : public ManglingSubstitutor<TypeSubstitutor> {
+  llvm::StringRef Search;
+  llvm::StringRef Replace;
+
+public:
+  void reset(llvm::StringRef Mangled, llvm::StringRef Search,
+             llvm::StringRef Replace) {
+    ManglingSubstitutor::reset(Mangled);
+    this->Search = Search;
+    this->Replace = Replace;
+  }
+
+  llvm::itanium_demangle::Node *parseType() {
+    trySubstitute(Search, Replace);
+    return ManglingSubstitutor::parseType();
+  }
+};
+
+class CtorDtorSubstitutor : public ManglingSubstitutor<CtorDtorSubstitutor> {
+  llvm::StringRef Search;
+  llvm::StringRef Replace;
+
+public:
+  void reset(llvm::StringRef Mangled, llvm::StringRef Search,
+             llvm::StringRef Replace) {
+    ManglingSubstitutor::reset(Mangled);
+    this->Search = Search;
+    this->Replace = Replace;
+  }
+
+  void reset(llvm::StringRef Mangled) { ManglingSubstitutor::reset(Mangled); }
+
+  llvm::itanium_demangle::Node *
+  parseCtorDtorName(llvm::itanium_demangle::Node *&SoFar, NameState *State) {
+    if (!Search.empty() && !Replace.empty()) {
+      trySubstitute(Search, Replace);
+    } else {
+      trySubstitute("D1", "D2");
+      trySubstitute("C1", "C2");
+    }
+    return ManglingSubstitutor::parseCtorDtorName(SoFar, State);
+  }
+};
+} // namespace
+
+llvm::Expected<ConstString>
+CPlusPlusLanguage::SubstituteType_ItaniumMangle(llvm::StringRef mangled_name,
+                                                llvm::StringRef subst_from,
+                                                llvm::StringRef subst_to) {
+  return TypeSubstitutor().substitute(mangled_name, subst_from, subst_to);
+}
+
+llvm::Expected<ConstString> CPlusPlusLanguage::SubstituteStructor_ItaniumMangle(
+    llvm::StringRef mangled_name, llvm::StringRef subst_from,
+    llvm::StringRef subst_to) {
+  return CtorDtorSubstitutor().substitute(mangled_name, subst_from, subst_to);
+}
+
+llvm::Expected<ConstString>
+CPlusPlusLanguage::SubstituteStructorAliases_ItaniumMangle(
+    llvm::StringRef mangled_name) {
+  return CtorDtorSubstitutor().substitute(mangled_name);
+}
+
 #define LLDB_PROPERTIES_language_cplusplus
 #include "LanguageCPlusPlusProperties.inc"
 
diff --git a/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.h b/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.h
index 4a30299dd2658..67b28bf202c3a 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.h
+++ b/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.h
@@ -164,6 +164,19 @@ class CPlusPlusLanguage : public Language {
   ConstString FindBestAlternateFunctionMangledName(
       const Mangled mangled, const SymbolContext &sym_ctx) const override;
 
+  static llvm::Expected<ConstString>
+  SubstituteType_ItaniumMangle(llvm::StringRef mangled_name,
+                               llvm::StringRef subst_from,
+                               llvm::StringRef subst_to);
+
+  static llvm::Expected<ConstString>
+  SubstituteStructor_ItaniumMangle(llvm::StringRef mangled_name,
+                                   llvm::StringRef subst_from,
+                                   llvm::StringRef subst_to);
+
+  static llvm::Expected<ConstString>
+  SubstituteStructorAliases_ItaniumMangle(llvm::StringRef mangled_name);
+
   llvm::StringRef GetInstanceVariableName() override { return "this"; }
 
   FormatEntity::Entry GetFunctionNameFormat() const override;
diff --git a/lldb/unittests/Language/CPlusPlus/CMakeLists.txt b/lldb/unittests/Language/CPlusPlus/CMakeLists.txt
index 4882eafc8d854..1d96fcf3db1b8 100644
--- a/lldb/unittests/Language/CPlusPlus/CMakeLists.txt
+++ b/lldb/unittests/Language/CPlusPlus/CMakeLists.txt
@@ -3,4 +3,5 @@ add_lldb_unittest(LanguageCPlusPlusTests
 
   LINK_LIBS
     lldbPluginCPlusPlusLanguage
+    LLVMTestingSupport
   )
diff --git a/lldb/unittests/Language/CPlusPlus/CPlusPlusLanguageTest.cpp b/lldb/unittests/Language/CPlusPlus/CPlusPlusLanguageTest.cpp
index 957fb3f600499..92d49eb0f93ff 100644
--- a/lldb/unittests/Language/CPlusPlus/CPlusPlusLanguageTest.cpp
+++ b/lldb/unittests/Language/CPlusPlus/CPlusPlusLanguageTest.cpp
@@ -9,6 +9,8 @@
 #include "Plugins/Language/CPlusPlus/CPlusPlusNameParser.h"
 #include "TestingSupport/SubsystemRAII.h"
 #include "lldb/lldb-enumerations.h"
+#include "llvm/Support/Error.h"
+#include "llvm/Testing/Support/Error.h"
 #include "gmock/gmock.h"
 #include "gtest/gtest.h"
 #include <optional>
@@ -427,3 +429,159 @@ TEST(CPlusPlusLanguage, MatchesCxx) {
   Mangled msvcSymbol("??x@@3AH");
   EXPECT_TRUE(CPlusPlusLang->SymbolNameFitsToLanguage(msvcSymbol));
 }
+
+struct ManglingSubstitutorTestCase {
+  llvm::StringRef mangled;
+  llvm::StringRef from;
+  llvm::StringRef to;
+  llvm::StringRef expected;
+  bool expect_error;
+};
+
+struct ManglingSubstitutorTestFixture
+    : public ::testing::TestWithParam<ManglingSubstitutorTestCase> {};
+
+ManglingSubstitutorTestCase g_mangled_substitutor_type_test_cases[] = {
+    {/*.mangled*/ "_Z3fooa", /*from*/ "a", /*to*/ "c", /*expected*/ "_Z3fooc",
+     /*expect_error*/ false},
+    {/*.mangled*/ "_Z3fooy", /*from*/ "y", /*to*/ "m", /*expected*/ "_Z3foom",
+     /*expect_error*/ false},
+    {/*.mangled*/ "_Z3foox", /*from*/ "x", /*to*/ "l", /*expected*/ "_Z3fool",
+     /*expect_error*/ false},
+    {/*.mangled*/ "_Z3baraa", /*from*/ "a", /*to*/ "c", /*expected*/ "_Z3barcc",
+     /*expect_error*/ false},
+    {/*.mangled*/ "_Z3foov", /*from*/ "x", /*to*/ "l", /*expected*/ "",
+     /*expect_error*/ false},
+    {/*.mangled*/ "_Z3fooB3Tagv", /*from*/ "Tag", /*to*/ "random",
+     /*expected*/ "", /*expect_error*/ false},
+    {/*.mangled*/ "_Z3foocc", /*from*/ "a", /*to*/ "c", /*expected*/ "",
+     /*expect_error*/ false},
+    {/*.mangled*/ "_ZN3fooIaE3barIaEEvaT_", /*from*/ "a", /*to*/ "c",
+     /*expected*/ "_ZN3fooIcE3barIcEEvcT_", /*expect_error*/ false},
+    {/*.mangled*/ "foo", /*from*/ "x", /*to*/ "l", /*expected*/ "",
+     /*expect_error*/ true},
+    {/*.mangled*/ "", /*from*/ "x", /*to*/ "l", /*expected*/ "",
+     /*expect_error*/ true},
+    // FIXME: these two cases are odd behaviours, though not realistic in
+    // practice.
+    {/*.mangled*/ "_Z3foox", /*from*/ "", /*to*/ "l", /*expected*/ "_Z3foolx",
+     /*expect_error*/ false},
+    {/*.mangled*/ "_Z3foox", /*from*/ "x", /*to*/ "", /*expected*/ "_Z3foo",
+     /*expect_error*/ false}};
+
+TEST_P(ManglingSubstitutorTestFixture, Type) {
+  // Tests the CPlusPlusLanguage::SubstituteType_ItaniumMangle API.
+
+  const auto &[mangled, from, to, expected, expect_error] = GetParam();
+
+  auto subst_or_err =
+      CPlusPlusLanguage::SubstituteType_ItaniumMangle(mangled, from, to);
+  if (expect_error) {
+    EXPECT_THAT_EXPECTED(subst_or_err, llvm::Failed());
+  } else {
+    EXPECT_THAT_EXPECTED(subst_or_err, llvm::Succeeded());
+    EXPECT_EQ(*subst_or_err, expected);
+  }
+}
+
+INSTANTIATE_TEST_SUITE_P(
+    ManglingSubstitutorTypeTests, ManglingSubstitutorTestFixture,
+    ::testing::ValuesIn(g_mangled_substitutor_type_test_cases));
+
+struct ManglingSubstitutorStructorTestFixture
+    : public ::testing::TestWithParam<ManglingSubstitutorTestCase> {};
+
+ManglingSubstitutorTestCase g_mangled_substitutor_structor_test_cases[] = {
+    {/*.mangled*/ "_ZN3FooC1Ev", /*from*/ "C1", /*to*/ "C2",
+     /*expected*/ "_ZN3FooC2Ev", /*expect_error*/ false},
+    {/*.mangled*/ "_ZN3FooC4Ev", /*from*/ "C4", /*to*/ "C2",
+     /*expected*/ "_ZN3FooC2Ev", /*expect_error*/ false},
+    {/*.mangled*/ "_ZN3FooC2Ev", /*from*/ "C1", /*to*/ "C2", /*expected*/ "",
+     /*expect_error*/ false},
+    {/*.mangled*/ "_ZN3FooD1Ev", /*from*/ "D1", /*to*/ "D2",
+     /*expected*/ "_ZN3FooD2Ev", /*expect_error*/ false},
+    {/*.mangled*/ "_ZN3FooD2Ev", /*from*/ "D1", /*to*/ "D2", /*expected*/ "",
+     /*expect_error*/ false},
+    {/*.mangled*/ "_ZN3FooD4Ev", /*from*/ "D4", /*to*/ "D2",
+     /*expected*/ "_ZN3FooD2Ev", /*expect_error*/ false},
+    {/*.mangled*/ "_ZN2D12C1C1I2C12D1EE2C12D1", /*from*/ "C1", /*to*/ "C2",
+     /*expected*/ "_ZN2D12C1C2I2C12D1EE2C12D1", /*expect_error*/ false},
+    {/*.mangled*/ "_ZN2D12C1D1I2C12D1EE2C12D1", /*from*/ "D1", /*to*/ "D2",
+     /*expected*/ "_ZN2D12C1D2I2C12D1EE2C12D1", /*expect_error*/ false},
+    {/*.mangled*/ "_ZN3FooC6Ev", /*from*/ "D1", /*to*/ "D2", /*expected*/ "",
+     /*expect_error*/ true},
+};
+
+TEST_P(ManglingSubstitutorStructorTestFixture, Structors) {
+  // Tests the CPlusPlusLanguage::SubstituteStructor_ItaniumMangle API.
+
+  const auto &[mangled, from, to, expected, expect_error] = GetParam();
+
+  auto subst_or_err =
+      CPlusPlusLanguage::SubstituteStructor_ItaniumMangle(mangled, from, to);
+  if (expect_error) {
+    EXPECT_THAT_EXPECTED(subst_or_err, llvm::Failed());
+  } else {
+    EXPECT_THAT_EXPECTED(subst_or_err, llvm::Succeeded());
+    EXPECT_EQ(*subst_or_err, expected);
+  }
+}
+
+INSTANTIATE_TEST_SUITE_P(
+    ManglingSubstitutorStructorTests, ManglingSubstitutorStructorTestFixture,
+    ::testing::ValuesIn(g_mangled_substitutor_structor_test_cases));
+
+TEST(CPlusPlusLanguage, ManglingSubstitutor_StructorAlias) {
+  // Tests the CPlusPlusLanguage::SubstituteStructorAliases_ItaniumMangle API.
+  {
+    // Invalid mangling.
+    auto subst_or_err =
+        CPlusPlusLanguage::SubstituteStructorAliases_ItaniumMangle("Foo");
+    EXPECT_THAT_EXPECTED(subst_or_err, llvm::Failed());
+  }
+
+  {
+    // Ctor C1 alias.
+    auto subst_or_err =
+        CPlusPlusLanguage::SubstituteStructorAliases_ItaniumMangle(
+            "_ZN3FooC1Ev");
+    EXPECT_THAT_EXPECTED(subst_or_err, llvm::Succeeded());
+    EXPECT_EQ(*subst_or_err, "_ZN3FooC2Ev");
+  }
+
+  {
+    // Dtor D1 alias.
+    auto subst_or_err =
+        CPlusPlusLanguage::SubstituteStructorAliases_ItaniumMangle(
+            "_ZN3FooD1Ev");
+    EXPECT_THAT_EXPECTED(subst_or_err, llvm::Succeeded());
+    EXPECT_EQ(*subst_or_err, "_ZN3FooD2Ev");
+  }
+
+  {
+    // Ctor C2 not aliased.
+    auto subst_or_err =
+        CPlusPlusLanguage::SubstituteStructorAliases_ItaniumMangle(
+            "_ZN3FooC2Ev");
+    EXPECT_THAT_EXPECTED(subst_or_err, llvm::Succeeded());
+    EXPECT_FALSE(*subst_or_err);
+  }
+
+  {
+    // Dtor D2 not aliased.
+    auto subst_or_err =
+        CPlusPlusLanguage::SubstituteStructorAliases_ItaniumMangle(
+            "_ZN3FooD2Ev");
+    EXPECT_THAT_EXPECTED(subst_or_err, llvm::Succeeded());
+    EXPECT_FALSE(*subst_or_err);
+  }
+
+  {
+    // Check that variants in other parts of the name don't get replaced.
+    auto subst_or_err =
+        CPlusPlusLanguage::SubstituteStructorAliases_ItaniumMangle(
+            "_ZN2D12C1C1I2C12D1EE2C12D1");
+    EXPECT_THAT_EXPECTED(subst_or_err, llvm::Succeeded());
+    EXPECT_EQ(*subst_or_err, "_ZN2D12C1C2I2C12D1EE2C12D1");
+  }
+}

@Michael137 Michael137 enabled auto-merge (squash) August 27, 2025 12:42
@Michael137 Michael137 merged commit 9dd38b0 into llvm:main Aug 27, 2025
9 checks passed
@Michael137 Michael137 deleted the lldb/mangled-substitutor branch August 27, 2025 13:55
Michael137 added a commit that referenced this pull request Sep 9, 2025
#149827)

Depends on
* #148877
* #155483
* #155485
* #154137
* #154142

This patch is an implementation of [this
discussion](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
about handling ABI-tagged structors during expression evaluation.

**Motivation**

LLDB encodes the mangled name of a `DW_TAG_subprogram` into `AsmLabel`s
on function and method Clang AST nodes. This means that when calls to
these functions get lowered into IR (when running JITted expressions),
the address resolver can locate the appropriate symbol by mangled name
(and it is guaranteed to find the symbol because we got the mangled name
from debug-info, instead of letting Clang mangle it based on AST
structure). However, we don't do this for
`CXXConstructorDecl`s/`CXXDestructorDecl`s because these structor
declarations in DWARF don't have a linkage name. This is because there
can be multiple variants of a structor, each with a distinct mangling in
the Itanium ABI. Each structor variant has its own definition
`DW_TAG_subprogram`. So LLDB doesn't know which mangled name to put into
the `AsmLabel`.

Currently this means using ABI-tagged structors in LLDB expressions
won't work (see [this
RFC](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
for concrete examples).

**Proposed Solution**

The `FunctionCallLabel` encoding that we put into `AsmLabel`s already
supports stuffing more info about a DIE into it. So this patch extends
the `FunctionCallLabel` to contain an optional discriminator (a sequence
of bytes) which the `SymbolFileDWARF` plugin interprets as the
constructor/destructor variant of that DIE. So when searching for the
definition DIE, LLDB will include the structor variant in its heuristic
for determining a match.

There's a few subtleties here:
1. At the point at which LLDB first constructs the label, it has no way
of knowing (just by looking at the debug-info declaration), which
structor variant the expression evaluator is supposed to call. That's
something that gets decided when compiling the expression. So we let the
Clang mangler inject the correct structor variant into the `AsmLabel`
during JITing. I adjusted the `AsmLabelAttr` mangling for this in
#155485. An option would've
been to create a new Clang attribute which behaved like an `AsmLabel`
but with these special semantics for LLDB. My main concern there is that
we'd have to adjust all the `AsmLabelAttr` checks around Clang to also
now account for this new attribute.
2. The compiler is free to omit the `C1` variant of a constructor if the
`C2` variant is sufficient. In that case it may alias `C1` to `C2`,
leaving us with only the `C2` `DW_TAG_subprogram` in the object file.
Linux is one of the platforms where this occurs. For those cases I added
a heuristic in `SymbolFileDWARF` where we pick `C2` if we asked for `C1`
but it doesn't exist. This may not always be correct (e.g., if the
compiler decided to drop `C1` for other reasons).
3. In #154142 Clang will emit
`C4`/`D4` variants of ctors/dtors on declarations. When resolving the
`FunctionCallLabel` we will now substitute the actual variant that Clang
told us we need to call into the mangled name. We do this using LLDB's
`ManglingSubstitutor`. That way we find the definition DIE exactly the
same way we do for regular function calls.
4. In cases where declarations and definitions live in separate modules,
the DIE ID encoded in the function call label may not be enough to find
the definition DIE in the encoded module ID. For those cases we fall
back to how LLDB used to work: look up in all images of the target. To
make sure we don't use the unified mangled name for the fallback lookup,
we change the lookup name to whatever mangled name the FunctionCallLabel
resolved to.

rdar://104968288
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Sep 9, 2025
… call labels (#149827)

Depends on
* llvm/llvm-project#148877
* llvm/llvm-project#155483
* llvm/llvm-project#155485
* llvm/llvm-project#154137
* llvm/llvm-project#154142

This patch is an implementation of [this
discussion](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
about handling ABI-tagged structors during expression evaluation.

**Motivation**

LLDB encodes the mangled name of a `DW_TAG_subprogram` into `AsmLabel`s
on function and method Clang AST nodes. This means that when calls to
these functions get lowered into IR (when running JITted expressions),
the address resolver can locate the appropriate symbol by mangled name
(and it is guaranteed to find the symbol because we got the mangled name
from debug-info, instead of letting Clang mangle it based on AST
structure). However, we don't do this for
`CXXConstructorDecl`s/`CXXDestructorDecl`s because these structor
declarations in DWARF don't have a linkage name. This is because there
can be multiple variants of a structor, each with a distinct mangling in
the Itanium ABI. Each structor variant has its own definition
`DW_TAG_subprogram`. So LLDB doesn't know which mangled name to put into
the `AsmLabel`.

Currently this means using ABI-tagged structors in LLDB expressions
won't work (see [this
RFC](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
for concrete examples).

**Proposed Solution**

The `FunctionCallLabel` encoding that we put into `AsmLabel`s already
supports stuffing more info about a DIE into it. So this patch extends
the `FunctionCallLabel` to contain an optional discriminator (a sequence
of bytes) which the `SymbolFileDWARF` plugin interprets as the
constructor/destructor variant of that DIE. So when searching for the
definition DIE, LLDB will include the structor variant in its heuristic
for determining a match.

There's a few subtleties here:
1. At the point at which LLDB first constructs the label, it has no way
of knowing (just by looking at the debug-info declaration), which
structor variant the expression evaluator is supposed to call. That's
something that gets decided when compiling the expression. So we let the
Clang mangler inject the correct structor variant into the `AsmLabel`
during JITing. I adjusted the `AsmLabelAttr` mangling for this in
llvm/llvm-project#155485. An option would've
been to create a new Clang attribute which behaved like an `AsmLabel`
but with these special semantics for LLDB. My main concern there is that
we'd have to adjust all the `AsmLabelAttr` checks around Clang to also
now account for this new attribute.
2. The compiler is free to omit the `C1` variant of a constructor if the
`C2` variant is sufficient. In that case it may alias `C1` to `C2`,
leaving us with only the `C2` `DW_TAG_subprogram` in the object file.
Linux is one of the platforms where this occurs. For those cases I added
a heuristic in `SymbolFileDWARF` where we pick `C2` if we asked for `C1`
but it doesn't exist. This may not always be correct (e.g., if the
compiler decided to drop `C1` for other reasons).
3. In llvm/llvm-project#154142 Clang will emit
`C4`/`D4` variants of ctors/dtors on declarations. When resolving the
`FunctionCallLabel` we will now substitute the actual variant that Clang
told us we need to call into the mangled name. We do this using LLDB's
`ManglingSubstitutor`. That way we find the definition DIE exactly the
same way we do for regular function calls.
4. In cases where declarations and definitions live in separate modules,
the DIE ID encoded in the function call label may not be enough to find
the definition DIE in the encoded module ID. For those cases we fall
back to how LLDB used to work: look up in all images of the target. To
make sure we don't use the unified mangled name for the fallback lookup,
we change the lookup name to whatever mangled name the FunctionCallLabel
resolved to.

rdar://104968288
Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Sep 9, 2025
llvm#155483)

Part of llvm#149827

Allows us to use the mangling substitution facilities in
CPlusPlusLanguage but also SymbolFileDWARF.

Added tests now that they're "public".

(cherry picked from commit 9dd38b0)
Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Sep 9, 2025
llvm#149827)

Depends on
* llvm#148877
* llvm#155483
* llvm#155485
* llvm#154137
* llvm#154142

This patch is an implementation of [this
discussion](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816/7)
about handling ABI-tagged structors during expression evaluation.

**Motivation**

LLDB encodes the mangled name of a `DW_TAG_subprogram` into `AsmLabel`s
on function and method Clang AST nodes. This means that when calls to
these functions get lowered into IR (when running JITted expressions),
the address resolver can locate the appropriate symbol by mangled name
(and it is guaranteed to find the symbol because we got the mangled name
from debug-info, instead of letting Clang mangle it based on AST
structure). However, we don't do this for
`CXXConstructorDecl`s/`CXXDestructorDecl`s because these structor
declarations in DWARF don't have a linkage name. This is because there
can be multiple variants of a structor, each with a distinct mangling in
the Itanium ABI. Each structor variant has its own definition
`DW_TAG_subprogram`. So LLDB doesn't know which mangled name to put into
the `AsmLabel`.

Currently this means using ABI-tagged structors in LLDB expressions
won't work (see [this
RFC](https://discourse.llvm.org/t/rfc-lldb-handling-abi-tagged-constructors-destructors-in-expression-evaluator/82816)
for concrete examples).

**Proposed Solution**

The `FunctionCallLabel` encoding that we put into `AsmLabel`s already
supports stuffing more info about a DIE into it. So this patch extends
the `FunctionCallLabel` to contain an optional discriminator (a sequence
of bytes) which the `SymbolFileDWARF` plugin interprets as the
constructor/destructor variant of that DIE. So when searching for the
definition DIE, LLDB will include the structor variant in its heuristic
for determining a match.

There's a few subtleties here:
1. At the point at which LLDB first constructs the label, it has no way
of knowing (just by looking at the debug-info declaration), which
structor variant the expression evaluator is supposed to call. That's
something that gets decided when compiling the expression. So we let the
Clang mangler inject the correct structor variant into the `AsmLabel`
during JITing. I adjusted the `AsmLabelAttr` mangling for this in
llvm#155485. An option would've
been to create a new Clang attribute which behaved like an `AsmLabel`
but with these special semantics for LLDB. My main concern there is that
we'd have to adjust all the `AsmLabelAttr` checks around Clang to also
now account for this new attribute.
2. The compiler is free to omit the `C1` variant of a constructor if the
`C2` variant is sufficient. In that case it may alias `C1` to `C2`,
leaving us with only the `C2` `DW_TAG_subprogram` in the object file.
Linux is one of the platforms where this occurs. For those cases I added
a heuristic in `SymbolFileDWARF` where we pick `C2` if we asked for `C1`
but it doesn't exist. This may not always be correct (e.g., if the
compiler decided to drop `C1` for other reasons).
3. In llvm#154142 Clang will emit
`C4`/`D4` variants of ctors/dtors on declarations. When resolving the
`FunctionCallLabel` we will now substitute the actual variant that Clang
told us we need to call into the mangled name. We do this using LLDB's
`ManglingSubstitutor`. That way we find the definition DIE exactly the
same way we do for regular function calls.
4. In cases where declarations and definitions live in separate modules,
the DIE ID encoded in the function call label may not be enough to find
the definition DIE in the encoded module ID. For those cases we fall
back to how LLDB used to work: look up in all images of the target. To
make sure we don't use the unified mangled name for the fallback lookup,
we change the lookup name to whatever mangled name the FunctionCallLabel
resolved to.

rdar://104968288
(cherry picked from commit 57a7907)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants