Skip to content

Conversation

Michael137
Copy link
Member

This came up in #155691.

For std::basic_string our formatter matching logic required the allocator template parameter to be a std::allocator. There is no compelling reason (that I know of) why this would be required for us to apply the existing formatter to the string. We don't check the allocator parameter for other STL containers either. This meant that std::string that used custom allocators wouldn't be formatted. This patch relaxes the regex for basic_string.

…ustom allocators

This came up in llvm#155691.

For `std::basic_string` our formatter matching logic required the
allocator template parameter to be a `std::allocator`. There is no
compelling reason (that I know of) why this would be required for us to
apply the existing formatter to the string. We don't check the
`allocator` parameter for other STL containers either. This meant that
`std::string` that used custom allocators wouldn't be formatted. This
patch relaxes the regex for `basic_string`.
@Michael137 Michael137 force-pushed the lldb/custom-string-alloc-formats branch from 88ec358 to af0f77f Compare September 1, 2025 12:23
Copy link

github-actions bot commented Sep 4, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Michael137 Michael137 force-pushed the lldb/custom-string-alloc-formats branch from ca0d7c8 to a509c8c Compare September 4, 2025 14:56
@Michael137 Michael137 marked this pull request as ready for review September 5, 2025 06:39
@llvmbot llvmbot added the lldb label Sep 5, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 5, 2025

@llvm/pr-subscribers-lldb

Author: Michael Buch (Michael137)

Changes

This came up in #155691.

For std::basic_string our formatter matching logic required the allocator template parameter to be a std::allocator. There is no compelling reason (that I know of) why this would be required for us to apply the existing formatter to the string. We don't check the allocator parameter for other STL containers either. This meant that std::string that used custom allocators wouldn't be formatted. This patch relaxes the regex for basic_string.


Full diff: https://github.com/llvm/llvm-project/pull/156050.diff

3 Files Affected:

  • (modified) lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp (+17-33)
  • (modified) lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/string/TestDataFormatterStdString.py (+6)
  • (modified) lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/string/main.cpp (+35)
diff --git a/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp b/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
index c39b529f7305a..277de8f444828 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
@@ -749,31 +749,27 @@ static void LoadLibCxxFormatters(lldb::TypeCategoryImplSP cpp_category_sp) {
                 lldb_private::formatters::LibcxxStringSummaryProviderASCII,
                 "std::string summary provider",
                 "^std::__[[:alnum:]]+::basic_string<char, "
-                "std::__[[:alnum:]]+::char_traits<char>, "
-                "std::__[[:alnum:]]+::allocator<char> >$",
+                "std::__[[:alnum:]]+::char_traits<char>,.*>$",
                 stl_summary_flags, true);
   AddCXXSummary(cpp_category_sp,
                 lldb_private::formatters::LibcxxStringSummaryProviderASCII,
                 "std::string summary provider",
                 "^std::__[[:alnum:]]+::basic_string<unsigned char, "
-                "std::__[[:alnum:]]+::char_traits<unsigned char>, "
-                "std::__[[:alnum:]]+::allocator<unsigned char> >$",
+                "std::__[[:alnum:]]+::char_traits<unsigned char>,.*>$",
                 stl_summary_flags, true);
 
   AddCXXSummary(cpp_category_sp,
                 lldb_private::formatters::LibcxxStringSummaryProviderUTF16,
                 "std::u16string summary provider",
                 "^std::__[[:alnum:]]+::basic_string<char16_t, "
-                "std::__[[:alnum:]]+::char_traits<char16_t>, "
-                "std::__[[:alnum:]]+::allocator<char16_t> >$",
+                "std::__[[:alnum:]]+::char_traits<char16_t>,.*>$",
                 stl_summary_flags, true);
 
   AddCXXSummary(cpp_category_sp,
                 lldb_private::formatters::LibcxxStringSummaryProviderUTF32,
                 "std::u32string summary provider",
                 "^std::__[[:alnum:]]+::basic_string<char32_t, "
-                "std::__[[:alnum:]]+::char_traits<char32_t>, "
-                "std::__[[:alnum:]]+::allocator<char32_t> >$",
+                "std::__[[:alnum:]]+::char_traits<char32_t>,.*>$",
                 stl_summary_flags, true);
 
   AddCXXSummary(cpp_category_sp,
@@ -784,8 +780,7 @@ static void LoadLibCxxFormatters(lldb::TypeCategoryImplSP cpp_category_sp) {
                 lldb_private::formatters::LibcxxWStringSummaryProvider,
                 "std::wstring summary provider",
                 "^std::__[[:alnum:]]+::basic_string<wchar_t, "
-                "std::__[[:alnum:]]+::char_traits<wchar_t>, "
-                "std::__[[:alnum:]]+::allocator<wchar_t> >$",
+                "std::__[[:alnum:]]+::char_traits<wchar_t>,.*>$",
                 stl_summary_flags, true);
 
   AddCXXSummary(cpp_category_sp,
@@ -1301,24 +1296,16 @@ static void RegisterStdStringSummaryProvider(
 
   category_sp->AddTypeSummary(makeSpecifier(string_ty), summary_sp);
 
-  // std::basic_string<char>
   category_sp->AddTypeSummary(
       makeSpecifier(llvm::formatv("std::basic_string<{}>", char_ty).str()),
       summary_sp);
-  // std::basic_string<char,std::char_traits<char>,std::allocator<char> >
-  category_sp->AddTypeSummary(
-      makeSpecifier(llvm::formatv("std::basic_string<{0},std::char_traits<{0}>,"
-                                  "std::allocator<{0}> >",
-                                  char_ty)
-                        .str()),
-      summary_sp);
-  // std::basic_string<char, std::char_traits<char>, std::allocator<char> >
+
   category_sp->AddTypeSummary(
-      makeSpecifier(
-          llvm::formatv("std::basic_string<{0}, std::char_traits<{0}>, "
-                        "std::allocator<{0}> >",
+      std::make_shared<lldb_private::TypeNameSpecifierImpl>(
+          llvm::formatv("^std::basic_string<{0}, ?std::char_traits<{0}>,.*>$",
                         char_ty)
-              .str()),
+              .str(),
+          eFormatterMatchRegex),
       summary_sp);
 }
 
@@ -1363,20 +1350,17 @@ static void LoadLibStdcppFormatters(lldb::TypeCategoryImplSP cpp_category_sp) {
   cpp_category_sp->AddTypeSummary("std::__cxx11::string", eFormatterMatchExact,
                                   string_summary_sp);
   cpp_category_sp->AddTypeSummary(
-      "std::__cxx11::basic_string<char, std::char_traits<char>, "
-      "std::allocator<char> >",
-      eFormatterMatchExact, string_summary_sp);
-  cpp_category_sp->AddTypeSummary("std::__cxx11::basic_string<unsigned char, "
-                                  "std::char_traits<unsigned char>, "
-                                  "std::allocator<unsigned char> >",
-                                  eFormatterMatchExact, string_summary_sp);
+      "^std::__cxx11::basic_string<char, std::char_traits<char>,.*>$",
+      eFormatterMatchRegex, string_summary_sp);
+  cpp_category_sp->AddTypeSummary("^std::__cxx11::basic_string<unsigned char, "
+                                  "std::char_traits<unsigned char>,.*>$",
+                                  eFormatterMatchRegex, string_summary_sp);
 
   cpp_category_sp->AddTypeSummary("std::__cxx11::wstring", eFormatterMatchExact,
                                   string_summary_sp);
   cpp_category_sp->AddTypeSummary(
-      "std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, "
-      "std::allocator<wchar_t> >",
-      eFormatterMatchExact, string_summary_sp);
+      "^std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>,.*>$",
+      eFormatterMatchRegex, string_summary_sp);
 
   SyntheticChildren::Flags stl_synth_flags;
   stl_synth_flags.SetCascades(true).SetSkipPointers(false).SetSkipReferences(
diff --git a/lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/string/TestDataFormatterStdString.py b/lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/string/TestDataFormatterStdString.py
index fec20bae997ef..6a27b5d2f0780 100644
--- a/lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/string/TestDataFormatterStdString.py
+++ b/lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/string/TestDataFormatterStdString.py
@@ -80,6 +80,8 @@ def cleanup():
                 '(%s::string) Q = "quite a long std::strin with lots of info inside it"'
                 % ns,
                 "(%s::string *) null_str = nullptr" % ns,
+                '(CustomString) custom_str = "hello!"',
+                '(CustomWString) custom_wstr = L"hello!"',
             ],
         )
 
@@ -143,6 +145,10 @@ def do_test_multibyte(self):
                 '(%s::u16string) u16_empty = u""' % ns,
                 '(%s::u32string) u32_string = U"🍄🍅🍆🍌"' % ns,
                 '(%s::u32string) u32_empty = U""' % ns,
+                '(CustomStringU16) custom_u16 = u"ß水氶"',
+                '(CustomStringU16) custom_u16_empty = u""',
+                '(CustomStringU32) custom_u32 = U"🍄🍅🍆🍌"',
+                '(CustomStringU32) custom_u32_empty = U""',
             ],
         )
 
diff --git a/lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/string/main.cpp b/lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/string/main.cpp
index f22c890861d01..577b78e35fc1f 100644
--- a/lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/string/main.cpp
+++ b/lldb/test/API/functionalities/data-formatter/data-formatter-stl/generic/string/main.cpp
@@ -1,6 +1,33 @@
+#include <cstdlib>
 #include <stdint.h>
 #include <string>
 
+template <typename T> struct CustomAlloc {
+  using value_type = T;
+  using pointer = value_type *;
+  using const_pointer = const value_type *;
+  using size_type = std::size_t;
+
+  pointer allocate(size_type n) { return (T *)malloc(n * sizeof(T)); }
+
+  void deallocate(pointer p, size_type) {
+    if (p)
+      free(p);
+  }
+};
+
+using CustomString =
+    std::basic_string<char, std::char_traits<char>, CustomAlloc<char>>;
+
+using CustomWString =
+    std::basic_string<wchar_t, std::char_traits<wchar_t>, CustomAlloc<wchar_t>>;
+
+using CustomStringU16 = std::basic_string<char16_t, std::char_traits<char16_t>,
+                                          CustomAlloc<char16_t>>;
+
+using CustomStringU32 = std::basic_string<char32_t, std::char_traits<char32_t>,
+                                          CustomAlloc<char32_t>>;
+
 size_t touch_string(std::string &in_str) {
   return in_str.size(); // Break here to look at bad string
 }
@@ -99,8 +126,16 @@ int main() {
   std::string *pq = &q;
   std::string *pQ = &Q;
 
+  CustomString custom_str("hello!");
+  CustomWString custom_wstr(L"hello!");
+  CustomStringU16 custom_u16(u16_string.c_str());
+  CustomStringU16 custom_u16_empty(u"");
+  CustomStringU32 custom_u32(u32_string.c_str());
+  CustomStringU32 custom_u32_empty(U"");
+
   S.assign(L"!!!!!"); // Set break point at this line.
   std::string *not_a_string = (std::string *)0x0;
   touch_string(*not_a_string);
+
   return 0;
 }

Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could potentially also drop the char_traits thingy.

@Michael137
Copy link
Member Author

We could potentially also drop the char_traits thingy.

Yea was thinking the same. Might do that as a follow-up. That seems like a more niche use-case, but probably something people do

@Michael137 Michael137 merged commit 4b362f1 into llvm:main Sep 5, 2025
12 checks passed
@Michael137 Michael137 deleted the lldb/custom-string-alloc-formats branch September 5, 2025 08:24
@Michael137 Michael137 added this to the LLVM 21.x Release milestone Sep 5, 2025
@Michael137
Copy link
Member Author

/cherry-pick 4b362f1

@github-project-automation github-project-automation bot moved this to Needs Triage in LLVM Release Status Sep 5, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 5, 2025

/pull-request #157048

@llvmbot llvmbot moved this from Needs Triage to Done in LLVM Release Status Sep 5, 2025
tru pushed a commit to llvmbot/llvm-project that referenced this pull request Sep 8, 2025
…ustom allocators (llvm#156050)

This came up in llvm#155691.

For `std::basic_string` our formatter matching logic required the
allocator template parameter to be a `std::allocator`. There is no
compelling reason (that I know of) why this would be required for us to
apply the existing formatter to the string. We don't check the
`allocator` parameter for other STL containers either. This meant that
`std::string` that used custom allocators wouldn't be formatted. This
patch relaxes the regex for `basic_string`.

(cherry picked from commit 4b362f1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

Successfully merging this pull request may close these issues.

3 participants