-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Fix smart_holder multiple/virtual inheritance bugs in shared_ptr and unique_ptr to-Python conversions
#5836
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
```
/__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:44:13: error: prefer using 'override' or (rarely) 'final' instead of 'virtual' [modernize-use-override,-warnings-as-errors]
44 | virtual ~Left() = default;
| ~~~~~~~ ^
| override
/__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:48:13: error: prefer using 'override' or (rarely) 'final' instead of 'virtual' [modernize-use-override,-warnings-as-errors]
48 | virtual ~Right() = default;
| ~~~~~~~ ^
| override
```
|
CI results @ commit 5f77da3: 66 failing, 2 skipped, 17 successful checks CI: https://github.com/pybind/pybind11/actions/runs/17707681525?pr=5836 |
|
CI results @ commit 3620ceb: 20 failing, 2 skipped, 63 successful checks CI: https://github.com/pybind/pybind11/actions/runs/17713020523?pr=5836 |
ChatGPT: * shared_ptr’s ctor can throw (control-block alloc). Using get() keeps unique_ptr owning the memory if that happens, so no leak. * Only after the shared_ptr is successfully constructed do you release(), transferring ownership exactly once.
…ampoline code (which often uses the term "alias", too)
…ithin test_class_sh_mi_thunks.cpp
```
/__w/pybind11/pybind11/tests/test_class_sh_mi_thunks.cpp:67:5: error: 'auto ptr' can be declared as 'auto *ptr' [readability-qualified-auto,-warnings-as-errors]
67 | auto ptr = new Diamond;
| ^~~~
| auto *
```
smart_holder multiple/virtual inheritance bugs in shared_ptr and unique_ptr to-Python conversions
|
@henryiii This PR is ready for review. @MannixYang If you get a chance to try this out, please report back. |
|
@rwgk Thank you very much indeed. Yes, I will give it a try. It should be within the next two days. |
|
@rwgk Great! This is very useful. My project can now run. Thank you again. |
Thanks a lot for confirming! |
iwanders
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @iwanders, is there a chance that you could help reviewing this PR?
👍 definitely, I did go through it and made two minor comments. The code around the handling of deleters looks good to me.
| // Relinquish ownership only after successful construction of owner | ||
| (void) unq_ptr.release(); | ||
|
|
||
| // Publish either the subobject alias (for identity/VI) or the full object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I'd elaborate a bit in this comment. This part feels a bit like magic to me why this is necessary and fixes a bug when this is used in msvc. I think I understand the problem to be that the shared pointer 'owner' must always point to the start of the entire object (and the deleter operates on that), but that the actual smart holder should 'hold' the subclass, which may not be at the start of the object itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: commit 31886ec
tests/test_class_sh_mi_thunks.cpp
Outdated
| int vbase_tag = 42; // ensure it's not empty | ||
| }; | ||
|
|
||
| // Left/right add some weight to steer layout differences across compilers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this matters, but Left and Right are pretty similar a long long is guaranteed to be 64 bits, and the 7 long array is likely going to be padded to 8, which would make them the same size? To me the comment contradicts the code a bit in that sense. Maybe just having any attribute is enough, but if the sizes ought to be different I'd make it very clear, say char pad_l[4] and char pad_r[16].
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done: commit ba89e1d
Here is what ChatGPT is saying:
Good catch — the original comment was misleading, since char[7] and long long can end up the same size after padding. The real intent is simply to make the virtual bases non-empty and, ideally, asymmetrically sized/aligned so compilers are more likely to give us non-zero subobject offsets. I’ve updated the code to use small but clearly asymmetric paddings (char[4] vs char[16], with optional alignment) and clarified the comment to reflect this. The test still works correctly even if a compiler chooses offset 0, since that case is explicitly logged via test_virtual_base_at_offset_0().
In Ralf's own words: I was just super happy when the GhatGPT-generated test triggered failures on both Linux and Windows [see comment from two weeks ago] and didn't look as closely at the test code as you did. However, now I pushed ChatGPT harder, to "really bend things out of shape", but it assured me that the new code is as good as we can make it.
I'll rerun the CI. When it's done I'll look for test_virtual_base_at_offset_0() skip messages again. (I think I did that before, although I forgot to log that here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just super happy when the GhatGPT-generated test triggered failures on both Linux and Windows
Agreed, that is pretty impressive and definitely speeds things up. Thanks for sharing your ChatGPT conversation btw, I’ve bookmarked that to read through when I’m not on the road. I don’t have much experience using llms, so expect it’ll be educational for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just super happy when the GhatGPT-generated test triggered failures on both Linux and Windows
Agreed, that is pretty impressive and definitely speeds things up. Thanks for sharing your ChatGPT conversation btw, I’ve bookmarked that to read through when I’m not on the road. I don’t have much experience using llms, so expect it’ll be educational for me.
My time budget for pybind11 maintenance work has become super tiny. I think without ChatGPT I'd have to give up.
Thanks for your review, this was super helpful.
A couple minutes ago I replaced the pytest.skip() with an assert, to guarantee test quality in the future. It'd be great if you could take a final look and formally approve if it looks good to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple minutes ago I replaced the pytest.skip() with an assert, to guarantee test quality in the future.
Good idea, and helpful to have that comment there for future developers that may run into the assert failing. I've approved it!
```
"D:\a\pybind11\pybind11\build\ALL_BUILD.vcxproj" (default target) (1) ->
"D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj" (default target) (13) ->
(ClCompile target) ->
D:\a\pybind11\pybind11\tests\test_class_sh_mi_thunks.cpp(70,17): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
D:\a\pybind11\pybind11\tests\test_class_sh_mi_thunks.cpp(80,43): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: 'std::_Ref_count_obj2<_Ty>': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: with [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: [ [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: _Ty=test_class_sh_mi_thunks::Diamond [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\memory(2913,46): warning C4316: ] [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
D:\a\pybind11\pybind11\include\pybind11\detail\init.h(77,21): warning C4316: 'test_class_sh_mi_thunks::Diamond': object allocated on the heap may not be aligned 16 [D:\a\pybind11\pybind11\build\tests\pybind11_tests.vcxproj]
```
The warning came from alignas(16) making Diamond over-aligned, while regular new/make_shared aren’t guaranteed to return 16-byte aligned memory on MSVC (hence C4316). I’ve removed the explicit alignment and switched to asymmetric payload sizes (char[4] vs char[24]), which still nudges MI layout without relying on over-alignment. This keeps the test goal and eliminates the warning across all MSVC builds. If we ever want to stress over-alignment explicitly, we can add aligned operator new/delete under __cpp_aligned_new, but that’s more than we need here.
Checking for
|
…set_0() and replace pytest.skip() with assert. Add helpful comment for future maintainers.
|
Thanks a lot @iwanders! |
Description
Closes #5786.
This PR fixes two independent bugs in the
smart_holdermachinery, both exposed by new tests involving virtual inheritance.1. Fix in
pybind11/detail/type_caster_base.hThe return-value caster for
std::shared_ptr<T>was incorrectly usingsrc.get()(the most-derived pointer) instead ofst.first(the adjusted subobject pointer appropriate for the registeredtinfo).On MSVC with virtual inheritance, this mismatch led to passing an
Animal*(see below) toregister_instancewhen bindingTiger, which caused the virtual-base offset walker to dereference a bogus vbptr.This is the bug reported in #5786. Using
st.firstensures the correct subobject is registered.2. Fix in
pybind11/detail/struct_smart_holder.hsmart_holder::from_unique_ptrwas storing the subobject pointer as the owned resource in its control block. Under MSVC, destroying through that misaligned pointer caused NX faults in the virtual destructor path.The fix mirrors the
shared_ptrpath: always own the realT*object start with the deleter, and, if needed, create an aliasingshared_ptr<void>at the subobject pointer for registration/identity.Tests
test_class_sh_mi_thunks) was added. This was designed to also fail on Linux/Clang/GCC without the fix (not just on MSVC). Indeed, the diamond case fails across allci.ymljobs and severaltests-cibw.ymljobs pre-fix, and passes after fix (see comment).smart_holder_from_unique_ptr, which immediately exposed the second bug. That test produced 20 MSVC failures pre-fix (comment), and passes after the fix.test_virtual_base_at_offset_0is included to make it explicit when a compiler/layout places the virtual base at offset 0. This isn’t a true skip, but a way to document in the pytest summary when the test can’t exercise the MI/VI code path. In practice, this skip has never been triggered in any GitHub Actions job.asserts remain inpybind11/detail/class.h. They were invaluable during debugging and experimentation, have no impact on non-debug builds, and may be useful again if MI/VI edge cases resurface.Context
The changes here are in the same spirit as PR #4380, which fixed earlier oversights in
smart_holderMI handling. As noted there, MI use cases are uncommon in downstream projects (and even discouraged in some organizations), so coverage has historically been thin. The new tests give us much better confidence that bothshared_ptrandunique_ptradoption paths are now robust under virtual and multiple inheritance.Full ChatGPT conversation guiding the work on the fixes (very long and involved): https://chatgpt.com/share/68c7950a-61b8-8008-8678-53b96643ce7e
A lot of the credit for the fixes goes to ChatGPT 5 Pro
Suggested changelog entry:
Fixed two
smart_holderbugs inshared_ptrandunique_ptradoption with multiple/virtual inheritance:shared_ptrto-Python caster now registers the correct subobject pointer (fixes [BUG]: Using smart_holder in virtual inheritance relationship error #5786).unique_ptradoption now owns the proper object start while aliasing subobject pointers for registration, fixing MSVC crashes during destruction.