-
Notifications
You must be signed in to change notification settings - Fork 13.9k
Add -Zannotate-moves for profiler visibility of move/copy operations (codegen) #147803
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Some changes occurred in compiler/rustc_codegen_ssa Some changes occurred in compiler/rustc_codegen_gcc |
c72e0b5 to
06baf13
Compare
|
MCP filed rust-lang/compiler-team#928 |
|
Does this support cranelift? The advantage to doing such things in MIR is that it is much easier to support cranelift. |
|
Yeah, that's a good point which I mentioned in the MCP. I had assumed that cranelift and gcc used the same framework as llvm so supporting them would just be a matter of implementing I didn't check whether the MIR transform approach did actually work for cranelift or gcc (does gcc support debuginfo in general?). But if it does, then I wonder if there's a middle ground between MIR and codegen to get the best of both? |
|
Ah, I'll respond on the MCP Zulip thread. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I recall working on Ruffle years back, and having expensive copies being an issue, something like this would definitely have helped back then.
I did a surface-level review, haven't truly dug into the implementation details of it yet, I'll probably wait 'till the MCP finishes before doing so.
@rustbot blocked (on MCP)
I'm also not 100% confident in my knowledge of things here, it's likely that I'll pass it off to saethlin afterwards to do an extra review on top.
| @@ -0,0 +1,79 @@ | |||
| # `annotate-moves` | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filing as a concern so we don't forget: this will need a tracking issue.
| #[unstable(feature = "profiling_marker_api", issue = "none")] | ||
| pub mod profiling; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you have to add something to std/src/lib.rs to get the API re-exported in std as well - that usually documents better. Unless this intentionally isn't done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't really sure what to do about this. These functions are never intended to be called, they aren't really part of any public API, so I didn't think there was really a need to expose them. The only real value in exposing them is in documentation so that if you see core::profiling::compiler_move<T, N> in a backtrace you can look it up and understand what it means. If they only ever appear as core then making them also appear in std could just be confusing.
I guess we could check the overall no-std state(?) and switch between std/core but that didn't really seem useful to me.
Also core::profiling/std::profiling also seemed like it could potentially be very useful namespace for some other profiling-related feature down the line. I was unsure about making it an exposed public module but functionally empty. Should that be deferred until it has some actual callable contents?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they only ever appear as
corethen making them also appear instdcould just be confusing.
But this is already the state for all debug info, no? If you call a core method that happens to be re-exposed in std, you'd still see core in your debugger.
I was unsure about making it an exposed public module but functionally empty. Should that be deferred until it has some actual callable contents?
Unsure too, in the end, this is probably libs domain (?), I'll ping one of them for guidance once the MCP finishes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this is already the state for all debug info, no? If you call a core method that happens to be re-exposed in std, you'd still see core in your debugger.
Sure, but in this case the only place you'd ever see this symbol is in debug info, since you can never call it directly. So having it visible in std::profiling would be at best pointless and redundant, but at worst confusing.
| // Note: On most ABIs, large structs are passed by pointer even when written as "by value", | ||
| // so there may not be an actual memcpy operation to attach compiler_move to. This test | ||
| // mainly verifies that IF debug info is emitted, the call itself uses the source location. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice with a test that does something like:
//@ revisions: by_pointer by_value
// Some target with an ABI that passes by pointer
//@ [by_pointer] compile-flags: --target x86_64-unknown-linux-gnu
//@ [by_pointer] needs-llvm-components: x86_64
// Some target with an ABI that passes by value (dunny one on top of my head)
//@ [by_value] compile-flags: --target abc-vendor-os
//@ [by_value] needs-llvm-components: abcThat would give me more confidence in that we're actually testing the right thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I missed this. I didn't try multi-architecture/ABI tests.
|
Updated to address review comments:
|
This comment has been minimized.
This comment has been minimized.
632004b to
fd66773
Compare
|
☔ The latest upstream changes (presumably #147687) made this pull request unmergeable. Please resolve the merge conflicts. |
This implements a new unstable compiler flag `-Zannotate-moves` that makes
move and copy operations visible in profilers by creating synthetic debug
information. This is achieved with zero runtime cost by manipulating debug
info scopes to make moves/copies appear as calls to `compiler_move<T, SIZE>`
and `compiler_copy<T, SIZE>` marker functions in profiling tools.
This allows developers to identify expensive move/copy operations in their
code using standard profiling tools, without requiring specialized tooling
or runtime instrumentation.
The implementation works at codegen time. When processing MIR operands
(`Operand::Move` and `Operand::Copy`), the codegen creates an `OperandRef`
with an optional `move_annotation` field containing an `Instance` of the
appropriate profiling marker function. When storing the operand,
`store_with_annotation()` wraps the store operation in a synthetic debug
scope that makes it appear inlined from the marker.
Two marker functions (`compiler_move` and `compiler_copy`) are defined
in `library/core/src/profiling.rs`. These are never actually called -
they exist solely as debug info anchors.
Operations are only annotated if the type:
- Meets the size threshold (default: 65 bytes, configurable via
`-Zannotate-moves=SIZE`)
- Has a non-scalar backend representation (scalars use registers,
not memcpy)
This has a very small size impact on object file size. With the default
limit it's well under 0.1%, and even with a very small limit of 8 bytes
it's still ~1.5%. This could be enabled by default.
fd66773 to
19dddf6
Compare
|
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
Pushed updates:
|
Note: this is an alternative implementation of #147206; rather than being a MIR transform, it adds the annotations closer to codegen. It's functionally the same but the implementation is lower impact and it could be more correct.
This implements a new unstable compiler flag
-Zannotate-movesthat makes move and copy operations visible in profilers by creating synthetic debug information. This is achieved with zero runtime cost by manipulating debug info scopes to make moves/copies appear as calls tocompiler_move<T, SIZE>andcompiler_copy<T, SIZE>marker functions in profiling tools.This allows developers to identify expensive move/copy operations in their code using standard profiling tools, without requiring specialized tooling or runtime instrumentation.
The implementation works at codegen time. When processing MIR operands (
Operand::MoveandOperand::Copy), the codegen creates anOperandRefwith an optionalmove_annotationfield containing anInstanceof the appropriate profiling marker function. When storing the operand,store_with_annotation()wraps the store operation in a synthetic debug scope that makes it appear inlined from the marker.Two marker functions (
compiler_moveandcompiler_copy) are defined inlibrary/core/src/profiling.rs. These are never actually called - they exist solely as debug info anchors.Operations are only annotated if:
-Zannotate-moves=SIZE), and is non-zeroThis has a very small size impact on object file size. With the default limit it's well under 0.1%, and even with a very small limit of 8 bytes it's still ~1.5%. This could be enabled by default.