Skip to content

Conversation

@jsgf
Copy link
Contributor

@jsgf jsgf commented Oct 17, 2025

Note: this is an alternative implementation of #147206; rather than being a MIR transform, it adds the annotations closer to codegen. It's functionally the same but the implementation is lower impact and it could be more correct.


This implements a new unstable compiler flag -Zannotate-moves that makes move and copy operations visible in profilers by creating synthetic debug information. This is achieved with zero runtime cost by manipulating debug info scopes to make moves/copies appear as calls to compiler_move<T, SIZE> and compiler_copy<T, SIZE> marker functions in profiling tools.

This allows developers to identify expensive move/copy operations in their code using standard profiling tools, without requiring specialized tooling or runtime instrumentation.

The implementation works at codegen time. When processing MIR operands (Operand::Move and Operand::Copy), the codegen creates an OperandRef with an optional move_annotation field containing an Instance of the appropriate profiling marker function. When storing the operand, store_with_annotation() wraps the store operation in a synthetic debug scope that makes it appear inlined from the marker.

Two marker functions (compiler_move and compiler_copy) are defined in library/core/src/profiling.rs. These are never actually called - they exist solely as debug info anchors.

Operations are only annotated if:

  • We're generating debug info and the feature is enabled.
  • Meets the size threshold (default: 65 bytes, configurable via -Zannotate-moves=SIZE), and is non-zero
  • Has a memory representation

This has a very small size impact on object file size. With the default limit it's well under 0.1%, and even with a very small limit of 8 bytes it's still ~1.5%. This could be enabled by default.

@rustbot
Copy link
Collaborator

rustbot commented Oct 17, 2025

Some changes occurred in compiler/rustc_codegen_ssa

cc @WaffleLapkin

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Oct 17, 2025
@rustbot
Copy link
Collaborator

rustbot commented Oct 17, 2025

r? @madsmtm

rustbot has assigned @madsmtm.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@jsgf jsgf changed the title Add -Zannotate-moves for profiler visibility of move/copy operations Add -Zannotate-moves for profiler visibility of move/copy operations (codegen) Oct 17, 2025
@jsgf jsgf force-pushed the move-copy-codegen branch from c72e0b5 to 06baf13 Compare October 17, 2025 09:09
@jsgf
Copy link
Contributor Author

jsgf commented Oct 17, 2025

MCP filed rust-lang/compiler-team#928

@saethlin
Copy link
Member

Does this support cranelift? The advantage to doing such things in MIR is that it is much easier to support cranelift.

@jsgf
Copy link
Contributor Author

jsgf commented Oct 18, 2025

Yeah, that's a good point which I mentioned in the MCP. I had assumed that cranelift and gcc used the same framework as llvm so supporting them would just be a matter of implementing with_move_annotation for them. But I guess that's not the case?

I didn't check whether the MIR transform approach did actually work for cranelift or gcc (does gcc support debuginfo in general?). But if it does, then I wonder if there's a middle ground between MIR and codegen to get the best of both?

@saethlin
Copy link
Member

Ah, I'll respond on the MCP Zulip thread.

Copy link
Contributor

@madsmtm madsmtm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I recall working on Ruffle years back, and having expensive copies being an issue, something like this would definitely have helped back then.

I did a surface-level review, haven't truly dug into the implementation details of it yet, I'll probably wait 'till the MCP finishes before doing so.

@rustbot blocked (on MCP)

I'm also not 100% confident in my knowledge of things here, it's likely that I'll pass it off to saethlin afterwards to do an extra review on top.

View changes since this review

@@ -0,0 +1,79 @@
# `annotate-moves`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filing as a concern so we don't forget: this will need a tracking issue.

Comment on lines +282 to +285
#[unstable(feature = "profiling_marker_api", issue = "none")]
pub mod profiling;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have to add something to std/src/lib.rs to get the API re-exported in std as well - that usually documents better. Unless this intentionally isn't done?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't really sure what to do about this. These functions are never intended to be called, they aren't really part of any public API, so I didn't think there was really a need to expose them. The only real value in exposing them is in documentation so that if you see core::profiling::compiler_move<T, N> in a backtrace you can look it up and understand what it means. If they only ever appear as core then making them also appear in std could just be confusing.

I guess we could check the overall no-std state(?) and switch between std/core but that didn't really seem useful to me.

Also core::profiling/std::profiling also seemed like it could potentially be very useful namespace for some other profiling-related feature down the line. I was unsure about making it an exposed public module but functionally empty. Should that be deferred until it has some actual callable contents?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they only ever appear as core then making them also appear in std could just be confusing.

But this is already the state for all debug info, no? If you call a core method that happens to be re-exposed in std, you'd still see core in your debugger.

I was unsure about making it an exposed public module but functionally empty. Should that be deferred until it has some actual callable contents?

Unsure too, in the end, this is probably libs domain (?), I'll ping one of them for guidance once the MCP finishes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is already the state for all debug info, no? If you call a core method that happens to be re-exposed in std, you'd still see core in your debugger.

Sure, but in this case the only place you'd ever see this symbol is in debug info, since you can never call it directly. So having it visible in std::profiling would be at best pointless and redundant, but at worst confusing.

Comment on lines 14 to 16
// Note: On most ABIs, large structs are passed by pointer even when written as "by value",
// so there may not be an actual memcpy operation to attach compiler_move to. This test
// mainly verifies that IF debug info is emitted, the call itself uses the source location.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice with a test that does something like:

//@ revisions: by_pointer by_value
// Some target with an ABI that passes by pointer
//@ [by_pointer] compile-flags: --target x86_64-unknown-linux-gnu
//@ [by_pointer] needs-llvm-components: x86_64
// Some target with an ABI that passes by value (dunny one on top of my head)
//@ [by_value] compile-flags: --target abc-vendor-os
//@ [by_value] needs-llvm-components: abc

That would give me more confidence in that we're actually testing the right thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I missed this. I didn't try multi-architecture/ABI tests.

@rustbot rustbot added S-blocked Status: Blocked on something else such as an RFC or other implementation work. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 18, 2025
@jsgf jsgf requested a review from madsmtm October 19, 2025 08:30
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 19, 2025
@jsgf
Copy link
Contributor Author

jsgf commented Oct 19, 2025

Updated to address review comments:

  • refactored enable logic to push possibly expensive things later
  • a few missed stores possibly needing annotation
  • added CLI test
  • added some more test cases

@rust-log-analyzer

This comment has been minimized.

@jsgf jsgf force-pushed the move-copy-codegen branch from 632004b to fd66773 Compare October 19, 2025 23:49
@bors
Copy link
Collaborator

bors commented Oct 22, 2025

☔ The latest upstream changes (presumably #147687) made this pull request unmergeable. Please resolve the merge conflicts.

jsgf added 2 commits October 23, 2025 23:12
This implements a new unstable compiler flag `-Zannotate-moves` that makes
move and copy operations visible in profilers by creating synthetic debug
information. This is achieved with zero runtime cost by manipulating debug
info scopes to make moves/copies appear as calls to `compiler_move<T, SIZE>`
and `compiler_copy<T, SIZE>` marker functions in profiling tools.

This allows developers to identify expensive move/copy operations in their
code using standard profiling tools, without requiring specialized tooling
or runtime instrumentation.

The implementation works at codegen time. When processing MIR operands
(`Operand::Move` and `Operand::Copy`), the codegen creates an `OperandRef`
with an optional `move_annotation` field containing an `Instance` of the
appropriate profiling marker function. When storing the operand,
`store_with_annotation()` wraps the store operation in a synthetic debug
scope that makes it appear inlined from the marker.

Two marker functions (`compiler_move` and `compiler_copy`) are defined
in `library/core/src/profiling.rs`. These are never actually called -
they exist solely as debug info anchors.

Operations are only annotated if the type:
   - Meets the size threshold (default: 65 bytes, configurable via
     `-Zannotate-moves=SIZE`)
   - Has a non-scalar backend representation (scalars use registers,
     not memcpy)

This has a very small size impact on object file size. With the default
limit it's well under 0.1%, and even with a very small limit of 8 bytes
it's still ~1.5%. This could be enabled by default.
@jsgf jsgf force-pushed the move-copy-codegen branch from fd66773 to 19dddf6 Compare October 24, 2025 06:16
@rustbot
Copy link
Collaborator

rustbot commented Oct 24, 2025

This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@jsgf
Copy link
Contributor Author

jsgf commented Oct 24, 2025

Pushed updates:

  • Rebased to current master and resolved the conflict.
  • Squashed all the commits because they seemed uncontroversial
  • Also tweaked a few comments and strings
  • Added a commit to switch to using lang-item for compiler_move/compiler_copy. It seems basically the same from a complexity perspective.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-blocked Status: Blocked on something else such as an RFC or other implementation work. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants