@saethlin
Member

@saethlin saethlin commented Oct 23, 2025

The objective of this PR is to improve compilation performance for crates that define a lot of trivial consts. This is a flamegraph of a build of a library crate that is just 100,000 trivial consts, taken from a nightly compiler:
[Image: flamegraph screenshot, 2025-10-25-164005_842x280_scrot]
My objective is to target all of the cycles in eval_to_const_value_raw that are not part of mir_built, because if you look at the mir_built for a trivial const, we already have the value available.

In this PR, the definition of a trivial const is this:

const A: usize = 0;

Specifically, we check whether the mir_built body is a single basic block containing one assign statement and a return terminator, where the assign statement assigns an Operand::Constant(Const::Val). The MIR dumps for these look like:

const A: usize = {
    let mut _0: usize;

    bb0: {
        _0 = const 0_usize;
        return;
    }
}
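
To make the shape check above concrete, here is a minimal sketch on a toy MIR model. The types below only borrow names from the PR text (`Operand::Constant`, `Const::Val`); they are simplified, hypothetical stand-ins, not the real rustc_middle definitions, and `trivial_const_value` and `single_block` are illustrative helpers rather than the function added by this PR.

```rust
// Toy stand-ins for MIR types (hypothetical, heavily simplified).
#[derive(Debug, Clone, PartialEq)]
enum Const {
    Val(u64),            // an already-evaluated value, like `const 0_usize`
    Unevaluated(String), // a reference to another const, like `B`
}

#[derive(Debug, Clone, PartialEq)]
enum Operand {
    Constant(Const),
    Copy(usize), // a read of a local such as `_1`
}

#[derive(Debug, Clone, PartialEq)]
enum Statement {
    Assign { dest: usize, value: Operand },
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Terminator {
    Return,
    Goto(usize),
}

struct BasicBlock {
    statements: Vec<Statement>,
    terminator: Terminator,
}

struct Body {
    basic_blocks: Vec<BasicBlock>,
}

/// Returns the value if the body is exactly `bb0: { _0 = const <val>; return; }`.
fn trivial_const_value(body: &Body) -> Option<u64> {
    // Exactly one basic block.
    let [block] = body.basic_blocks.as_slice() else { return None };
    // Its terminator must be a plain return.
    if block.terminator != Terminator::Return {
        return None;
    }
    // Exactly one statement: an assignment of an evaluated constant to `_0`.
    let [Statement::Assign { dest: 0, value: Operand::Constant(Const::Val(v)) }] =
        block.statements.as_slice()
    else {
        return None;
    };
    Some(*v)
}

/// Builds a one-block body that assigns `value` to `_0` and returns.
fn single_block(value: Operand) -> Body {
    Body {
        basic_blocks: vec![BasicBlock {
            statements: vec![Statement::Assign { dest: 0, value }],
            terminator: Terminator::Return,
        }],
    }
}

fn main() {
    let literal = single_block(Operand::Constant(Const::Val(0)));
    let path = single_block(Operand::Constant(Const::Unevaluated("B".to_string())));
    println!("{:?}", trivial_const_value(&literal));
    println!("{:?}", trivial_const_value(&path));
}
```

In this model a body that assigns `Const::Unevaluated` or copies a local is rejected, which matches the cases listed as future work in the description.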

The implementation is built around a new query, trivial_const(LocalDefId) -> Option<(ConstValue, Ty)>, which returns the contents of the Const::Val in the mir_built body if the LocalDefId is a trivial const, and None otherwise.

Then I added debug assertions to the beginning of mir_for_ctfe and mir_promoted to prevent trying to get the body of a trivial const, because that would defeat the optimization here. But these are deliberately debug assertions because the consequence of failing the assertion is that compilation is slow, not that it produces incorrect output. If we made these hard assertions, I'm sure there are obscure scenarios people will run into where the compiler would ICE instead of continuing to compile, just a bit more slowly. I'd like to know about those, but I do not think serving up an ICE is worth it.
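
As a rough illustration of that trade-off, the sketch below uses `debug_assert!` to guard a body-producing function. `Ctxt`, `body_for_ctfe`, and `eval` are hypothetical names for this example only; they stand in for the compiler context and queries and do not reflect the actual rustc code.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the compiler context. It only tracks which
/// definitions were classified as trivial consts.
struct Ctxt {
    trivial: HashSet<u32>,
}

impl Ctxt {
    fn is_trivial_const(&self, def: u32) -> bool {
        self.trivial.contains(&def)
    }

    /// Stand-in for a body-producing query such as `mir_for_ctfe`.
    fn body_for_ctfe(&self, def: u32) -> String {
        // Checked only when debug assertions are enabled. In a release build,
        // a caller that misses the fast path still gets a usable body; the
        // only cost is the extra work the fast path was meant to avoid.
        debug_assert!(
            !self.is_trivial_const(def),
            "requested the body of trivial const {}",
            def
        );
        format!("<body of def {}>", def)
    }

    /// A caller that takes the fast path and never asks for the body of a
    /// trivial const.
    fn eval(&self, def: u32) -> String {
        if self.is_trivial_const(def) {
            return format!("<value of def {}, no body needed>", def);
        }
        self.body_for_ctfe(def)
    }
}

fn main() {
    let cx = Ctxt { trivial: HashSet::from([1]) };
    println!("{}", cx.eval(1));
    println!("{}", cx.eval(2));
}
```

Swapping `debug_assert!` for `assert!` would turn any missed call site into a panic in every build, which corresponds to the ICE the description wants to avoid.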

With the assertions in place, I just added logic around all the places they were hit, to skip over trying to analyze the bodies of trivial consts.

In the future, I'd like to see this work extended by:

  • Pushing detection of trivial consts before MIR building
  • Including DefKind::Static and DefKind::InlineConst
  • Including consts like _1 = const 0_usize; _0 = &_1, which would make a lot of promoteds into trivial consts
  • Handling less-trivial consts like const A: usize = B, which have Operand::Constant(Const::Unevaluated)

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 23, 2025
@saethlin
Member Author

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Oct 23, 2025
Add a fast path for lowering trivial consts
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 23, 2025
@rust-bors

rust-bors bot commented Oct 23, 2025

☀️ Try build successful (CI)
Build commit: 4c15d20 (4c15d2003befc82fb2064960f5520c8643947469, parent: 6501e64fcb02d22b49d6e59d10a7692ec8095619)

@rust-timer
Collaborator

Finished benchmarking commit (4c15d20): comparison URL.

Overall result: ❌✅ regressions and improvements - BENCHMARK(S) FAILED

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

❗ ❗ ❗ ❗ ❗
Warning ⚠️: The following benchmark(s) failed to build:

  • serde-1.0.219-threads4

❗ ❗ ❗ ❗ ❗

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.0%, 0.9%] | 77 |
| Regressions ❌ (secondary) | 0.5% | [0.0%, 1.6%] | 29 |
| Improvements ✅ (primary) | -2.6% | [-5.6%, -1.8%] | 13 |
| Improvements ✅ (secondary) | -2.5% | [-2.8%, -2.3%] | 3 |
| All ❌✅ (primary) | -0.1% | [-5.6%, 0.9%] | 90 |

Max RSS (memory usage)

Results (primary -0.2%, secondary 2.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.7% | [0.4%, 1.3%] | 7 |
| Regressions ❌ (secondary) | 7.1% | [4.6%, 8.2%] | 4 |
| Improvements ✅ (primary) | -1.7% | [-2.0%, -1.5%] | 4 |
| Improvements ✅ (secondary) | -1.3% | [-2.2%, -0.8%] | 5 |
| All ❌✅ (primary) | -0.2% | [-2.0%, 1.3%] | 11 |

Cycles

Results (primary -1.5%, secondary -0.4%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 3.5% | [3.5%, 3.5%] | 1 |
| Regressions ❌ (secondary) | 3.0% | [2.8%, 3.3%] | 3 |
| Improvements ✅ (primary) | -2.5% | [-3.2%, -2.0%] | 5 |
| Improvements ✅ (secondary) | -3.0% | [-6.0%, -1.7%] | 4 |
| All ❌✅ (primary) | -1.5% | [-3.2%, 3.5%] | 6 |

Binary size

Results (primary -0.6%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.0%, 0.8%] | 72 |
| Regressions ❌ (secondary) | 0.0% | [0.0%, 0.2%] | 14 |
| Improvements ✅ (primary) | -8.9% | [-8.9%, -8.8%] | 8 |
| Improvements ✅ (secondary) | -0.1% | [-0.2%, -0.0%] | 2 |
| All ❌✅ (primary) | -0.6% | [-8.9%, 0.8%] | 80 |

Bootstrap: 476.496s -> 475.064s (-0.30%)
Artifact size: 390.49 MiB -> 390.55 MiB (0.02%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Oct 24, 2025
@saethlin
Member Author

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Oct 24, 2025
Add a fast path for lowering trivial consts
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 24, 2025
@rust-bors

rust-bors bot commented Oct 24, 2025

☀️ Try build successful (CI)
Build commit: 4931b5e (4931b5e2c7edb857fab6e19e39dd4e4e22a37a91, parent: ab925646fae038b02bd462cd328ae9eef1639236)

@rust-timer
Collaborator

Finished benchmarking commit (4931b5e): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @rustbot label: +perf-regression-triaged. If not, please fix the regressions and do another perf run. If its results are neutral or positive, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.1%, 0.9%] | 67 |
| Regressions ❌ (secondary) | 0.5% | [0.0%, 1.6%] | 32 |
| Improvements ✅ (primary) | -9.7% | [-15.6%, -5.5%] | 13 |
| Improvements ✅ (secondary) | -2.5% | [-2.8%, -2.4%] | 3 |
| All ❌✅ (primary) | -1.3% | [-15.6%, 0.9%] | 80 |

Max RSS (memory usage)

Results (primary -0.8%, secondary -1.1%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 1.0% | [0.5%, 3.5%] | 11 |
| Regressions ❌ (secondary) | 4.9% | [0.8%, 8.0%] | 6 |
| Improvements ✅ (primary) | -2.8% | [-4.0%, -1.3%] | 10 |
| Improvements ✅ (secondary) | -4.1% | [-5.4%, -2.4%] | 12 |
| All ❌✅ (primary) | -0.8% | [-4.0%, 3.5%] | 21 |

Cycles

Results (primary -9.2%, secondary -1.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 4.1% | [0.9%, 10.7%] | 6 |
| Improvements ✅ (primary) | -9.2% | [-14.1%, -4.2%] | 13 |
| Improvements ✅ (secondary) | -4.2% | [-8.6%, -2.1%] | 13 |
| All ❌✅ (primary) | -9.2% | [-14.1%, -4.2%] | 13 |

Binary size

Results (primary -0.7%, secondary 0.0%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.0%, 0.8%] | 72 |
| Regressions ❌ (secondary) | 0.1% | [0.0%, 0.2%] | 22 |
| Improvements ✅ (primary) | -9.0% | [-9.1%, -9.0%] | 8 |
| Improvements ✅ (secondary) | -0.1% | [-0.2%, -0.0%] | 2 |
| All ❌✅ (primary) | -0.7% | [-9.1%, 0.8%] | 80 |

Bootstrap: 474.337s -> 474.199s (-0.03%)
Artifact size: 390.48 MiB -> 390.80 MiB (0.08%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 25, 2025
@saethlin
Member Author

@bors try @rust-timer queue

rust-bors bot added a commit that referenced this pull request Oct 25, 2025
Add a fast path for lowering trivial consts
@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 25, 2025
@rust-bors

rust-bors bot commented Oct 25, 2025

☀️ Try build successful (CI)
Build commit: 8907237 (8907237f16e88f2a9ee9a48ba052804abac7e239, parent: f435972085b697a1ece8ee6a1ac76efff8d1df7b)

@rustbot
Collaborator

rustbot commented Oct 25, 2025

Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt

Some changes occurred to the CTFE machinery

cc @RalfJung, @oli-obk, @lcnr

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 25, 2025
matches!(tcx.def_kind(def), DefKind::AssocConst | DefKind::Const | DefKind::AnonConst)
}

fn trivial_const_provider<'tcx>(
Member

Seems worth moving this query impl into a separate file, to avoid growing this already-big file even bigger.

Also please add more comments, in particular explaining the overall contract this query must abide by.

Member Author

I added a big doc comment to the main function that has all the logic that explains the contract.

tcx.ensure_done().coroutine_by_move_body_def_id(def);
}

// the `trivial_const` query uses mir_built, so make sure it is run.
Member

mir_built also uses trivial_const, so I am confused... this sounds cyclic?

Member Author

tcx.mir_built uses the free function, not the query. So it's not cyclic, but if the first call is tcx.trivial_const, we call tcx.mir_built to get the body, which checks if the Body is trivial in order to skip its passes internally, then returns that to trivial_const_provider which analyzes the Body again.

It's a bit goofy, but that's all.
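
The flow described above can be sketched with a toy memoizing context. Everything here (`Tcx`, `Body`, `trivial_const_check`, the `checks` counter) is a hypothetical model written for illustration; it only mirrors the described relationship between the two queries and the shared free function, not the real query system.

```rust
use std::cell::{Cell, RefCell};
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
struct Body {
    literal: Option<u64>, // Some(v) if the body is `_0 = const v; return;`
    passes_run: bool,     // whether the usual passes ran on this body
}

/// The shared free function: both queries call this, and neither one calls
/// the other through it, so there is no query cycle.
fn trivial_const_check(body: &Body, checks: &Cell<usize>) -> Option<u64> {
    checks.set(checks.get() + 1);
    body.literal
}

struct Tcx {
    sources: HashMap<u32, Option<u64>>,
    mir_built_cache: RefCell<HashMap<u32, Body>>,
    trivial_const_cache: RefCell<HashMap<u32, Option<u64>>>,
    checks: Cell<usize>, // counts how often triviality is checked
}

impl Tcx {
    fn new(sources: HashMap<u32, Option<u64>>) -> Self {
        Tcx {
            sources,
            mir_built_cache: RefCell::new(HashMap::new()),
            trivial_const_cache: RefCell::new(HashMap::new()),
            checks: Cell::new(0),
        }
    }

    /// Memoized stand-in for `mir_built`: uses the free function to decide
    /// whether to skip its passes.
    fn mir_built(&self, def: u32) -> Body {
        let cached = self.mir_built_cache.borrow().get(&def).cloned();
        if let Some(body) = cached {
            return body;
        }
        let mut body = Body { literal: self.sources[&def], passes_run: false };
        if trivial_const_check(&body, &self.checks).is_none() {
            body.passes_run = true;
        }
        self.mir_built_cache.borrow_mut().insert(def, body.clone());
        body
    }

    /// Memoized stand-in for the `trivial_const` query: fetches the body via
    /// `mir_built`, then analyzes it a second time.
    fn trivial_const(&self, def: u32) -> Option<u64> {
        let cached = self.trivial_const_cache.borrow().get(&def).copied();
        if let Some(result) = cached {
            return result;
        }
        let body = self.mir_built(def);
        let result = trivial_const_check(&body, &self.checks);
        self.trivial_const_cache.borrow_mut().insert(def, result);
        result
    }
}

fn main() {
    let tcx = Tcx::new(HashMap::from([(0, Some(7)), (1, None)]));
    println!("{:?} after {} checks", tcx.trivial_const(0), tcx.checks.get());
}
```

In this model the first `trivial_const` call checks triviality twice (once inside `mir_built`, once in the query itself), and later calls hit the cache, which is the redundancy the reviewer asks about.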

Member

That code flow feels worth putting in a comment somewhere.

Also, doesn't that mean we do the work of checking triviality twice?

Member Author

I added a comment describing this code flow. It also notes that we check for triviality more than once, which is undesirable.

@oli-obk
Contributor

oli-obk commented Oct 27, 2025

@bors r+

@bors
Collaborator

bors commented Oct 27, 2025

📌 Commit a63035f has been approved by oli-obk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 27, 2025
@bors
Collaborator

bors commented Oct 27, 2025

⌛ Testing commit a63035f with merge 4b53279...

@bors
Collaborator

bors commented Oct 27, 2025

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 4b53279 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 27, 2025
@bors bors merged commit 4b53279 into rust-lang:master Oct 27, 2025
12 checks passed
@rustbot rustbot added this to the 1.93.0 milestone Oct 27, 2025
@github-actions
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 34a8c73 (parent) -> 4b53279 (this PR)

Test differences

10 doctest diffs were found. These are ignored, as they are noisy.

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 4b53279854fcc60b063398181f5dc13ddc319cb8 --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. aarch64-apple: 7133.3s -> 10864.5s (52.3%)
  2. i686-gnu-1: 7316.0s -> 8265.3s (13.0%)
  3. aarch64-gnu-llvm-20-2: 2294.2s -> 2571.1s (12.1%)
  4. x86_64-gnu-llvm-20: 2528.1s -> 2826.0s (11.8%)
  5. aarch64-gnu-debug: 3989.6s -> 4445.5s (11.4%)
  6. x86_64-gnu-gcc: 3053.5s -> 3394.6s (11.2%)
  7. x86_64-gnu-miri: 4386.6s -> 4870.1s (11.0%)
  8. i686-gnu-2: 5594.6s -> 6168.3s (10.3%)
  9. arm-android: 5817.8s -> 6386.1s (9.8%)
  10. tidy: 173.1s -> 189.9s (9.7%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Collaborator

Finished benchmarking commit (4b53279): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Our benchmarks found a performance regression caused by this PR.
This might be an actual regression, but it can also be just noise.

Next Steps:

  • If the regression was expected or you think it can be justified,
    please write a comment with sufficient written justification, and add
    @rustbot label: +perf-regression-triaged to it, to mark the regression as triaged.
  • If you think that you know of a way to resolve the regression, try to create
    a new PR with a fix for the regression.
  • If you do not understand the regression or you think that it is just noise,
    you can ask the @rust-lang/wg-compiler-performance working group for help (members of this group
    were already notified of this PR).

@rustbot label: +perf-regression
cc @rust-lang/wg-compiler-performance

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.3% | [0.2%, 0.4%] | 14 |
| Regressions ❌ (secondary) | 0.5% | [0.0%, 2.1%] | 36 |
| Improvements ✅ (primary) | -3.7% | [-15.5%, -0.1%] | 71 |
| Improvements ✅ (secondary) | -4.0% | [-8.2%, -0.0%] | 22 |
| All ❌✅ (primary) | -3.0% | [-15.5%, 0.4%] | 85 |

Max RSS (memory usage)

Results (primary -3.5%, secondary 2.5%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 5.0% | [1.5%, 7.5%] | 7 |
| Improvements ✅ (primary) | -3.5% | [-9.5%, -0.5%] | 29 |
| Improvements ✅ (secondary) | -1.9% | [-2.6%, -0.5%] | 4 |
| All ❌✅ (primary) | -3.5% | [-9.5%, -0.5%] | 29 |

Cycles

Results (primary -6.8%, secondary -4.6%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 2.2% | [2.2%, 2.2%] | 1 |
| Regressions ❌ (secondary) | 2.9% | [2.0%, 3.8%] | 2 |
| Improvements ✅ (primary) | -7.1% | [-13.6%, -2.5%] | 29 |
| Improvements ✅ (secondary) | -5.7% | [-8.2%, -2.2%] | 13 |
| All ❌✅ (primary) | -6.8% | [-13.6%, 2.2%] | 30 |

Binary size

Results (primary -1.7%, secondary -3.8%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.2% | [0.0%, 0.7%] | 64 |
| Regressions ❌ (secondary) | 0.2% | [0.0%, 0.3%] | 17 |
| Improvements ✅ (primary) | -4.9% | [-12.5%, -0.9%] | 40 |
| Improvements ✅ (secondary) | -9.5% | [-14.9%, -0.0%] | 12 |
| All ❌✅ (primary) | -1.7% | [-12.5%, 0.7%] | 104 |

Bootstrap: 473.665s -> 474.938s (0.27%)
Artifact size: 390.50 MiB -> 390.59 MiB (0.02%)

@saethlin saethlin deleted the trivial-consts branch October 27, 2025 16:48
@saethlin saethlin added the perf-regression-triaged The performance regression has been triaged. label Oct 27, 2025
@saethlin
Member Author

The regressions are caused by the small growth in the size of the query graph, and they show up on crates that don't benefit from this optimization (or perhaps just don't benefit yet, see my suggestions for how to expand this in the PR description).

@therealprof
Contributor

@saethlin I'm a bit surprised by the binary size regressions, even for optimised builds.

@Kobzol
Member

Kobzol commented Oct 28, 2025

bors added a commit that referenced this pull request Oct 28, 2025
Accept trivial consts based on trivial consts

This is an expansion of #148040.

The previous implementation only accepted trivial consts that assign a literal. For example:
```rust
const A: usize = 0;
const B: usize = A;
```
Before this PR, only `A` was a trivial const. Now `B` is too.
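
One way to picture this expansion is as following a chain of const-to-const references until a literal is found. The sketch below is a toy model under that assumption; `Init`, `trivial_const`, and the name-keyed map are hypothetical and say nothing about how the follow-up PR is actually implemented.

```rust
use std::collections::HashMap;

/// Hypothetical classification of a const's initializer.
#[derive(Clone, Copy)]
enum Init {
    Literal(u64),       // `const A: usize = 0;`
    Path(&'static str), // `const B: usize = A;`
    Other,              // anything that needs real evaluation, e.g. `A + 1`
}

/// Follows `Path` initializers until a literal is reached.
/// Returns `None` for unknown names, non-trivial initializers, and cycles.
fn trivial_const(defs: &HashMap<&'static str, Init>, name: &str) -> Option<u64> {
    let mut current = name;
    // A chain without repeats visits at most `defs.len()` definitions, so
    // this bound makes a cyclic chain terminate with `None`.
    for _ in 0..=defs.len() {
        match *defs.get(current)? {
            Init::Literal(value) => return Some(value),
            Init::Path(next) => current = next,
            Init::Other => return None,
        }
    }
    None
}

fn main() {
    let defs = HashMap::from([
        ("A", Init::Literal(0)),
        ("B", Init::Path("A")),
        ("C", Init::Other),
    ]);
    println!("A: {:?}", trivial_const(&defs, "A"));
    println!("B: {:?}", trivial_const(&defs, "B"));
    println!("C: {:?}", trivial_const(&defs, "C"));
}
```

In this model `B` resolves through `A` to the same literal, while `C` still requires ordinary evaluation.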
github-actions bot pushed a commit to rust-lang/rustc-dev-guide that referenced this pull request Nov 3, 2025
Accept trivial consts based on trivial consts

This is an expansion of rust-lang/rust#148040.

The previous implementation only accepted trivial consts that assign a literal. For example:
```rust
const A: usize = 0;
const B: usize = A;
```
Before this PR, only `A` was a trivial const. Now `B` is too.
@Kobzol
Member

Kobzol commented Nov 3, 2025

Improvements greatly outweigh the regressions.

@rustbot label: +perf-regression-triaged

Labels

merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

9 participants