-
Notifications
You must be signed in to change notification settings - Fork 13.9k
Add a fast path for lowering trivial consts #148040
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Add a fast path for lowering trivial consts
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (4c15d20): comparison URL. Overall result: ❌✅ regressions and improvements - BENCHMARK(S) FAILEDBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never ❗ ❗ ❗ ❗ ❗
❗ ❗ ❗ ❗ ❗ Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -0.2%, secondary 2.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -1.5%, secondary -0.4%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -0.6%, secondary 0.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 476.496s -> 475.064s (-0.30%) |
This comment has been minimized.
This comment has been minimized.
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Add a fast path for lowering trivial consts
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (4931b5e): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -0.8%, secondary -1.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -9.2%, secondary -1.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -0.7%, secondary 0.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 474.337s -> 474.199s (-0.03%) |
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Add a fast path for lowering trivial consts
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| matches!(tcx.def_kind(def), DefKind::AssocConst | DefKind::Const | DefKind::AnonConst) | ||
| } | ||
|
|
||
| fn trivial_const_provider<'tcx>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems worth moving this query impl into a separate file, to avoid growing this already-big file even bigger.
Also please add more comments, in particular explaining the overall contract this query must abide by.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a big doc comment to the main function that has all the logic that explains the contract.
| tcx.ensure_done().coroutine_by_move_body_def_id(def); | ||
| } | ||
|
|
||
| // the `trivial_const` query uses mir_built, so make sure it is run. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mir_built also uses trivial_const, so I am confused... this sounds cyclic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tcx.mir_built uses the free function, not the query. So it's not cyclic, but if the first call is tcx.trivial_const, we call tcx.mir_built to get the body, which checks if the Body is trivial in order to skip its passes internally, then returns that to trivial_const_provider which analyzes the Body again.
It's a bit goofy, but that's all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That code flow feels worth putting in a comment somewhere.
Also, doesn't that mean we do the work of checking triviality twice?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noting that I did add this to a comment in which I noted that we are checking for triviality multiple times which is undesirable.
|
@bors r+ |
|
☀️ Test successful - checks-actions |
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing 34a8c73 (parent) -> 4b53279 (this PR) Test differencesShow 10 test diffs10 doctest diffs were found. These are ignored, as they are noisy. Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard 4b53279854fcc60b063398181f5dc13ddc319cb8 --output-dir test-dashboardAnd then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
|
Finished benchmarking commit (4b53279): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowOur benchmarks found a performance regression caused by this PR. Next Steps:
@rustbot label: +perf-regression Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -3.5%, secondary 2.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -6.8%, secondary -4.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -1.7%, secondary -3.8%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 473.665s -> 474.938s (0.27%) |
|
The regressions are caused by the small growth in the size of the query graph, and they show up on crates that don't benefit from this optimization (or perhaps just don't benefit yet, see my suggestions for how to expand this in the PR description). |
|
@saethlin I'm a bit surprised by the binary size regressions, even for optimised builds. |
|
Note that we don't have many binaries in the perf. suite, those did not experience any binary size changes (https://perf.rust-lang.org/compare.html?start=34a8c7368c84fc699fc83a8851a02f93fd655931&end=4b53279854fcc60b063398181f5dc13ddc319cb8&stat=size%3Alinked_artifact&library=false&nonRelevant=true). The binary size changes reported are mostly in the crate metadata (https://perf.rust-lang.org/compare.html?start=34a8c7368c84fc699fc83a8851a02f93fd655931&end=4b53279854fcc60b063398181f5dc13ddc319cb8&stat=size%3Acrate_metadata&nonRelevant=true). |
Accept trivial consts based on trivial consts This is an expansion of #148040. The previous implementation only accepted trivial consts that assign a literal. For example: ```rust const A: usize = 0; const B: usize = A; ``` Before this PR, only `A` was a trivial const. Now `B` is too.
Accept trivial consts based on trivial consts This is an expansion of rust-lang/rust#148040. The previous implementation only accepted trivial consts that assign a literal. For example: ```rust const A: usize = 0; const B: usize = A; ``` Before this PR, only `A` was a trivial const. Now `B` is too.
|
Improvements greatly outweigh the regressions. @rustbot label: +perf-regression-triaged |
The objective of this PR is to improve compilation performance for crates that define a lot of trivial consts. This is a flamegraph of a build of a library crate that is just 100,000 trivial consts, taken from a nightly compiler:

My objective is to target all of the cycles in
eval_to_const_value_rawthat are not part ofmir_built, because if you look at themir_builtfor a trivial const, we already have the value available.In this PR, the definition of a trivial const is this:
Specifically, we look for if the
mir_builtbody is a single basic block containing one assign statement and a return terminator, where the assign statement assigns anOperand::Constant(Const::Val). The MIR dumps for these look like:The implementation is built around a new query,
trivial_const(LocalDefId) -> Option<(ConstValue, Ty)>which returns the contents of theConst::Valin themir_builtif theLocalDefIdis a trivial const.Then I added debug assertions to the beginning of
mir_for_ctfeandmir_promotedto prevent trying to get the body of a trivial const, because that would defeat the optimization here. But these are deliberately debug assertions because the consequence of failing the assertion is that compilation is slow, not corrupt. If we made these hard assertions, I'm sure there are obscure scenarios people will run into where the compiler would ICE instead of continuing on compilation, just a bit slower. I'd like to know about those, but I do not think serving up an ICE is worth it.With the assertions in place, I just added logic around all the places they were hit, to skip over trying to analyze the bodies of trivial consts.
In the future, I'd like to see this work extended by:
_1 = const 0_usize; _0 = &_1, which would make a lot of promoteds into trivial constsconst A: usize = B, which haveOperand::Constant(Const::Unevaluated)