-
Notifications
You must be signed in to change notification settings - Fork 13.7k
pub async fn impl is monomorphized when func itself is monomorphized #143290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I don't think this needs a -Z flag. It makes a lot of sense to just change this everywhere. We can then benchmark it in the benchmark suite, too. A similar change could be done for iterator or closure returning functions. On that note: instead of collecting nested bodies in general, wouldn't it be slightly more correct to collect types that are in the opaque (non-opque types must already have all their impls monomorphized, as they are publicly reachable) return type of monomorphized functions? For opaque types we'd only need to monomorphize the trait impls that the opaque type has in its bounds cc @compiler-errors for thoughts as you wrote #135314 |
d6bb74b
to
a6b81d5
Compare
Some changes occurred in coverage tests. cc @Zalathar |
a6b81d5
to
d6e2c24
Compare
Flag removed, behaviour changed to be default. |
This comment has been minimized.
This comment has been minimized.
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
pub async fn impl is monomorphized when func itself is monomorphized Implentation coroutine (`func::{closure#0}`) is monomorphized, when func itself is monomorphized. Currently, when `pub async fn foo(..)` is exported from lib and used in several dependent crates, only 'header' function is monomorphized in the defining crate. 'header' function, returning coroutine object, is monomorphized, but the coroutine's poll function (which actually implements all the logic for the function) is not. In such situation, `func::{closure#0}` will be monomorphized in every dependency. This PR adds monomorphization for `func::{closure#0}` (coroutine poll function), when func itself is monomorphized. Simple test with one lib async function and ten dependent crates (executable) that use the function, shows 5-7% compilation time improvement (single-threaded).
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (9443527): comparison URL. Overall result: ❌ regressions - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -3.1%, secondary 2.4%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 2.5%, secondary 4.9%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (secondary 9.4%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 462.08s -> 462.853s (0.17%) |
Some tests need blessing The performance regression is expected, as the regressing benchmark has public |
In particular, you will need to set |
d6e2c24
to
57c2602
Compare
This comment has been minimized.
This comment has been minimized.
It looks like '-Z print-type-sizes' shows different results for x86_64 and aarch64 for some tests. Are there some helpfull flags to remove the difference between targets (except Also, I performed |
you could limit the test to one platform with |
57c2602
to
e6b35f4
Compare
☔ The latest upstream changes (presumably #144249) made this pull request unmergeable. Please resolve the merge conflicts. |
oh sorry I didn't realize you addressed this. @bors delegate+ r=me after a rebase |
This comment has been minimized.
This comment has been minimized.
💔 Test failed - checks-actions |
97b9fc4
to
961e96a
Compare
@bors r=oli-obk |
pub async fn impl is monomorphized when func itself is monomorphized Implentation coroutine (`func::{closure#0}`) is monomorphized, when func itself is monomorphized. Currently, when `pub async fn foo(..)` is exported from lib and used in several dependent crates, only 'header' function is monomorphized in the defining crate. 'header' function, returning coroutine object, is monomorphized, but the coroutine's poll function (which actually implements all the logic for the function) is not. In such situation, `func::{closure#0}` will be monomorphized in every dependency. This PR adds monomorphization for `func::{closure#0}` (coroutine poll function), when func itself is monomorphized. Simple test with one lib async function and ten dependent crates (executable) that use the function, shows 5-7% compilation time improvement (single-threaded).
This comment has been minimized.
This comment has been minimized.
💔 Test failed - checks-actions |
…hized, when func itself is monomorphized
961e96a
to
c2c58cb
Compare
@bors r=oli-obk |
☀️ Test successful - checks-actions |
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing 84a1747 (parent) -> c0bb3b9 (this PR) Test differencesShow 13 test diffsStage 1
Stage 2
Additionally, 2 doctest diffs were found. These are ignored, as they are noisy. Job group index
Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard c0bb3b98bb7aac24a37635e5d36d961e0b14f435 --output-dir test-dashboard And then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
Finished benchmarking commit (c0bb3b9): comparison URL. Overall result: ❌ regressions - please read the text belowOur benchmarks found a performance regression caused by this PR. Next Steps:
@rustbot label: +perf-regression Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (secondary 4.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary 4.7%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (secondary 9.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 468.417s -> 466.41s (-0.43%) |
The regression is expected, as we now do more work for async fns in their crate, with the hope of reducing the amount of work required in downstream crates. @rustbot label: +perf-regression-triaged |
Implentation coroutine (
func::{closure#0}
) is monomorphized, when func itself is monomorphized.Currently, when
pub async fn foo(..)
is exported from lib and used in several dependent crates, only 'header' function is monomorphized in the defining crate. 'header' function, returning coroutine object, is monomorphized, but the coroutine's poll function (which actually implements all the logic for the function) is not. In such situation,func::{closure#0}
will be monomorphized in every dependency.This PR adds monomorphization for
func::{closure#0}
(coroutine poll function), when func itself is monomorphized.Simple test with one lib async function and ten dependent crates (executable) that use the function, shows 5-7% compilation time improvement (single-threaded).