Micro-optimization attempt in coroutine layout computation #147858
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt
rustbot has assigned @jdonszelmann. Use
@bors try @rust-timer queue
      
        
@bors try @rust-timer queue
      
        
    // Gather live local types and their indices.
    let mut locals = IndexVec::<CoroutineSavedLocal, _>::new();
    let mut tys = IndexVec::<CoroutineSavedLocal, _>::new();
Can we do something simpler? The original code, but using `IndexVec::with_capacity`?
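For readers following along, here is a minimal sketch of what "keep the push loop, but preallocate" could look like. It is not the compiler's code: plain `Vec` stands in for `IndexVec` so the snippet compiles on its own, and `gather_live_locals`, its parameters, and the choice to count live locals up front are all assumptions made for illustration.

```rust
/// Hypothetical stand-in for the "gather live locals" step: returns the
/// indices and type names of the locals flagged as live.
/// `Vec` is used here in place of rustc's `IndexVec` (an assumption).
fn gather_live_locals(
    local_tys: &[&'static str],
    is_live: &[bool],
) -> (Vec<usize>, Vec<&'static str>) {
    // Count once so both vectors can be allocated up front.
    let live_count = is_live.iter().filter(|&&live| live).count();
    let mut locals = Vec::with_capacity(live_count);
    let mut tys = Vec::with_capacity(live_count);

    // Same push loop as before; the pushes no longer need to reallocate.
    for (idx, (&ty, &live)) in local_tys.iter().zip(is_live).enumerate() {
        if live {
            locals.push(idx);
            tys.push(ty);
        }
    }
    (locals, tys)
}
```

The loop body is unchanged, so a diff of this shape stays small; only the two constructor calls differ from the `new()` version.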
    // Build the coroutine variant field list.
    // Create a map from local indices to coroutine struct indices.
    let mut variant_fields: IndexVec<VariantIdx, IndexVec<FieldIdx, CoroutineSavedLocal>> =
        iter::repeat(IndexVec::new()).take(CoroutineArgs::RESERVED_VARIANTS).collect();
Instead of pushing, could we use `IndexVec::from_elem_n(IndexVec::new(), CoroutineArgs::RESERVED_VARIANTS + live_locals_at_suspension_points.len())`, and then assign to each element?
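A rough sketch of the "allocate everything, then assign" shape suggested here, for illustration only. `Vec` and `vec![elem; n]` stand in for `IndexVec` and `IndexVec::from_elem_n`, the constant value `3` is an assumption rather than something stated in this thread, and `build_variant_fields` is a hypothetical helper, not a function in rustc.

```rust
/// Assumed number of reserved variants, for illustration only.
const RESERVED_VARIANTS: usize = 3;

/// Hypothetical helper: builds one field list per variant. The reserved
/// variants stay empty; each suspension point gets its own live locals.
fn build_variant_fields(live_at_suspension_points: &[Vec<usize>]) -> Vec<Vec<usize>> {
    // Allocate every slot up front (the `Vec` analogue of `from_elem_n`).
    let mut variant_fields: Vec<Vec<usize>> =
        vec![Vec::new(); RESERVED_VARIANTS + live_at_suspension_points.len()];

    // Assign to each element instead of pushing new ones.
    for (i, live) in live_at_suspension_points.iter().enumerate() {
        variant_fields[RESERVED_VARIANTS + i] = live.clone();
    }
    variant_fields
}
```

The outer vector is sized exactly once, and the index arithmetic makes the offset past the reserved variants explicit.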
@cjgillot both suggestions sound good, but I'd like to try them after this perf run to see if they have any effect on it
      
        
Finished benchmarking commit (32cc823): comparison URL.
Overall result: no relevant changes - no action needed
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
- Instruction count: This benchmark run did not return any relevant results for this metric.
- Max RSS (memory usage): This benchmark run did not return any relevant results for this metric.
- Cycles: Results (primary 2.4%, secondary 3.8%). A less reliable metric. May be of interest, but not used to determine the overall result above.
- Binary size: This benchmark run did not return any relevant results for this metric.
- Bootstrap: 471.955s -> 474.126s (0.46%)
Force-pushed: 6f682c2 to a67c615
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
@cjgillot might have gone a bit overboard here with the  I don't mind just closing this PR since it doesn't affect perf, maybe it's not worth touching this at all.
r? compiler
@bors try @rust-timer queue
      
        
Finished benchmarking commit (f6307c2): comparison URL.
Overall result: no relevant changes - no action needed
Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.
@bors rollup=never
- Instruction count: This benchmark run did not return any relevant results for this metric.
- Max RSS (memory usage): Results (primary 0.5%). A less reliable metric. May be of interest, but not used to determine the overall result above.
- Cycles: Results (secondary -4.3%). A less reliable metric. May be of interest, but not used to determine the overall result above.
- Binary size: This benchmark run did not return any relevant results for this metric.
- Bootstrap: 473.373s -> 475.989s (0.55%)
@bors r+ rollup
Rollup of 9 pull requests Successful merges: - #138217 (Turn `Cow::is_borrowed,is_owned` into associated functions.) - #147858 (Micro-optimization attempt in coroutine layout computation) - #147923 (Simplify rustc_public context handling) - #147935 (Add LLVM realtime sanitizer) - #148115 (rustdoc: Rename unstable option `--nocapture` to `--no-capture` in accordance with `libtest`) - #148137 (Couple of changes for Redox OS) - #148176 ([rustdoc] Include attribute and derive macros when filtering on "macros") - #148193 (Remove `QPath::LangItem`) - #148253 (Handle default features and -Ctarget-features in the dummy backend) r? `@ghost` `@rustbot` modify labels: rollup
Rollup of 8 pull requests Successful merges: - #138217 (Turn `Cow::is_borrowed,is_owned` into associated functions.) - #147858 (Micro-optimization attempt in coroutine layout computation) - #147923 (Simplify rustc_public context handling) - #148115 (rustdoc: Rename unstable option `--nocapture` to `--no-capture` in accordance with `libtest`) - #148137 (Couple of changes for Redox OS) - #148176 ([rustdoc] Include attribute and derive macros when filtering on "macros") - #148253 (Handle default features and -Ctarget-features in the dummy backend) - #148272 (Align VEX V5 boot routine to 4 bytes) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of #147858 - yotamofek:pr/mir/coroutine-layout-opt, r=cjgillot Micro-optimization attempt in coroutine layout computation In `compute_layout`, there were a bunch of collections (`IndexVec`s) that were being created by `push`ing in a loop, instead of a, hopefully, more performant usage of iterator combinators. [Second commit](6f682c2) is just a small cleanup. I'd love a perf run to see if this shows up in benchmarks.
In `compute_layout`, there were a bunch of collections (`IndexVec`s) that were being created by `push`ing in a loop, instead of a, hopefully, more performant usage of iterator combinators. Second commit is just a small cleanup.
I'd love a perf run to see if this shows up in benchmarks.
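To make the description concrete, here is a small sketch of the two styles being compared. It is not taken from `compute_layout`: plain `Vec` stands in for `IndexVec`, and `saved_tys_push`, `saved_tys_iter`, and the `(type name, is live)` input shape are hypothetical names chosen for illustration.

```rust
/// Push-in-a-loop style: grow the vector one element at a time.
fn saved_tys_push(decls: &[(&'static str, bool)]) -> Vec<&'static str> {
    let mut out = Vec::new();
    for &(ty, live) in decls {
        if live {
            out.push(ty);
        }
    }
    out
}

/// Iterator-combinator style: describe the pipeline and let `collect`
/// build the vector.
fn saved_tys_iter(decls: &[(&'static str, bool)]) -> Vec<&'static str> {
    decls
        .iter()
        .filter(|&&(_, live)| live)
        .map(|&(ty, _)| ty)
        .collect()
}
```

Both functions return the same result. Whether the second form is measurably faster depends on the workload; the perf runs in this thread reported no relevant changes.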