Pass fat pointers in two immediate arguments #26411
| r? @pnkfelix (rust_highfive has picked a reviewer for you, use r? to override) | 
| This is still WIP because the seemingly fragile debug backtrace test keeps failing on me because of different inlining behaviour or so, but I wanted to put this up to get some feedback on it. | 
| Will this apply to: struct NotAFatPointer { x: usize, y: isize }? | 
| No. For now this handles fat pointers only. I still plan to revamp the way 
 | 
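To make the distinction in the question concrete, here is a minimal sketch (not taken from the PR) that compares the sizes of thin pointers, fat pointers, and the `NotAFatPointer` struct. It uses the modern `&dyn Trait` spelling; the helper `words` is a hypothetical name introduced only for this illustration. The struct happens to occupy the same number of words as a fat pointer, but it is an ordinary aggregate, which is why this change does not cover it.

```rust
use std::fmt::Debug;
use std::mem::size_of;

// The struct from the question above: two machine words, but not a fat pointer.
#[allow(dead_code)]
struct NotAFatPointer {
    x: usize,
    y: isize,
}

/// Hypothetical helper: size of `T` measured in pointer-sized words.
fn words<T>() -> usize {
    size_of::<T>() / size_of::<usize>()
}

fn main() {
    // Thin pointer: just an address.
    println!("&u8            : {} word(s)", words::<&u8>());
    // Fat pointers: data pointer plus length (slices, str) or vtable pointer (trait objects).
    println!("&[u8]          : {} word(s)", words::<&[u8]>());
    println!("&str           : {} word(s)", words::<&str>());
    println!("&dyn Debug     : {} word(s)", words::<&dyn Debug>());
    // Same size as a fat pointer, but an ordinary struct.
    println!("NotAFatPointer : {} word(s)", words::<NotAFatPointer>());
}
```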
| What are the long-term goals for argument passing? Does it make sense to get it as close to some well-established ABIs (like System V on x86_64) as possible? This PR looks like a step in that direction. And maybe a silly question: why aren't fat pointers moved around as (one or two) immediates everywhere, and not only in arguments? Did anyone try it, and is there any difference (during and after optimization) compared to the current scheme? | 
| @petrochenkov the answer to your question is just that what we have now works. More specifically, before we had fat pointers we only handled word-sized or smaller types as immediates. So it stuck when we added fat pointers. | 
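As a conceptual illustration only (this is hand-written source code, not what the compiler emits), passing a slice as two immediates resembles passing its data pointer and its length as two separate scalar arguments. The function names `sum_fat` and `sum_split` are hypothetical and exist only for this sketch.

```rust
/// Takes the slice as a single Rust-level argument (a fat pointer).
fn sum_fat(data: &[u32]) -> u32 {
    data.iter().sum()
}

/// Hypothetical hand-written equivalent that takes the two halves separately.
///
/// # Safety
/// `ptr` must point to `len` initialized, readable `u32` values.
unsafe fn sum_split(ptr: *const u32, len: usize) -> u32 {
    // Rebuild the fat pointer from its two word-sized halves.
    let data = unsafe { std::slice::from_raw_parts(ptr, len) };
    data.iter().sum()
}

fn main() {
    let values = [1u32, 2, 3, 4];
    let fat = sum_fat(&values);
    // SAFETY: the pointer and length come from a live array.
    let split = unsafe { sum_split(values.as_ptr(), values.len()) };
    println!("fat = {}, split = {}", fat, split);
}
```

Both calls compute the same sum; the second merely spells out the two word-sized values that make up the slice reference.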
| We're probably going to get closer to the well-established ABIs because some things are plain better than what we currently have. But we'll also have to check where Rust's semantics allow us to do even better. For example, we might be able to statically omit some copies that couldn't be omitted in C when passing things by value. In that case, it might be better if we keep passing pointers to the "copy", instead of using the copy-at-a-fixed-stack-offset mechanism, which usually prohibits plain forwarding of the existing copy and needs a new copy for each callee. | 
| So the backtrace-debuginfo test fails with optimizations enabled because LLVM can tail-merge the blocks that call  https://github.com/dotdash/rust/blob/e4872167f5dda8eebc3b68a2050f870fa4457b50/src/test/run-pass/backtrace-debuginfo.rs#L92 Does anybody have an idea how to "work around" that optimization? If not, I'd like to just remove the second call, as that seems to be testing LLVM rather than rust. | 
e487216 to 986be42
    | With the debug backtrace test fixed, this passes the test suite for me locally, so I consider this ready now. | 
| ☔ The latest upstream changes (presumably #26351) made this pull request unmergeable. Please resolve the merge conflicts. | 
…argument attributes This makes it a lot easier to later add attributes for fat pointers.
…ts result This makes it easier to support translating a single rust argument to more than one llvm argument value later.
986be42 to a3d66ae
    | Rebased | 
| @bors r+ | 
| 📌 Commit a3d66ae has been approved by  | 
| ⌛ Testing commit a3d66ae with merge e25c15b... | 
| 💔 Test failed - auto-mac-32-opt | 
This has a number of advantages compared to creating a copy in memory and passing a pointer. The obvious one is that we don't have to put the data into memory but can keep it in registers. Since we're currently passing a pointer anyway (instead of using e.g. a known offset on the stack, which is what the `byval` attribute would achieve), we only use a single additional register for each fat pointer, but save at least two pointers worth of stack in exchange (sometimes more because more than one copy gets eliminated). On archs that pass arguments on the stack, we save a pointer worth of stack even without considering the omitted copies.

Additionally, LLVM can optimize the code a lot better, to a large degree due to the fact that lots of copies are gone or can be optimized away. Additionally, we can now emit attributes like nonnull on the data and/or vtable pointers contained in the fat pointer, potentially allowing for even more optimizations.

This results in LLVM passes being about 3-7% faster (depending on the crate), and the resulting code is also a few percent smaller, for example:

| text | data | filename |
|------|------|----------|
| 5671479 | 3941461 | before/librustc-d8ace771.so |
| 5447663 | 3905745 | after/librustc-d8ace771.so |
| 1944425 | 2394024 | before/libstd-d8ace771.so |
| 1896769 | 2387610 | after/libstd-d8ace771.so |

I had to remove a call in the backtrace-debuginfo test, because LLVM can now merge the tails of some blocks when optimizations are turned on, which can't correctly preserve line info.

Fixes rust-lang#22924
Cc rust-lang#22891 (at least for fat pointers the code is good now)
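To put "a few percent smaller" into numbers, the following sketch computes the relative reduction from the size figures quoted above. The helper name `reduction_percent` and the row labels are made up for this illustration; only the byte counts come from the commit message.

```rust
/// Hypothetical helper: relative size reduction from `before` to `after`, in percent.
fn reduction_percent(before: u64, after: u64) -> f64 {
    (before as f64 - after as f64) / before as f64 * 100.0
}

fn main() {
    // (label, before, after): byte counts copied from the size figures above.
    let rows = [
        ("librustc text", 5_671_479u64, 5_447_663u64),
        ("librustc data", 3_941_461, 3_905_745),
        ("libstd text", 1_944_425, 1_896_769),
        ("libstd data", 2_394_024, 2_387_610),
    ];
    for &(label, before, after) in rows.iter() {
        println!("{:<14} {:.2}% smaller", label, reduction_percent(before, after));
    }
}
```

By this calculation the text section shrinks by roughly 3.9% for librustc and 2.5% for libstd, while the data section changes by less than 1% in both cases.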
a3d66ae to f777562
    | @bors r=aatch | 
| 📌 Commit f777562 has been approved by  | 