Skip to content

Fix linker-plugin-lto only doing thin lto #136840

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Flakebi
Copy link
Contributor

@Flakebi Flakebi commented Feb 10, 2025

When rust provides LLVM bitcode files to lld and the bitcode contains
function summaries as used for thin lto, lld defaults to using thin lto.
This prevents some optimizations that are only applied for fat lto.

We solve this by not creating function summaries when fat lto is
enabled. The bitcode for the module is just directly written out.

An alternative solution would be to set the ThinLTO=0 module flag to
signal lld to do fat lto.
The code in clang that sets this flag is here:
https://github.com/llvm/llvm-project/blob/560149b5e3c891c64899e9912e29467a69dc3a4c/clang/lib/CodeGen/BackendUtil.cpp#L1150
The code in LLVM that queries the flag and defaults to thin lto if not
set is here:
https://github.com/llvm/llvm-project/blob/e258bca9505f35e0a22cb213a305eea9b76d11ea/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp#L4441-L4446

@rustbot
Copy link
Collaborator

rustbot commented Feb 10, 2025

Could not assign reviewer from: workingjubilee.
User(s) workingjubilee are either the PR author, already assigned, or on vacation. Please use r? to specify someone else to assign.

@rustbot
Copy link
Collaborator

rustbot commented Feb 10, 2025

r? @jieyouxu

rustbot has assigned @jieyouxu.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-run-make Area: port run-make Makefiles to rmake.rs S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 10, 2025
@rustbot
Copy link
Collaborator

rustbot commented Feb 10, 2025

This PR modifies tests/run-make/. If this PR is trying to port a Makefile
run-make test to use rmake.rs, please update the
run-make port tracking issue
so we can track our progress. You can either modify the tracking issue
directly, or you can comment on the tracking issue and link this PR.

cc @jieyouxu

@Flakebi Flakebi mentioned this pull request Feb 10, 2025
21 tasks
@workingjubilee
Copy link
Member

I have barely any idea about LTO besides "it happens and it involves dlopening a compiler and shoving its serialized data back in it" tbh soo

@jieyouxu
Copy link
Member

Unfortunately I have no clue either, so

r? compiler

@rustbot rustbot assigned fee1-dead and unassigned jieyouxu Feb 11, 2025
@Flakebi
Copy link
Contributor Author

Flakebi commented Feb 11, 2025

For reference, the code that switches to thin lto when the flag is not set is here: https://github.com/llvm/llvm-project/blob/e258bca9505f35e0a22cb213a305eea9b76d11ea/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp#L4441-L4446

  // By default we compile with ThinLTO if the module has a summary, but the
  // client can request full LTO with a module flag.
  bool IsThinLTO = true;
  if (auto *MD =
          mdconst::extract_or_null<ConstantInt>(M.getModuleFlag("ThinLTO")))
    IsThinLTO = MD->getZExtValue();

The code in clang that sets the flag, which is replicated here for Rust is here: https://github.com/llvm/llvm-project/blob/560149b5e3c891c64899e9912e29467a69dc3a4c/clang/lib/CodeGen/BackendUtil.cpp#L1150

        if (!TheModule->getModuleFlag("ThinLTO") && !CodeGenOpts.UnifiedLTO)
          TheModule->addModuleFlag(llvm::Module::Error, "ThinLTO", uint32_t(0));

// Disable ThinLTO if fat lto is requested. Otherwise lld defaults to thin lto.
if sess.lto() == config::Lto::Fat {
llvm::add_module_flag_u32(llmod, llvm::ModuleFlagMergeBehavior::Override, "ThinLTO", 0);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if a dependency is built with lto=true (aka lto=fat), but then the user wants to use thinLTO? I'm pretty sure the standard library is built with lto=true for example, but that shouldn't prevent thinLTO from ever working.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, it seems to change somewhat, but still work in general. I added a test for this.
What changes: Without this change, the test passes when
lib is compiled with O0 and main with O3 and

  1. lib uses lto=thin and main uses lto=thin
  2. lib uses lto=thin and main uses lto=fat
  3. lib uses lto=fat and main uses lto=thin
  4. lib uses lto=fat and main uses lto=fat

With this change, all of these keep passing except for case 3 (lib uses lto=fat and main uses lto=thin).
When lib is compiled with O1, O2 or O3, case 3 passes as well.
I assume this is the important case, as the standard library is compiled with optimizations.
(And lto with O0 is kinda questionable, except maybe for nvptx and amdgpu, but they require lto=fat anyway.)

@rust-log-analyzer

This comment has been minimized.

@fee1-dead
Copy link
Member

r? compiler

@rustbot rustbot assigned SparrowLii and unassigned fee1-dead Feb 16, 2025
Copy link
Member

@SparrowLii SparrowLii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know little about lto, either. I think it would be much more acceptable if this PR could limit the change to amdhsa conditions.

r? compiler

@@ -290,6 +290,11 @@ pub(crate) unsafe fn create_module<'ll>(
);
}

// Disable ThinLTO if fat lto is requested. Otherwise lld defaults to thin lto.
Copy link
Member

@SparrowLii SparrowLii Feb 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds counterintuitive. Can you explain the relationship between the user's lto option and llvm's lto in the comments?

And I think it needs a individual test to ensure that the previous lto=fat option is not affected

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the comment, is it clearer now?

(I want to affect the current lto=fat option, as it currently does thin lto, which I think is not intended and a bug :))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that the current behavior is surprising, since I ran into this issue with autodiff and it took multiple months to find out that despite selecting lto=fat there is still a thin-lto module in the compilation pipeline.

@rustbot rustbot assigned Nadrieril and BoxyUwU and unassigned SparrowLii Feb 17, 2025
@Kobzol
Copy link
Member

Kobzol commented Feb 17, 2025

@dianqk Does this interact with your recent patch?

@dianqk
Copy link
Member

dianqk commented Feb 17, 2025

@dianqk Does this interact with your recent patch?

IMO, they aren't directly related.

@Nadrieril
Copy link
Member

r? codegen

@rustbot rustbot assigned saethlin and unassigned Nadrieril and BoxyUwU Feb 17, 2025
@Flakebi
Copy link
Contributor Author

Flakebi commented Mar 28, 2025

Keep check for emit_thin_lto and revert change in tests/run-make/issue-84395-lto-embed-bitcode/rmake.rs (diff).
Thanks for the quick reviews!

Fix the test for #84395 by removing the -Zemit-thin-lto=no flag and instead setting -Clto=fat (diff to previous version). As mentioned in that issue, clang also requires setting -flto when using -lto-embed-bitcode=optimized and otherwise runs into the same issue, so I guess that’s another improvement.

Hmm, I think -Zemit-thin-lto=no and -Clto=fat have different meanings, one is not submitting the Thin LTO buffer, while the other is performing Rust's Fat LTO, but -Clto=fat should imply -Zemit-thin-lto=no.

Should they be different? As far as I understand #84395, -Zemit-thin-lto was introduced as a workaround to make it somehow work, not as a proper fix. According to llvm/llvm-project#86946, clang behaves the same as Rust (with the fix in this PR): fat lto works with -lto-embed-bitcode=optimized and thin lto doesn’t. I think the problem that thin lto doesn’t work needs a fix in lld. Neither rustc nor clang are in a good position to fix this, unless we want to inspect linker flags and behave differently based on that.

Btw, -Zemit-thin-lto=no should also fix your issue?

It does, but I think that’s a workaround that should not be needed :)

@dianqk
Copy link
Member

dianqk commented Mar 28, 2025

Fix the test for #84395 by removing the -Zemit-thin-lto=no flag and instead setting -Clto=fat (diff to previous version). As mentioned in that issue, clang also requires setting -flto when using -lto-embed-bitcode=optimized and otherwise runs into the same issue, so I guess that’s another improvement.

Hmm, I think -Zemit-thin-lto=no and -Clto=fat have different meanings, one is not submitting the Thin LTO buffer, while the other is performing Rust's Fat LTO, but -Clto=fat should imply -Zemit-thin-lto=no.

Should they be different? As far as I understand #84395, -Zemit-thin-lto was introduced as a workaround to make it somehow work, not as a proper fix. According to llvm/llvm-project#86946, clang behaves the same as Rust (with the fix in this PR): fat lto works with -lto-embed-bitcode=optimized and thin lto doesn’t. I think the problem that thin lto doesn’t work needs a fix in lld. Neither rustc nor clang are in a good position to fix this, unless we want to inspect linker flags and behave differently based on that.

This will be useful when we don't want to perform Rust's own LTO.

@dianqk
Copy link
Member

dianqk commented Mar 28, 2025

@bors r+

@bors
Copy link
Collaborator

bors commented Mar 28, 2025

📌 Commit 660d1e2 has been approved by dianqk

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 28, 2025
@bors
Copy link
Collaborator

bors commented Mar 28, 2025

⌛ Testing commit 660d1e2 with merge a5112c8...

bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 28, 2025
Fix linker-plugin-lto only doing thin lto

When rust provides LLVM bitcode files to lld and the bitcode contains
function summaries as used for thin lto, lld defaults to using thin lto.
This prevents some optimizations that are only applied for fat lto.

We solve this by not creating function summaries when fat lto is
enabled. The bitcode for the module is just directly written out.

An alternative solution would be to set the `ThinLTO=0` module flag to
signal lld to do fat lto.
The code in clang that sets this flag is here:
https://github.com/llvm/llvm-project/blob/560149b5e3c891c64899e9912e29467a69dc3a4c/clang/lib/CodeGen/BackendUtil.cpp#L1150
The code in LLVM that queries the flag and defaults to thin lto if not
set is here:
https://github.com/llvm/llvm-project/blob/e258bca9505f35e0a22cb213a305eea9b76d11ea/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp#L4441-L4446
@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Collaborator

bors commented Mar 28, 2025

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 28, 2025
@jieyouxu
Copy link
Member

jieyouxu commented May 1, 2025

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 1, 2025
@rustbot
Copy link
Collaborator

rustbot commented May 1, 2025

Reminder, once the PR becomes ready for a review, use @rustbot ready.

VlaDexa added a commit to VlaDexa/rust that referenced this pull request May 2, 2025
…ianqk

Fix linker-plugin-lto only doing thin lto

When rust provides LLVM bitcode files to lld and the bitcode contains
function summaries as used for thin lto, lld defaults to using thin lto.
This prevents some optimizations that are only applied for fat lto.

We solve this by not creating function summaries when fat lto is
enabled. The bitcode for the module is just directly written out.

An alternative solution would be to set the `ThinLTO=0` module flag to
signal lld to do fat lto.
The code in clang that sets this flag is here:
https://github.com/llvm/llvm-project/blob/560149b5e3c891c64899e9912e29467a69dc3a4c/clang/lib/CodeGen/BackendUtil.cpp#L1150
The code in LLVM that queries the flag and defaults to thin lto if not
set is here:
https://github.com/llvm/llvm-project/blob/e258bca9505f35e0a22cb213a305eea9b76d11ea/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp#L4441-L4446
@GuillaumeGomez
Copy link
Member

Since it failed CI, let's remove it from the merge queue.

@bors r-

@jieyouxu
Copy link
Member

jieyouxu commented May 9, 2025

Since it failed CI, let's remove it from the merge queue.

Weird, that should've been kicked out automatically because full CI failed.

@GuillaumeGomez
Copy link
Member

I think bors is a bit behind. There were 3 merged PRs still in the queue. I ran a synchronization and then these two poped up. So anyway. :')

@bors
Copy link
Collaborator

bors commented Jul 10, 2025

☔ The latest upstream changes (presumably #143731) made this pull request unmergeable. Please resolve the merge conflicts.

Flakebi added 2 commits July 27, 2025 17:06
When rust provides LLVM bitcode files to lld and the bitcode contains
function summaries as used for thin lto, lld defaults to using thin lto.
This prevents some optimizations that are only applied for fat lto.

We solve this by not creating function summaries when fat lto is
enabled. The bitcode for the module is just directly written out.

An alternative solution would be to set the `ThinLTO=0` module flag to
signal lld to do fat lto.
The code in clang that sets this flag is here:
https://github.com/llvm/llvm-project/blob/560149b5e3c891c64899e9912e29467a69dc3a4c/clang/lib/CodeGen/BackendUtil.cpp#L1150
The code in LLVM that queries the flag and defaults to thin lto if not
set is here:
https://github.com/llvm/llvm-project/blob/e258bca9505f35e0a22cb213a305eea9b76d11ea/llvm/lib/Bitcode/Writer/BitcodeWriter.cpp#L4441-L4446
In CI, the cross-lang-lto-clang passed on x86_64 but not on aarch64.
I don’t have a good way to debug this, but with some local changes
(my clang to cross-compiling to aarch64 uses different target features,
so nothing gets inlined by default; and I wasn’t able to link a rust std
for aarch64), I think I found the culprit.

With fat lto on aarch64, the inline(never) rust function gets inlined.
This is the same behavior as with linking a C library into a Rust
binary, just that it’s different between aarch64 and x86 (fat lto on x86
inlines the noinline C function into the Rust binary but it does
not inline the inline(never) Rust function into the C binary).
@Flakebi Flakebi force-pushed the linker-plugin-lto-fat branch from 660d1e2 to b5b0282 Compare July 29, 2025 22:20
@rustbot rustbot added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Jul 29, 2025
@Flakebi
Copy link
Contributor Author

Flakebi commented Jul 29, 2025

I finally found some time to look at this again. Debugging the aarch64 test failure from an x86 system was not the most straightforward thing 😅

I rebased to fix conflicts, new changes are in the second commit b5b0282.

In CI, the cross-lang-lto-clang passed on x86_64 but not on aarch64. I don’t have a good way to debug this, but with some local changes (my clang to cross-compiling to aarch64 uses different target features, so nothing gets inlined by default; and I wasn’t able to link a rust std for aarch64), I think I found the culprit.
With fat lto on aarch64, the inline(never) rust function gets inlined. This is the same behavior as with linking a C library into a Rust binary, just that it’s different between aarch64 and x86 (fat lto on x86 inlines the noinline C function into the Rust binary but it does not inline the inline(never) Rust function into the C binary).

@rust-log-analyzer

This comment has been minimized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-run-make Area: port run-make Makefiles to rmake.rs S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.