Description
I've observed an unexpected increase in binary size in response to a change in a crate that we use. The change only adds new public methods, which we never call, so all of the changed code is effectively dead code, yet it still results in a significant increase in our binary size. My guess is that the presence of this new code causes LLVM to make different inlining decisions, even though the new code isn't actually called anywhere.
This happens on 1.50.0. The increase (for a minimal binary included below) is from 932 bytes to 2164 bytes.
Switching from 1.50 to 1.51 (currently in beta) without the above change causes the same increase from 932 bytes to 2164 bytes.
I was going to mark this as a stable-to-beta regression, but TBH I think it's probably a pre-existing issue that just triggers in response to legitimate changes in library code. I expect that whatever changed between 1.50 and 1.51 is similar in nature to the library change described above.
I've tarred up a moderately minimal bit of code that reproduces this:
To reproduce, run the `./check-size` script contained within the tarball. You might need to `rustup target install thumbv6m-none-eabi` first.
For me, with current stable 1.50, this shows a change in binary size from 932 bytes to 2164 bytes:
Size prior to commit 77dace37908f281feb9432fc13874475d9dc0765:

```
-rwxr-xr-x 1 dml eng  932 Mar  4 16:39 a.bin
```

Size after commit 77dace37908f281feb9432fc13874475d9dc0765:

```
-rwxr-xr-x 1 dml eng 2164 Mar  4 16:39 b.bin
```
If I adjust the script to use 1.51, then I get 2164 bytes for both.
Looking at the disassembly of each binary, it seems that the larger binary includes `compiler_builtins::int::specialized_div_rem::u64_div_rem`, where the smaller binary doesn't. `u64_div_rem` is called from `__udivmoddi4`, which is called from `__aeabi_uldivmod`. These are also absent from the smaller binary, but present and called from `MicroSecond::cycles` / `Delay::delay` in the larger binary.
`Cargo.toml` sets `opt-level = "s"`. Similar results are observed with `opt-level = "z"`.
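For reference, the relevant profile settings look roughly like this (a sketch; only the `opt-level` and `lto` values are stated above, anything else in the real profile is unknown):

```toml
[profile.release]
opt-level = "s"  # optimize for size; "z" shows similar results
lto = true       # LTO is enabled, as noted below
```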
Given that LTO is enabled, I'd have expected dead code to be removed before inlining decisions were made, so I'm surprised that a change to code that is never called has this effect.
If there's anything we can do to help LLVM make more optimal decisions when optimizing for binary size, that'd be awesome, although I'm sure it's a pretty difficult problem.