Skip to content

Conversation

@alexheretic
Copy link
Contributor

@alexheretic alexheretic commented Nov 11, 2025

This is an interesting performance trade-off. We can significantly improve inline SmolStr::clone performance at the expense of heap performance if we use #[cold] & #[inline(never)] hints against the latter.

This is a better result for users that care most about inline variant performance. If we think that represents most users it may be worth it.

Benchmarks

  • inline clones 4.1ns -> 0.8ns 🙂
  • heap clones 2.8ns -> 5.8ns 🙁
group                      master                  pr
-----                      ------                  --
SmolStr::clone len=12      4.86      4.1±0.00ns    1.00      0.8±0.00ns
SmolStr::clone len=50      1.00      2.8±0.02ns    2.05      5.8±0.01ns
SmolStr::clone len=1000    1.00      2.8±0.01ns    2.04      5.8±0.02ns

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 11, 2025
@alexheretic
Copy link
Contributor Author

cc @Veykril wdyt?

@ChayimFriedman2
Copy link
Contributor

That's interesting. Our most of our SmolStrs inline or not? Can you check that, e.g. by adding a static counter in the constructor?

@Veykril
Copy link
Member

Veykril commented Nov 20, 2025

Our main use of this should be in rowan which I think uses one of these per Token right? I'd imagine the inline variant would be more beneficial for us generally? But agree, we should see the impact this has for us for the time being

@ChayimFriedman2
Copy link
Contributor

An optimization I thought about in the past for rowan is to store the whole file content in one buffer, then refer to it. But this will wait until we get rid of mutable syntax trees.

@rustbot rustbot removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 20, 2025
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Nov 20, 2025
@Veykril
Copy link
Member

Veykril commented Nov 20, 2025

Yea generally I don't mind this optimization I think, but only if it doesn't affect us much while we are still depending on it via rowan. Though alternatively we can just not bump the dep I guess

@alexheretic
Copy link
Contributor Author

From my pov using this lib externally I think favouring inline perf this way makes sense. Heap perf is still decent & O(1). One could perhaps even argue that clone perf for small SmolStrs being slower than larger ones was a surprise which this addresses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants