Skip to content

Sema: Improve comptime arithmetic undef handling #24674

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

Justus2308
Copy link
Contributor

@Justus2308 Justus2308 commented Aug 3, 2025

This commit expands on the foundations laid by #23177
and moves even more Sema-only functionality from Value
to Sema.arith. Specifically all shift and bitwise operations,
@truncate, @bitReverse and @byteSwap have been moved and
adapted to the new rules around undefined.

Especially the comptime shift operations have been basically
rewritten, fixing many open issues in the process.

New rules applied to operators:

  • <<, @shlExact, @shlWithOverflow, >>, @shrExact: compile error if any operand is undef
  • <<|, ~, ^, @truncate, @bitReverse, @byteSwap: return undef if any operand is undef
  • &, |: Return undef if both operands are undef, turn undef into actual 0xAA bytes otherwise

Additionally this commit canonicalizes the representation of
aggregates with all-undefined members in the InternPool by
disallowing them and enforcing the usage of a single typed
undef value instead. This reduces the amount of edge cases
and fixes a bunch of bugs related to partially undefined vecs.

List of operations directly affected by this patch:

  • <<, <<|, @shlExact, @shlWithOverflow
  • >>, @shrExact
  • &, |, ~, ^ and their atomic rmw + reduce pendants
  • @truncate, @bitReverse, @byteSwap

Resolves #16466 *
Resolves #21266
Resolves #21943
Resolves #23034
Resolves #24392

* (was already kind of resolved before this patch I think, but addresses the comment)

Most of the LOC in this patch are test cases, but if it's too big to review please let me know and I can try to split it into 2-3 smaller commits

This commit expands on the foundations laid by ziglang#23177
and moves even more `Sema`-only functionality from `Value`
to `Sema.arith`. Specifically all shift and bitwise operations,
`@truncate`, `@bitReverse` and `@byteSwap` have been moved and
adapted to the new rules around `undefined`.

Especially the comptime shift operations have been basically
rewritten, fixing many open issues in the process.

New rules applied to operators:
* `<<`, `@shlExact`, `@shlWithOverflow`, `@shrExact`: compile error if any operator is undef
* `<<|`, `>>`, all other affected ops: return undef if any operator is undef

Additionally this commit canonicalizes the representation of
aggregates with all-undefined members in the `InternPool` by
disallowing them and enforcing the usage of a single typed
`undef` value instead. This reduces the amount of edge cases
and fixes a bunch of bugs related to partially undefined vecs.

List of operations directly affected by this patch:
* `<<`, `<<|`, `@shlExact`, `@shlWithOverflow`
* `>>`, `@shrExact`
* `&`, `|`, `~`, `^` and their atomic rmw pendants
* `@truncate`, `@bitReverse`, `@byteSwap`
@Justus2308 Justus2308 force-pushed the undef-shift-bitwise branch from 277ad41 to 073f1d5 Compare August 4, 2025 08:36
@mlugg
Copy link
Member

mlugg commented Aug 6, 2025

Just based on the PR description: what's the rationale behind treating << and >> differently? The way I see it, both can exhibit IB, so both should have the "compile error on undef" rule. (If I said to do this elsewhere, which I cannot rule out because I say silly things sometimes, I... probably disagree with myself? :P)

@Justus2308
Copy link
Contributor Author

Justus2308 commented Aug 6, 2025

Oh I think you have a point - I for some reason only paid attention to the bits being shifted out, not the ones being shifted in, but if my mental model is correct this could also cause IB:

var x: u7 = 0;
x >>= 1; // bit #7 is undefined/not part of x, so could shift *in* a 1!

Copy link
Member

@mlugg mlugg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like great work, thank you! I have a few things I'd like you to look over, but this shouldn't be too far from mergeable.

}
/// Given an integer or boolean type, creates an value of that with the bit pattern 0xAA.
/// This is used to convert undef values into 0xAA when performing e.g. bitwise operations.
/// TODO: Eliminate this function and everything it stands for (related: #19634).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good comment

// :71:17: error: use of undefined value here causes illegal behavior
// :71:17: error: use of undefined value here causes illegal behavior
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like a bunch of errors have disappeared from this test. Do you know why?

Copy link
Contributor Author

@Justus2308 Justus2308 Aug 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question - I checked and there are only 4256 errors now compared to 6688 before. Is there maybe some limit on the max line number somewhere? My patch does add lots of error notes for vector indices so maybe some limit is now hit earlier than before?
I'll investigate further

// @as(u8, undefined)
// @as(u8, undefined)
// @as(@Vector(2, u8), [runtime value])
// @as(@Vector(2, u8), [runtime value])
// @as(@Vector(2, u8), undefined)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh, these changes worried me at first, but actually, this is great! Your undef canonicalization has made .{ undefined, undefined } +% .{ runtime, runtime } return comptime-known undef. Lovely work!

@Justus2308
Copy link
Contributor Author

Thank you for the review! I've added the right shift compile error and a couple of test for now, I'll have a look at all of the other comments tomorrow

@mlugg
Copy link
Member

mlugg commented Aug 7, 2025

Oh I think you have a point - I for some reason only paid attention to the bits being shifted out, not the ones being shifted in, but if my mental model is correct this could also cause IB:

var x: u7 = 0;
x >>= 1; // bit #7 is undefined/not part of x, so could shift *in* a 1!

I think you're misunderstanding where the IB comes from here. Shifting any well-defined u7 by 1 is legal, in either direction. The IB in << and >> comes from the fact that for a N-bit integer, it is only allowed to shift it by less than N bits. This is enforced by the type system to the fullest extent possible -- e.g. if the LHS is a u8, the RHS is forced to be a u3 -- but because we don't currently have #3806, the type system can't enforce it for weird LHS types like u7, so my_u7 << runtime_known(7) is Illegal Behavior.

@Justus2308
Copy link
Contributor Author

I left the aggregate interns that use byte storage and the ones in the type info logic alone for now. There are also about 100 aggregate interns left outside of Sema I'll deal with those later.

@Justus2308
Copy link
Contributor Author

Justus2308 commented Aug 7, 2025

Benchmarks performed on a MacBook Pro M1 (sadly no poop)

$ hyperfine --prepare 'rm -rf .zig-cache ~/.cache/zig' 'build/release-master/bin/zig test --test-no-exec -femit-bin=t --zig-lib-dir lib lib/std/std.zig' 'build/release-feature/bin/zig test --test-no-exec -femit-bin=t --zig-lib-dir lib lib/std/std.zig'

master:

  Time (mean ± σ):     26.002 s ±  0.424 s    [User: 25.186 s, System: 1.369 s]
  Range (min … max):   25.744 s … 27.158 s    10 runs
  Time (mean ± σ):     25.932 s ±  0.213 s    [User: 25.162 s, System: 1.367 s]
  Range (min … max):   25.733 s … 26.463 s    10 runs

my branch:

  Time (mean ± σ):     25.796 s ±  0.194 s    [User: 25.016 s, System: 1.372 s]
  Range (min … max):   25.570 s … 26.301 s    10 runs
  Time (mean ± σ):     25.697 s ±  0.161 s    [User: 24.986 s, System: 1.337 s]
  Range (min … max):   25.459 s … 26.072 s    10 runs

So no performance impact or even slightly faster :)

Summary
  build/release-feature/bin/zig test --test-no-exec -femit-bin=t --zig-lib-dir lib lib/std/std.zig ran
    1.01 ± 0.01 times faster than build/release-master/bin/zig test --test-no-exec -femit-bin=t --zig-lib-dir lib lib/std/std.zig

@Justus2308
Copy link
Contributor Author

Justus2308 commented Aug 7, 2025

I haven't been able to find the cause for the test errors that disaperared yet. I checked all call sites of failWithUseOfUndef and couldn't find any obvious mistakes. Removing the error note for vectors also doesn't make a difference.
I suspect that it has something to do with isUndef now effectively being more rigorous because it catches vectors full of undef elems which it didn't do before but I'm not sure yet.

@Justus2308
Copy link
Contributor Author

Justus2308 commented Aug 7, 2025

The only remaining usages of pt.intern(.{ .aggregate = ... that don't use bytes storage are in the Sema typeInfo logic and in x86_64 CodeGen. I don't think it makes sense to replace the usages in typeInfo since it only deals with data that's supplied by InternPool and not directly by the user so it's literally impossible to get any undefined values there. And I didn't dare to touch the x86_64 stuff yet because I have no idea how it works :)

@Justus2308
Copy link
Contributor Author

Justus2308 commented Aug 7, 2025

Benchmarks for x86_64 backend:

$ hyperfine --prepare 'rm -rf .zig-cache ~/.cache/zig t' 'build/release-master/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig' 'build/release-feature/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig'
Benchmark 1: build/release-master/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig
  Time (mean ± σ):      9.502 s ±  0.219 s    [User: 14.115 s, System: 2.558 s]
  Range (min … max):    9.196 s …  9.851 s    10 runs

Benchmark 2: build/release-feature/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig
  Time (mean ± σ):      9.252 s ±  0.372 s    [User: 13.873 s, System: 2.437 s]
  Range (min … max):    8.784 s …  9.864 s    10 runs

Summary
  build/release-feature/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig ran
    1.03 ± 0.05 times faster than build/release-master/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig

Wow I wasn't aware that the x86_64 backend is that much faster than llvm (unless I did something wrong?) can't wait for the aarch64 backend to be ready!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants