Sema: Improve comptime arithmetic undef handling #24674

Justus2308 · 2025-08-03T12:46:52Z

This commit expands on the foundations laid by #23177
and moves even more Sema-only functionality from Value
to Sema.arith. Specifically all shift and bitwise operations,
@truncate, @bitReverse and @byteSwap have been moved and
adapted to the new rules around undefined.

Especially the comptime shift operations have been basically
rewritten, fixing many open issues in the process.

New rules applied to operators:

<<, @shlExact, @shlWithOverflow, >>, @shrExact: compile error if any operand is undef
<<|, ~, ^, @truncate, @bitReverse, @byteSwap: return undef if any operand is undef
&, |: Return undef if both operands are undef, turn undef into actual 0xAA bytes otherwise

Additionally this commit canonicalizes the representation of
aggregates with all-undefined members in the InternPool by
disallowing them and enforcing the usage of a single typed
undef value instead. This reduces the amount of edge cases
and fixes a bunch of bugs related to partially undefined vecs.

List of operations directly affected by this patch:

<<, <<|, @shlExact, @shlWithOverflow
>>, @shrExact
&, |, ~, ^ and their atomic rmw + reduce pendants
@truncate, @bitReverse, @byteSwap

Resolves #16466 *
Resolves #21266
Resolves #21943
Resolves #23034
Resolves #24392

* (was already kind of resolved before this patch I think, but addresses the comment)

Most of the LOC in this patch are test cases, but if it's too big to review please let me know and I can try to split it into 2-3 smaller commits

This commit expands on the foundations laid by ziglang#23177 and moves even more `Sema`-only functionality from `Value` to `Sema.arith`. Specifically all shift and bitwise operations, `@truncate`, `@bitReverse` and `@byteSwap` have been moved and adapted to the new rules around `undefined`. Especially the comptime shift operations have been basically rewritten, fixing many open issues in the process. New rules applied to operators: * `<<`, `@shlExact`, `@shlWithOverflow`, `@shrExact`: compile error if any operator is undef * `<<|`, `>>`, all other affected ops: return undef if any operator is undef Additionally this commit canonicalizes the representation of aggregates with all-undefined members in the `InternPool` by disallowing them and enforcing the usage of a single typed `undef` value instead. This reduces the amount of edge cases and fixes a bunch of bugs related to partially undefined vecs. List of operations directly affected by this patch: * `<<`, `<<|`, `@shlExact`, `@shlWithOverflow` * `>>`, `@shrExact` * `&`, `|`, `~`, `^` and their atomic rmw pendants * `@truncate`, `@bitReverse`, `@byteSwap`

mlugg · 2025-08-06T22:46:51Z

Just based on the PR description: what's the rationale behind treating << and >> differently? The way I see it, both can exhibit IB, so both should have the "compile error on undef" rule. (If I said to do this elsewhere, which I cannot rule out because I say silly things sometimes, I... probably disagree with myself? :P)

Justus2308 · 2025-08-06T23:30:23Z

Oh I think you have a point - I for some reason only paid attention to the bits being shifted out, not the ones being shifted in, but if my mental model is correct this could also cause IB:

var x: u7 = 0;
x >>= 1; // bit #7 is undefined/not part of x, so could shift *in* a 1!

mlugg

This looks like great work, thank you! I have a few things I'd like you to look over, but this shouldn't be too far from mergeable.

src/InternPool.zig

src/Sema.zig

src/Sema/arith.zig

mlugg · 2025-08-06T23:55:55Z

src/Sema/arith.zig

+}
+/// Given an integer or boolean type, creates an value of that with the bit pattern 0xAA.
+/// This is used to convert undef values into 0xAA when performing e.g. bitwise operations.
+/// TODO: Eliminate this function and everything it stands for (related: #19634).


good comment

mlugg · 2025-08-06T23:59:26Z

test/cases/compile_errors/undef_arith_is_illegal.zig

 // :71:17: error: use of undefined value here causes illegal behavior
-// :71:17: error: use of undefined value here causes illegal behavior


It looks like a bunch of errors have disappeared from this test. Do you know why?

Good question - I checked and there are only 4256 errors now compared to 6688 before. Is there maybe some limit on the max line number somewhere? My patch does add lots of error notes for vector indices so maybe some limit is now hit earlier than before?
I'll investigate further

mlugg · 2025-08-07T00:05:18Z

test/cases/compile_errors/undef_arith_returns_undef.zig

 // @as(u8, undefined)
 // @as(u8, undefined)
 // @as(@Vector(2, u8), [runtime value])
 // @as(@Vector(2, u8), [runtime value])
+// @as(@Vector(2, u8), undefined)


Ahhh, these changes worried me at first, but actually, this is great! Your undef canonicalization has made .{ undefined, undefined } +% .{ runtime, runtime } return comptime-known undef. Lovely work!

Justus2308 · 2025-08-07T00:17:27Z

Thank you for the review! I've added the right shift compile error and a couple of test for now, I'll have a look at all of the other comments tomorrow

mlugg · 2025-08-07T01:26:52Z

Oh I think you have a point - I for some reason only paid attention to the bits being shifted out, not the ones being shifted in, but if my mental model is correct this could also cause IB:
var x: u7 = 0;
x >>= 1; // bit #7 is undefined/not part of x, so could shift *in* a 1!

I think you're misunderstanding where the IB comes from here. Shifting any well-defined u7 by 1 is legal, in either direction. The IB in << and >> comes from the fact that for a N-bit integer, it is only allowed to shift it by less than N bits. This is enforced by the type system to the fullest extent possible -- e.g. if the LHS is a u8, the RHS is forced to be a u3 -- but because we don't currently have #3806, the type system can't enforce it for weird LHS types like u7, so my_u7 << runtime_known(7) is Illegal Behavior.

Justus2308 · 2025-08-07T11:39:54Z

I left the aggregate interns that use byte storage and the ones in the type info logic alone for now. There are also about 100 aggregate interns left outside of Sema I'll deal with those later.

Justus2308 · 2025-08-07T19:27:29Z

Benchmarks performed on a MacBook Pro M1 (sadly no poop)

$ hyperfine --prepare 'rm -rf .zig-cache ~/.cache/zig' 'build/release-master/bin/zig test --test-no-exec -femit-bin=t --zig-lib-dir lib lib/std/std.zig' 'build/release-feature/bin/zig test --test-no-exec -femit-bin=t --zig-lib-dir lib lib/std/std.zig'

master:

  Time (mean ± σ):     26.002 s ±  0.424 s    [User: 25.186 s, System: 1.369 s]
  Range (min … max):   25.744 s … 27.158 s    10 runs

  Time (mean ± σ):     25.932 s ±  0.213 s    [User: 25.162 s, System: 1.367 s]
  Range (min … max):   25.733 s … 26.463 s    10 runs

my branch:

  Time (mean ± σ):     25.796 s ±  0.194 s    [User: 25.016 s, System: 1.372 s]
  Range (min … max):   25.570 s … 26.301 s    10 runs

  Time (mean ± σ):     25.697 s ±  0.161 s    [User: 24.986 s, System: 1.337 s]
  Range (min … max):   25.459 s … 26.072 s    10 runs

So no performance impact or even slightly faster :)

Summary
  build/release-feature/bin/zig test --test-no-exec -femit-bin=t --zig-lib-dir lib lib/std/std.zig ran
    1.01 ± 0.01 times faster than build/release-master/bin/zig test --test-no-exec -femit-bin=t --zig-lib-dir lib lib/std/std.zig

Justus2308 · 2025-08-07T19:29:18Z

I haven't been able to find the cause for the test errors that disaperared yet. I checked all call sites of failWithUseOfUndef and couldn't find any obvious mistakes. Removing the error note for vectors also doesn't make a difference.
I suspect that it has something to do with isUndef now effectively being more rigorous because it catches vectors full of undef elems which it didn't do before but I'm not sure yet.

Justus2308 · 2025-08-07T19:34:41Z

The only remaining usages of pt.intern(.{ .aggregate = ... that don't use bytes storage are in the Sema typeInfo logic ~~and in x86_64 CodeGen~~. I don't think it makes sense to replace the usages in typeInfo since it only deals with data that's supplied by InternPool and not directly by the user so it's literally impossible to get any undefined values there. ~~And I didn't dare to touch the x86_64 stuff yet because I have no idea how it works :)~~

Justus2308 · 2025-08-07T21:48:36Z

Benchmarks for x86_64 backend:

$ hyperfine --prepare 'rm -rf .zig-cache ~/.cache/zig t' 'build/release-master/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig' 'build/release-feature/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig'

Benchmark 1: build/release-master/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig
  Time (mean ± σ):      9.502 s ±  0.219 s    [User: 14.115 s, System: 2.558 s]
  Range (min … max):    9.196 s …  9.851 s    10 runs

Benchmark 2: build/release-feature/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig
  Time (mean ± σ):      9.252 s ±  0.372 s    [User: 13.873 s, System: 2.437 s]
  Range (min … max):    8.784 s …  9.864 s    10 runs

Summary
  build/release-feature/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig ran
    1.03 ± 0.05 times faster than build/release-master/bin/zig test --test-no-exec -femit-bin=t -fno-llvm -target x86_64-linux-musl --zig-lib-dir lib lib/std/std.zig

Wow I wasn't aware that the x86_64 backend is that much faster than llvm (unless I did something wrong?) can't wait for the aarch64 backend to be ready!

Justus2308 mentioned this pull request Aug 3, 2025

Fix Sema unreachable when shfting by vector containing undef elems #24396

Closed

alexrp requested a review from mlugg August 3, 2025 12:49

Justus2308 force-pushed the undef-shift-bitwise branch from 4d42e8e to 277ad41 Compare August 3, 2025 13:13

Justus2308 force-pushed the undef-shift-bitwise branch from 277ad41 to 073f1d5 Compare August 4, 2025 08:36

mlugg requested changes Aug 7, 2025

View reviewed changes

make >> a compile error with any undef arg ; add a bunch of test cases

66b6a55

Justus2308 added 2 commits August 7, 2025 13:11

address most comments

d9f7739

Sema: replace most aggregate interns with pt.aggregateValue

5351e61

replace even more aggregate interns

4e173b3

replace most aggregate interns in x86_64/CodeGen

64f8a57

		// :71:17: error: use of undefined value here causes illegal behavior
		// :71:17: error: use of undefined value here causes illegal behavior

Uh oh!

Sema: Improve comptime arithmetic undef handling #24674

Are you sure you want to change the base?

Sema: Improve comptime arithmetic undef handling #24674

Conversation

Justus2308 commented Aug 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mlugg commented Aug 6, 2025

Uh oh!

Justus2308 commented Aug 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mlugg left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mlugg Aug 6, 2025

Choose a reason for hiding this comment

Uh oh!

mlugg Aug 6, 2025

Choose a reason for hiding this comment

Uh oh!

Justus2308 Aug 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mlugg Aug 7, 2025

Choose a reason for hiding this comment

Uh oh!

Justus2308 commented Aug 7, 2025

Uh oh!

mlugg commented Aug 7, 2025

Uh oh!

Justus2308 commented Aug 7, 2025

Uh oh!

Justus2308 commented Aug 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Justus2308 commented Aug 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Justus2308 commented Aug 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Justus2308 commented Aug 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Justus2308 commented Aug 3, 2025 •

edited

Loading

Justus2308 commented Aug 6, 2025 •

edited

Loading

Justus2308 Aug 7, 2025 •

edited

Loading

Justus2308 commented Aug 7, 2025 •

edited

Loading

Justus2308 commented Aug 7, 2025 •

edited

Loading

Justus2308 commented Aug 7, 2025 •

edited

Loading

Justus2308 commented Aug 7, 2025 •

edited

Loading