Description
This proposal makes a distinction between the in-memory storage of a value and the storage of a value according to information theory.
In some cases, these are the same thing, such as with a u8 type. In other cases they can be different for various reasons, such as endianness or aggregate padding.
The idea here is that these two operations would both be well-defined, but would produce different results; they would not be defined in terms of each other:
byte casting:
fn byteCast(a: u32) f32 {
    // Parameters are immutable, so &a is a *const u32; the cast must keep the const.
    return @ptrCast(*const f32, &a).*;
}
bit casting:
fn bitCast(a: u32) f32 {
    return @bitCast(f32, a);
}
Byte casting is done via pointer reinterpretation, or via extern union field aliasing. Bit casting is done via the @bitCast language primitive. They are both well defined but distinct:
- byte casting - reinterprets memory that is stored in contiguous bytes from one type to another. This is affected by padding and endianness.
- bit casting - reinterprets bits according to information theory. Regardless of padding, alignment, or endianness, if the number of bits it takes to represent one type matches that of another, the value can be bit cast from one to the other.
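The extern union route mentioned above is not shown in the snippets, so here is a minimal sketch of it next to @bitCast. The names Alias and byteCastViaUnion are hypothetical, and the code keeps the two-argument builtin syntax used throughout this proposal (newer Zig releases spell the same cast as @as(f32, @bitCast(a))). For u32 and f32 the two casts are expected to agree, since both types occupy four bytes with no padding.

```zig
const std = @import("std");
const expect = std.testing.expect;

// Hypothetical helper type: both fields alias the same four bytes.
const Alias = extern union {
    int: u32,
    float: f32,
};

// Byte cast via extern union field aliasing.
fn byteCastViaUnion(a: u32) f32 {
    const u = Alias{ .int = a };
    return u.float;
}

// Bit cast via the language primitive.
fn bitCast(a: u32) f32 {
    return @bitCast(f32, a);
}

test "byte cast and bit cast agree for u32 -> f32" {
    const bits: u32 = 0x3f800000; // IEEE-754 single-precision encoding of 1.0
    try expect(byteCastViaUnion(bits) == 1.0);
    try expect(bitCast(bits) == 1.0);
}
```

The two operations only diverge once padding or byte order enters the picture, which is the case the rest of the proposal addresses.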
Byte casting is easy to understand; you can almost implement it by accident. Here we focus on bit casting and what it means. First, some prerequisites:
- @bitSizeOf vs @sizeOf - @sizeOf corresponds to bytes and takes padding into account. As an example, @sizeOf(u24) == 4. Meanwhile, @bitSizeOf corresponds to information theory and ignores padding. In this example, @bitSizeOf(u24) == 24. The bit size of a struct, regardless of whether it is packed, extern, or neither, is the sum of @bitSizeOf for each field.
- @bitOffsetOf vs @byteOffsetOf - for bytes, the offset is the difference in memory address between a field and the base pointer. For bits, it is the number of lower bits that precede the field in a hypothetical integer whose bit count equals the @bitSizeOf of the aggregate.
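As a rough illustration of these prerequisites, the sketch below checks the quoted u24 figures and the bit offsets of a small aggregate. Flags is a hypothetical type; it is declared packed because that is the one struct kind whose bit layout is already well defined, whereas the proposal would extend the sum-of-fields rule to every struct. The @sizeOf(u24) figure is the one given in the text and is assumed to hold on common targets such as x86_64.

```zig
const std = @import("std");
const expect = std.testing.expect;

// Hypothetical example type: 3 + 5 + 8 = 16 bits of information.
const Flags = packed struct {
    mode: u3,
    level: u5,
    tag: u8,
};

test "bit size versus byte size, and bit offsets" {
    // Information content versus in-memory footprint.
    try expect(@bitSizeOf(u24) == 24);
    try expect(@sizeOf(u24) == 4); // assumed target: padded up to 4 bytes

    // The bit size of the aggregate is the sum of its fields' bit sizes.
    try expect(@bitSizeOf(Flags) == @bitSizeOf(u3) + @bitSizeOf(u5) + @bitSizeOf(u8));

    // Each bit offset counts the lower bits that precede the field.
    try expect(@bitOffsetOf(Flags, "mode") == 0);
    try expect(@bitOffsetOf(Flags, "level") == 3);
    try expect(@bitOffsetOf(Flags, "tag") == 8);
}
```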
With this proposal, each type, regardless of whether it has a well-defined memory layout (a notion that applies only to bytes), has a hypothetical integer with a number of bits equal to the @bitSizeOf that type. We call this integer the type's fundamental int. @bitCast is defined as follows:
- convert from the source type to its fundamental int.
- convert from the fundamental int to the destination type.
Attempting to @bitCast between two types that have differing @bitSizeOf values is a compile error. Note that one can obtain the fundamental int for a type by bit casting the value to an unsigned integer.
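The two-step definition can be sketched with types whose bit layout is already well defined today. Pair is a hypothetical packed struct, and the code uses the two-argument @bitCast syntax of this proposal (newer Zig releases write @as(u8, @bitCast(p)) instead). It is an illustration under those assumptions, not the proposed implementation.

```zig
const std = @import("std");
const expect = std.testing.expect;

// Hypothetical example type: two 4-bit fields, so the fundamental int is a u8.
const Pair = packed struct {
    low: u4,
    high: u4,
};

test "bit casting through the fundamental int" {
    const p = Pair{ .low = 0x1, .high = 0x2 };

    // Step 1: source type -> fundamental int (an unsigned integer of equal bit size).
    const fundamental = @bitCast(u8, p);
    // `low` has no lower bits preceding it, so it lands in the low nibble.
    try expect(fundamental == 0x21);

    // Step 2: fundamental int -> destination type.
    const signed = @bitCast(i8, fundamental);
    try expect(signed == 0x21);

    // The same route yields the fundamental int of a float.
    const one: f32 = 1.0;
    try expect(@bitCast(u32, one) == 0x3f800000);

    // Differing bit sizes are rejected at compile time, for example:
    // const bad = @bitCast(u16, p);
}
```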
The motivation for this proposal is:
- To complete the specification of how these bit related functions work.
- To make composing packed structs useful, and to make align(0) useful in general.
- To make it possible to optimize things such as ??u8, whose fundamental integer would be a u10.
- To have the protection of a type system but also allow Data-Oriented-Design tricks, storing information in compact ways.
With this proposal, one would be able to convert between structs, even though they have no well-defined byte representation, like this:
const std = @import("std");
const expect = std.testing.expect;
const S = struct {
    name: []const u8,
    ok: ?bool,
};
const Other = struct {
    name_ptr: [*]const u8,
    name_len: usize,
    ok_present: u1,
    ok_flag: u1,
};
test "example" {
    var s = S{
        .name = "hello",
        .ok = true,
    };
    // Relies on the proposed semantics: S and Other have the same @bitSizeOf.
    var other = @bitCast(Other, s);
    try expect(std.mem.eql(u8, other.name_ptr[0..other.name_len], "hello"));
    try expect(other.ok_present == 1);
    try expect(other.ok_flag == 1);
}