From c995e11385136eb18a445254b4e91f05a4c98fc3 Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:33:41 -0400 Subject: [PATCH 01/15] Non-nullable is common lingo, isn't it? --- src/doc/tarpl/repr-rust.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index 639d64adc18b8..974b5cf2ef96a 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -141,7 +141,7 @@ such a representation is inefficient. The classic case of this is Rust's by using null as a special value. The net result is that `size_of::<Option<&T>>() == size_of::<&T>()` -There are many types in Rust that are, or contain, "not null" pointers such as +There are many types in Rust that are, or contain, non-nullable pointers such as `Box<T>`, `Vec<T>`, `String`, `&T`, and `&mut T`. Similarly, one can imagine nested enums pooling their tags into a single discriminant, as they are by definition known to have a limited range of valid values. In principle enums can From 345b34f7270bce4f3a1f80f0d4f0cb64b054be3e Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:34:10 -0400 Subject: [PATCH 02/15] Could means "in future", can is ambiguous. --- src/doc/tarpl/repr-rust.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index 974b5cf2ef96a..e267e306fc5ed 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -144,7 +144,7 @@ by using null as a special value. The net result is that There are many types in Rust that are, or contain, non-nullable pointers such as `Box<T>`, `Vec<T>`, `String`, `&T`, and `&mut T`. Similarly, one can imagine nested enums pooling their tags into a single discriminant, as they are by -definition known to have a limited range of valid values. In principle enums can +definition known to have a limited range of valid values.
In principle enums could use fairly elaborate algorithms to cache bits throughout nested types with special constrained representations. As such it is *especially* desirable that we leave enum layout unspecified today. From db42f4fcd064e7a680dceffdc0b10eff1c932449 Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:37:26 -0400 Subject: [PATCH 03/15] Make null pointer optimization bit more concrete. Saying it stores a discriminant bit inside the pointer is wrong, for the notion of inside that most people will assume (e.g.: which bit of the pointer is used?). And I may be totally totally wrong here but I was under the impression that it's a very narrow optimization that doesn't apply to anything beyond a pointer + a unit. Saying so explicitly seems much clearer to me and prevents wild imaginings about enums with 4 pointer variants and a bool variant or whatever. --- src/doc/tarpl/repr-rust.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index e267e306fc5ed..741f37b7caf6e 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -133,13 +133,15 @@ struct FooRepr { } ``` -And indeed this is approximately how it would be laid out in general -(modulo the size and position of `tag`). However there are several cases where -such a representation is inefficient. The classic case of this is Rust's -"null pointer optimization". Given a pointer that is known to not be null -(e.g. `&u32`), an enum can *store* a discriminant bit *inside* the pointer -by using null as a special value. The net result is that -`size_of::<Option<&T>>() == size_of::<&T>()` +And indeed this is approximately how it would be laid out in general (modulo the +size and position of `tag`). + +However there are several cases where such a representation is inefficient.
The +classic case of this is Rust's "null pointer optimization": an enum consisting +of a unit variant and a non-nullable pointer variant (e.g. `&u32`) makes the tag +unnecessary, because a null pointer value can safely be interpreted to mean that +the unit variant is chosen instead. The net result is that, for example, +`size_of::<Option<&T>>() == size_of::<&T>()`. There are many types in Rust that are, or contain, non-nullable pointers such as `Box<T>`, `Vec<T>`, `String`, `&T`, and `&mut T`. Similarly, one can imagine From a7f4b80cd4c3043615b45df69843cadb627e7162 Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:39:18 -0400 Subject: [PATCH 04/15] !contiguous. The prior example was explicitly *not* contiguous (e.g. the values all adjacent in memory), so that's a confusing way of putting it. --- src/doc/tarpl/repr-rust.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index 741f37b7caf6e..e9c2dfb9e5035 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -50,10 +50,10 @@ struct A { } ``` -There is *no indirection* for these types; all data is stored contiguously as -you would expect in C. However with the exception of arrays (which are densely -packed and in-order), the layout of data is not by default specified in Rust. -Given the two following struct definitions: +There is *no indirection* for these types; all data is stored within the struct, +as you would expect in C. However with the exception of arrays (which are +densely packed and in-order), the layout of data is not by default specified in +Rust. Given the two following struct definitions: ```rust struct A { From 728545118b2f598047f730a7a282d3b6a561e7e0 Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:42:02 -0400 Subject: [PATCH 05/15] Phrasing. Assuming -- but who is assuming, and who is aligning? The architecture, so clarified that a bit.
Also avoid run on adjectives, which read awkwardly. --- src/doc/tarpl/repr-rust.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index e9c2dfb9e5035..7e9ce4354061a 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -36,9 +36,9 @@ struct A { } ``` -will be 32-bit aligned assuming these primitives are aligned to their size. -It will therefore have a size that is a multiple of 32-bits. It will potentially -*really* become: +will be 32-bit aligned on an architecture that aligns these primitives to their +respective sizes. The whole struct will therefore have a size that is a multiple +of 32-bits. It will potentially become: ```rust struct A { From 13bd820de148a58a9222e56fa9bfb72f4df4a27b Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:43:09 -0400 Subject: [PATCH 06/15] Rename B.x back to B.a. Reduces the number of things going on to focus on what matters, which is presumably that the two identical structs A and B aren't guaranteed to have the same layout. --- src/doc/tarpl/repr-rust.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index 7e9ce4354061a..7cd97765ee6a1 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -62,7 +62,7 @@ struct A { } struct B { - x: i32, + a: i32, b: u64, } ``` From a1ebfccf002fcd62c9898dcce1bda48e5d34baae Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:45:59 -0400 Subject: [PATCH 07/15] Removed nonsensical phrase. If you read this sentence carefully, the last phrase doesn't make sense, because "not being guaranteed" is not a reason why they explicitly wouldn't. Unless the goal of the compiler is to be maximally confusing.
Also, this example isn't nonsensical, it's just pedantic (because the layouts of A and B *are* the same, though they're not guaranteed to be), but it leads to this more important point about layout flexibility. Saying nonsensical makes it sound like the text itself is nonsensical. --- src/doc/tarpl/repr-rust.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index 7cd97765ee6a1..a6fc49ca44f2d 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -68,12 +68,11 @@ struct B { ``` Rust *does* guarantee that two instances of A have their data laid out in -exactly the same way. However Rust *does not* guarantee that an instance of A -has the same field ordering or padding as an instance of B (in practice there's -no *particular* reason why they wouldn't, other than that its not currently -guaranteed). +exactly the same way. However Rust *does not* currently guarantee that an +instance of A has the same field ordering or padding as an instance of B, though +in practice there's no reason why they wouldn't. -With A and B as written, this is basically nonsensical, but several other +With A and B as written, this point would seem to be pedantic, but several other features of Rust make it desirable for the language to play with data layout in complex ways. From f4168510c8e99d8791f1b25c9830dca221208337 Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:33:41 -0400 Subject: [PATCH 08/15] Non-nullable is common lingo, isn't it? --- src/doc/tarpl/repr-rust.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index c8a372be7678b..9154064c5c429 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -141,7 +141,7 @@ such a representation is inefficient. The classic case of this is Rust's by using null as a special value.
The net result is that `size_of::<Option<&T>>() == size_of::<&T>()` -There are many types in Rust that are, or contain, "not null" pointers such as +There are many types in Rust that are, or contain, non-nullable pointers such as `Box<T>`, `Vec<T>`, `String`, `&T`, and `&mut T`. Similarly, one can imagine nested enums pooling their tags into a single discriminant, as they are by definition known to have a limited range of valid values. In principle enums can From 176bdd5e1f87692e68098e66e28a5a9b8b82debc Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:34:10 -0400 Subject: [PATCH 09/15] Could means "in future", can is ambiguous. --- src/doc/tarpl/repr-rust.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index 9154064c5c429..f1058ed9e38c4 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -144,7 +144,7 @@ by using null as a special value. The net result is that There are many types in Rust that are, or contain, non-nullable pointers such as `Box<T>`, `Vec<T>`, `String`, `&T`, and `&mut T`. Similarly, one can imagine nested enums pooling their tags into a single discriminant, as they are by -definition known to have a limited range of valid values. In principle enums can +definition known to have a limited range of valid values. In principle enums could use fairly elaborate algorithms to cache bits throughout nested types with special constrained representations. As such it is *especially* desirable that we leave enum layout unspecified today. From 9e76e68c1d560aba1792894838fc15507d72377d Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:37:26 -0400 Subject: [PATCH 10/15] Make null pointer optimization bit more concrete. Saying it stores a discriminant bit inside the pointer is wrong, for the notion of inside that most people will assume (e.g.: which bit of the pointer is used?).
And I may be totally totally wrong here but I was under the impression that it's a very narrow optimization that doesn't apply to anything beyond a pointer + a unit. Saying so explicitly seems much clearer to me and prevents wild imaginings about enums with 4 pointer variants and a bool variant or whatever. --- src/doc/tarpl/repr-rust.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index f1058ed9e38c4..229dde528a9ec 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -133,13 +133,15 @@ struct FooRepr { } ``` -And indeed this is approximately how it would be laid out in general -(modulo the size and position of `tag`). However there are several cases where -such a representation is inefficient. The classic case of this is Rust's -"null pointer optimization". Given a pointer that is known to not be null -(e.g. `&u32`), an enum can *store* a discriminant bit *inside* the pointer -by using null as a special value. The net result is that -`size_of::<Option<&T>>() == size_of::<&T>()` +And indeed this is approximately how it would be laid out in general (modulo the +size and position of `tag`). + +However there are several cases where such a representation is inefficient. The +classic case of this is Rust's "null pointer optimization": an enum consisting +of a unit variant and a non-nullable pointer variant (e.g. `&u32`) makes the tag +unnecessary, because a null pointer value can safely be interpreted to mean that +the unit variant is chosen instead. The net result is that, for example, +`size_of::<Option<&T>>() == size_of::<&T>()`. There are many types in Rust that are, or contain, non-nullable pointers such as `Box<T>`, `Vec<T>`, `String`, `&T`, and `&mut T`. Similarly, one can imagine From a0dda5967c5490153234cd9369d15c5fd20ae43f Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:39:18 -0400 Subject: [PATCH 11/15] !contiguous.
The prior example was explicitly *not* contiguous (e.g. the values all adjacent in memory), so that's a confusing way of putting it. --- src/doc/tarpl/repr-rust.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index 229dde528a9ec..798fbb4883608 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -50,10 +50,10 @@ struct A { } ``` -There is *no indirection* for these types; all data is stored contiguously as -you would expect in C. However with the exception of arrays (which are densely -packed and in-order), the layout of data is not by default specified in Rust. -Given the two following struct definitions: +There is *no indirection* for these types; all data is stored within the struct, +as you would expect in C. However with the exception of arrays (which are +densely packed and in-order), the layout of data is not by default specified in +Rust. Given the two following struct definitions: ```rust struct A { From 735028e885eac616456a4289f3cc7f93dd99d991 Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:42:02 -0400 Subject: [PATCH 12/15] Phrasing. Assuming -- but who is assuming, and who is aligning? The architecture, so clarified that a bit. Also avoid run on adjectives, which read awkwardly. --- src/doc/tarpl/repr-rust.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index 798fbb4883608..2470fe47f5db9 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -36,9 +36,9 @@ struct A { } ``` -will be 32-bit aligned assuming these primitives are aligned to their size. -It will therefore have a size that is a multiple of 32-bits. It will potentially -*really* become: +will be 32-bit aligned on an architecture that aligns these primitives to their +respective sizes.
The whole struct will therefore have a size that is a multiple +of 32-bits. It will potentially become: ```rust struct A { From afc7b174285dd7e88372d3043d87f279abbb4b26 Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:43:09 -0400 Subject: [PATCH 13/15] Rename B.x back to B.a. Reduces the number of things going on to focus on what matters, which is presumably that the two identical structs A and B aren't guaranteed to have the same layout. --- src/doc/tarpl/repr-rust.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index 2470fe47f5db9..b6a7cd36c1ebe 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -62,7 +62,7 @@ struct A { } struct B { - x: i32, + a: i32, b: u64, } ``` From 00ab6f66e54ca9aafe4984c48a86c82f46b8d601 Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Fri, 31 Jul 2015 00:45:59 -0400 Subject: [PATCH 14/15] Removed nonsensical phrase. If you read this sentence carefully, the last phrase doesn't make sense, because "not being guaranteed" is not a reason why they explicitly wouldn't. Unless the goal of the compiler is to be maximally confusing. Also, this example isn't nonsensical, it's just pedantic (because the layouts of A and B *are* the same, though they're not guaranteed to be), but it leads to this more important point about layout flexibility. Saying nonsensical makes it sound like the text itself is nonsensical. --- src/doc/tarpl/repr-rust.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index b6a7cd36c1ebe..c399cb816a8df 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -68,12 +68,11 @@ struct B { ``` Rust *does* guarantee that two instances of A have their data laid out in -exactly the same way.
However Rust *does not* guarantee that an instance of A -has the same field ordering or padding as an instance of B (in practice there's -no particular reason why they wouldn't, other than that its not currently -guaranteed). +exactly the same way. However Rust *does not* currently guarantee that an +instance of A has the same field ordering or padding as an instance of B, though +in practice there's no reason why they wouldn't. -With A and B as written, this is basically nonsensical, but several other +With A and B as written, this point would seem to be pedantic, but several other features of Rust make it desirable for the language to play with data layout in complex ways. From a0943de67445c27319f4f30f1fc8af24948d50a2 Mon Sep 17 00:00:00 2001 From: Taliesin Beynon Date: Mon, 3 Aug 2015 05:59:42 -0400 Subject: [PATCH 15/15] Clarify that null pointer optimization can burrow. --- src/doc/tarpl/repr-rust.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/doc/tarpl/repr-rust.md b/src/doc/tarpl/repr-rust.md index c399cb816a8df..a5c4f0c8767ad 100644 --- a/src/doc/tarpl/repr-rust.md +++ b/src/doc/tarpl/repr-rust.md @@ -137,10 +137,11 @@ size and position of `tag`). However there are several cases where such a representation is inefficient. The classic case of this is Rust's "null pointer optimization": an enum consisting -of a unit variant and a non-nullable pointer variant (e.g. `&u32`) makes the tag -unnecessary, because a null pointer value can safely be interpreted to mean that -the unit variant is chosen instead. The net result is that, for example, -`size_of::<Option<&T>>() == size_of::<&T>()`. +of a single outer unit variant (e.g. `None`) and a (potentially nested) non- +nullable pointer variant (e.g. `&T`) makes the tag unnecessary, because a null +pointer value can safely be interpreted to mean that the unit variant is chosen +instead. The net result is that, for example, `size_of::<Option<&T>>() == +size_of::<&T>()`.
There are many types in Rust that are, or contain, non-nullable pointers such as `Box<T>`, `Vec<T>`, `String`, `&T`, and `&mut T`. Similarly, one can imagine