From 62b3b40ade206830c502c394a9f547a300057ac0 Mon Sep 17 00:00:00 2001 From: Sandeep Datta Date: Thu, 11 Feb 2016 12:23:06 +0530 Subject: [PATCH 1/6] Clarified move semantics in "the details" section. --- src/doc/book/ownership.md | 34 +++++++++++++++++++++++++++------- 1 file changed, 27 insertions(+), 7 deletions(-) diff --git a/src/doc/book/ownership.md b/src/doc/book/ownership.md index a62d31d362b14..585e337ccc2e6 100644 --- a/src/doc/book/ownership.md +++ b/src/doc/book/ownership.md @@ -124,19 +124,39 @@ special annotation here, it’s the default thing that Rust does. The reason that we cannot use a binding after we’ve moved it is subtle, but important. When we write code like this: +```rust +let x = 10; +``` + +Rust allocates memory for an integer [i32] on the [stack][sh], copies the bit +pattern representing the value of 10 to the allocated memory and binds the +variable name x to this memory region for future reference. + +Now consider the following code fragment: + ```rust let v = vec![1, 2, 3]; let v2 = v; ``` -The first line allocates memory for the vector object, `v`, and for the data it -contains. The vector object is stored on the [stack][sh] and contains a pointer -to the content (`[1, 2, 3]`) stored on the [heap][sh]. When we move `v` to `v2`, -it creates a copy of that pointer, for `v2`. Which means that there would be two -pointers to the content of the vector on the heap. It would violate Rust’s -safety guarantees by introducing a data race. Therefore, Rust forbids using `v` -after we’ve done the move. +The first line allocates memory for the vector object, `v`, on the stack like +it does for `x` above. But in addition to that it also allocates some memory +on on the [heap][sh] for the actual data `[1, 2, 3]`. Rust copies the address +of this heap allocation to an internal pointer part of the vector object +placed on the stack (let's call it the data pointer). 
It is worth pointing out +even at the risk of being redundant that the vector object and its data live +in separate memory regions instead of being a single contiguous memory +allocation (due to reasons we will not go into at this point of time). + +When we move `v` to `v2`, rust actually does a bitwise copy of the vector +object `v` into the stack allocation represented by `v2`. This shallow copy +does not create a copy of the heap allocation containing the actual data. +Which means that there would be two pointers to the contents of the vector +both pointing to the same memory allocation on the heap. It would violate +Rust’s safety guarantees by introducing a data race if one could access both +`v` and `v2` at the same time. Therefore, Rust forbids using `v` after we’ve +done the move (shallow copy). [sh]: the-stack-and-the-heap.html From 50d179e0624b74a68982d7002a497a7a3403d360 Mon Sep 17 00:00:00 2001 From: Sandeep Datta Date: Thu, 11 Feb 2016 19:40:19 +0530 Subject: [PATCH 2/6] Explained the data race with an example. --- src/doc/book/ownership.md | 40 +++++++++++++++++++++++++++++---------- 1 file changed, 30 insertions(+), 10 deletions(-) diff --git a/src/doc/book/ownership.md b/src/doc/book/ownership.md index 585e337ccc2e6..5c603582bfc67 100644 --- a/src/doc/book/ownership.md +++ b/src/doc/book/ownership.md @@ -122,7 +122,9 @@ special annotation here, it’s the default thing that Rust does. ## The details The reason that we cannot use a binding after we’ve moved it is subtle, but -important. When we write code like this: +important. + +When we write code like this: ```rust let x = 10; @@ -140,14 +142,18 @@ let v = vec![1, 2, 3]; let v2 = v; ``` -The first line allocates memory for the vector object, `v`, on the stack like +The first line allocates memory for the vector object `v` on the stack like it does for `x` above. But in addition to that it also allocates some memory -on on the [heap][sh] for the actual data `[1, 2, 3]`. 
Rust copies the address -of this heap allocation to an internal pointer part of the vector object -placed on the stack (let's call it the data pointer). It is worth pointing out -even at the risk of being redundant that the vector object and its data live -in separate memory regions instead of being a single contiguous memory -allocation (due to reasons we will not go into at this point of time). +on the [heap][sh] for the actual data (`[1, 2, 3]`). Rust copies the address +of this heap allocation to an internal pointer, which is part of the vector +object placed on the stack (let's call it the data pointer). + +It is worth pointing out (even at the risk of repeating things) that the vector +object and its data live in separate memory regions instead of being a single +contiguous memory allocation (due to reasons we will not go into at this point +of time). These two parts of the vector (the one on the stack and one on the +heap) must agree with each other at all times with regards to things like the +length, capacity etc. When we move `v` to `v2`, rust actually does a bitwise copy of the vector object `v` into the stack allocation represented by `v2`. This shallow copy @@ -155,8 +161,22 @@ does not create a copy of the heap allocation containing the actual data. Which means that there would be two pointers to the contents of the vector both pointing to the same memory allocation on the heap. It would violate Rust’s safety guarantees by introducing a data race if one could access both -`v` and `v2` at the same time. Therefore, Rust forbids using `v` after we’ve -done the move (shallow copy). +`v` and `v2` at the same time. + +For example if we truncated the vector to just two elements through `v2`: + +```rust +v2.truncate(2); +``` + +and `v1` were still accessible we'd end up with an invalid vector since it +would not know that the heap data has been truncated. Now, the part of the +vector `v1` on the stack does not agree with its corresponding part on the +heap. 
`v1` still thinks there are three elements in the vector and will +happily let us access the non existent element `v1[2]` but as you might +already know this is a recipe for disaster. + +This is why Rust forbids using `v` after we’ve done the move. [sh]: the-stack-and-the-heap.html From a8fd1bbd2f9511e9394fed1112c4ada186eb1b00 Mon Sep 17 00:00:00 2001 From: Sandeep Datta Date: Thu, 11 Feb 2016 19:44:33 +0530 Subject: [PATCH 3/6] Minor change. --- src/doc/book/ownership.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/doc/book/ownership.md b/src/doc/book/ownership.md index 5c603582bfc67..63c0671e6595e 100644 --- a/src/doc/book/ownership.md +++ b/src/doc/book/ownership.md @@ -171,10 +171,10 @@ v2.truncate(2); and `v1` were still accessible we'd end up with an invalid vector since it would not know that the heap data has been truncated. Now, the part of the -vector `v1` on the stack does not agree with its corresponding part on the +vector `v1` on the stack does not agree with the corresponding part on the heap. `v1` still thinks there are three elements in the vector and will happily let us access the non existent element `v1[2]` but as you might -already know this is a recipe for disaster. +already know this is a recipe for disaster (might lead to a segfault). This is why Rust forbids using `v` after we’ve done the move. From a6fedc85bfec568a404c4a78d9c7faef12937695 Mon Sep 17 00:00:00 2001 From: Sandeep Datta Date: Thu, 11 Feb 2016 19:55:45 +0530 Subject: [PATCH 4/6] Minor change. --- src/doc/book/ownership.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/src/doc/book/ownership.md b/src/doc/book/ownership.md index 63c0671e6595e..cf6a43b1f1334 100644 --- a/src/doc/book/ownership.md +++ b/src/doc/book/ownership.md @@ -148,12 +148,12 @@ on the [heap][sh] for the actual data (`[1, 2, 3]`). 
Rust copies the address of this heap allocation to an internal pointer, which is part of the vector object placed on the stack (let's call it the data pointer). -It is worth pointing out (even at the risk of repeating things) that the vector -object and its data live in separate memory regions instead of being a single -contiguous memory allocation (due to reasons we will not go into at this point -of time). These two parts of the vector (the one on the stack and one on the -heap) must agree with each other at all times with regards to things like the -length, capacity etc. +It is worth pointing out (even at the risk of stating the obvious) that the +vector object and its data live in separate memory regions instead of being a +single contiguous memory allocation (due to reasons we will not go into at +this point of time). These two parts of the vector (the one on the stack and +one on the heap) must agree with each other at all times with regards to +things like the length, capacity etc. When we move `v` to `v2`, rust actually does a bitwise copy of the vector object `v` into the stack allocation represented by `v2`. This shallow copy @@ -169,7 +169,7 @@ For example if we truncated the vector to just two elements through `v2`: v2.truncate(2); ``` -and `v1` were still accessible we'd end up with an invalid vector since it +and `v1` were still accessible we'd end up with an invalid vector since `v1` would not know that the heap data has been truncated. Now, the part of the vector `v1` on the stack does not agree with the corresponding part on the heap. `v1` still thinks there are three elements in the vector and will From 37a952a672ca79bfcc41795e378a710c196a557a Mon Sep 17 00:00:00 2001 From: Sandeep Datta Date: Sat, 13 Feb 2016 18:40:24 +0530 Subject: [PATCH 5/6] Fixed build error as per steveklabnik's suggestion and expanded on the ills of doing out of bounds accesses. 
--- src/doc/book/ownership.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/src/doc/book/ownership.md b/src/doc/book/ownership.md index cf6a43b1f1334..3d67e20388bcc 100644 --- a/src/doc/book/ownership.md +++ b/src/doc/book/ownership.md @@ -166,6 +166,8 @@ Rust’s safety guarantees by introducing a data race if one could access both For example if we truncated the vector to just two elements through `v2`: ```rust +# let v = vec![1, 2, 3]; +# let v2 = v; v2.truncate(2); ``` @@ -174,7 +176,9 @@ would not know that the heap data has been truncated. Now, the part of the vector `v1` on the stack does not agree with the corresponding part on the heap. `v1` still thinks there are three elements in the vector and will happily let us access the non existent element `v1[2]` but as you might -already know this is a recipe for disaster (might lead to a segfault). +already know this is a recipe for disaster. Especially because it might lead +to a segmentation fault or worse allow an unauthorized user to read from +memory to which they don't have access. This is why Rust forbids using `v` after we’ve done the move. From 1536195ce66dcf764782e1f36ced4aa5eefef321 Mon Sep 17 00:00:00 2001 From: Sandeep Datta Date: Wed, 17 Feb 2016 20:47:24 +0530 Subject: [PATCH 6/6] Made v2 mutable so that we can actually truncate it. --- src/doc/book/ownership.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/doc/book/ownership.md b/src/doc/book/ownership.md index 3d67e20388bcc..ac6184013ee23 100644 --- a/src/doc/book/ownership.md +++ b/src/doc/book/ownership.md @@ -139,7 +139,7 @@ Now consider the following code fragment: ```rust let v = vec![1, 2, 3]; -let v2 = v; +let mut v2 = v; ``` The first line allocates memory for the vector object `v` on the stack like @@ -167,7 +167,7 @@ For example if we truncated the vector to just two elements through `v2`: ```rust # let v = vec![1, 2, 3]; -# let v2 = v; +# let mut v2 = v; v2.truncate(2); ```
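
The behaviour described across these patches can be sketched as one self-contained program. This is a minimal illustration rather than part of the patch series: the helper name `move_then_truncate` is hypothetical, and the pointer comparison is only there to make the "shallow copy" observable. It assumes, as the patched text says, that moving a vector copies only its stack part (data pointer, length, capacity) and leaves the heap data in place.

```rust
/// Moves a vector into a new binding, truncates it through the new binding,
/// and reports whether the heap allocation stayed at the same address.
/// `move_then_truncate` is a hypothetical name used only for this sketch.
fn move_then_truncate() -> (bool, Vec<i32>) {
    let v = vec![1, 2, 3];
    // Address of the heap allocation, recorded before the move.
    let data_ptr_before = v.as_ptr();

    // The move: the stack part of the vector is copied bitwise into `v2`;
    // the heap data itself is not copied.
    let mut v2 = v;

    // Uncommenting the next line fails to compile with error E0382,
    // because `v` has been moved and can no longer be used:
    // println!("{}", v[0]);

    // Only `v2` can change the vector now, so the length stored on the
    // stack and the data on the heap cannot get out of sync.
    v2.truncate(2);

    let same_allocation = data_ptr_before == v2.as_ptr();
    (same_allocation, v2)
}

fn main() {
    let (same_allocation, v2) = move_then_truncate();
    println!("same heap allocation: {}", same_allocation);
    println!("v2 = {:?}, len = {}", v2, v2.len());
}
```

Because the commented-out access to `v` is rejected at compile time, the scenario the text warns about (a stale binding that still believes the vector has three elements) cannot arise in safe code.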