From 3ab0513cb4775ba5ee826ff6b893d5935c7dcb42 Mon Sep 17 00:00:00 2001
From: Christopher Durham
Date: Mon, 27 Jun 2022 18:46:23 -0400
Subject: [PATCH] Clarify String::from_utf8_unchecked's invariants

This is the same clarification as in
b92cd1a32c842e82575e59374545dda5f9b9f77a.

Strictly speaking, String's buffer being valid UTF-8 is a *safety*
invariant, not a *validity* invariant. This means that String managing
a non-UTF-8 buffer is not an AM violation / UB, but can result in safe
API surface causing an AM violation / UB.

Currently, *no* String functionality, including Drop::drop, say they
are valid on invalid UTF-8. As such, the only thing possible to do with
an unsafe String is forget to drop it. Everything else is library UB.

Making valid UTF-8 a precondition of from_utf8_unchecked then is an API
simplification. Additionally, this makes from_utf8(bytes).unwrap() a
valid sanitizing implementation.
---
 library/alloc/src/string.rs | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/library/alloc/src/string.rs b/library/alloc/src/string.rs
index 8883880726594..3855f5cfc298d 100644
--- a/library/alloc/src/string.rs
+++ b/library/alloc/src/string.rs
@@ -819,10 +819,7 @@ impl String {
     ///
     /// # Safety
     ///
-    /// This function is unsafe because it does not check that the bytes passed
-    /// to it are valid UTF-8. If this constraint is violated, it may cause
-    /// memory unsafety issues with future users of the `String`, as the rest of
-    /// the standard library assumes that `String`s are valid UTF-8.
+    /// The provided bytes must be valid UTF-8.
     ///
     /// # Examples
     ///
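
The commit message's remark that from_utf8(bytes).unwrap() is a valid
sanitizing implementation can be illustrated with a minimal sketch. The
helper names `sanitized` and `validated_then_unchecked` are hypothetical
and not part of the patch or of the standard library; they only show how
a caller might discharge the "must be valid UTF-8" precondition before
reaching for the unchecked constructor.

```rust
/// Hypothetical helper: the checked, "sanitizing" way to build a `String`.
/// Panics if `bytes` is not valid UTF-8 instead of ever producing an
/// invalid `String`.
fn sanitized(bytes: Vec<u8>) -> String {
    String::from_utf8(bytes).unwrap()
}

/// Hypothetical helper: validate first, then use the unchecked constructor.
/// Returns `None` rather than constructing a `String` from invalid bytes.
fn validated_then_unchecked(bytes: Vec<u8>) -> Option<String> {
    if std::str::from_utf8(&bytes).is_ok() {
        // SAFETY: `bytes` was checked to be valid UTF-8 on the line above
        // and has not been modified since.
        Some(unsafe { String::from_utf8_unchecked(bytes) })
    } else {
        None
    }
}

fn main() {
    // Valid UTF-8: the four-byte encoding of U+1F496.
    let heart = vec![240, 159, 146, 150];
    // Invalid UTF-8: 159 is a continuation byte with no leading byte.
    let broken = vec![0, 159, 146, 150];

    // Both routes agree on valid input.
    assert_eq!(sanitized(heart.clone()), "\u{1F496}");
    assert_eq!(
        validated_then_unchecked(heart).as_deref(),
        Some("\u{1F496}")
    );

    // Invalid input never reaches `from_utf8_unchecked`.
    assert!(validated_then_unchecked(broken).is_none());
}
```

In both helpers the invalid bytes are rejected before a `String` exists,
which is the point of treating valid UTF-8 as a precondition rather than
as something later `String` methods are expected to tolerate.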