From 47b14e308dd6dfc6b6acbb1a8bc14fb87d75658e Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Wed, 14 Jun 2023 18:51:30 +0200
Subject: [PATCH 1/3] Editorial: account for SharedArrayBuffer change in Web IDL

See https://github.com/whatwg/webidl/pull/1311 for context.
---
 encoding.bs | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/encoding.bs b/encoding.bs
index 413298b..ba956f2 100644
--- a/encoding.bs
+++ b/encoding.bs
@@ -1341,7 +1341,7 @@ dictionary TextDecodeOptions {
interface TextDecoder {
  constructor(optional DOMString label = "utf-8", optional TextDecoderOptions options = {});
-  USVString decode(optional [AllowShared] BufferSource input, optional TextDecodeOptions options = {});
+  USVString decode(optional AllowSharedBufferSource input, optional TextDecodeOptions options = {});
};
TextDecoder includes TextDecoderCommon;
@@ -1695,7 +1695,7 @@ TextDecoderStream includes GenericTransformStream;
decoder . writable

Returns a writable stream which accepts
-[AllowShared] BufferSource chunks and runs
+AllowSharedBufferSource chunks and runs
them through encoding's decoder before making them available to {{GenericTransformStream/readable}}.
@@ -1758,7 +1758,7 @@ constructor steps are:

  1. Let bufferSource be the result of converting chunk to an
-    [AllowShared] BufferSource.
+    AllowSharedBufferSource.

  2. Push a copy of bufferSource to
From 0a253c05cd13a174e10ef7fe5fc4b4b7bc1db7c5 Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Thu, 15 Jun 2023 18:34:39 +0200
Subject: [PATCH 2/3] CI

---
 encoding.bs | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/encoding.bs b/encoding.bs
index ba956f2..2df6935 100644
--- a/encoding.bs
+++ b/encoding.bs
@@ -349,9 +349,9 @@ given an item item, encoding's

    Names and labels

The table below lists all encodings
-and their labels user agents must support.
+and their labels user agents must support.
User agents must not support any other encodings
-or labels.
+or labels.

For each encoding, ASCII-lowercasing its name yields one of its labels.
@@ -374,21 +374,20 @@ from a string label, run these steps:

  1. Remove any leading and trailing ASCII whitespace from label.

-  2. If label is an ASCII case-insensitive match for any of the labels
-  listed in the table below, then return the corresponding encoding; otherwise return
-  failure.
+  2. If label is an ASCII case-insensitive match for any of the labels listed
+  in the table below, then return the corresponding encoding; otherwise return failure.

-This is a more basic and restrictive algorithm of mapping labels
-to encodings than
+This is a more basic and restrictive algorithm of mapping labels to
+encodings than
section 1.4 of Unicode Technical Standard #22
prescribes, as that is necessary to be compatible with deployed content.

-Name
-Labels
+Name
+Labels
@@ -713,9 +712,8 @@ prescribes, as that is necessary to be compatible with deployed content.
The Encoding
"x-user-defined"
-All encodings and their
-labels are also available as non-normative
-encodings.json resource.
+All encodings and their labels are also available as
+non-normative encodings.json resource.

The set of supported encodings is primarily based on the intersection of the sets supported by major browser engines when the development of this
From 940c7ca2f3f69b4bacd7cb4579de34aceb05550b Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Thu, 15 Jun 2023 18:43:06 +0200
Subject: [PATCH 3/3] CI + editorial

---
 encoding.bs | 47 ++++++++++++++++++++++-------------------------
 1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/encoding.bs b/encoding.bs
index 2df6935..309bb41 100644
--- a/encoding.bs
+++ b/encoding.bs
@@ -356,14 +356,13 @@ or labels.

For each encoding, ASCII-lowercasing its name yields one of its labels.

-Authors must use the UTF-8 encoding and must use the
-ASCII case-insensitive "utf-8" label to
-identify it.
-
-New protocols and formats, as well as existing formats deployed in new contexts, must
-use the UTF-8 encoding exclusively. If these protocols and
-formats need to expose the encoding's name or
-label, they must expose it as "utf-8".
+Authors must use the UTF-8 encoding and must use its
+(ASCII case-insensitive) "utf-8" label to identify it.
+
+New protocols and formats, as well as existing formats deployed in new contexts, must use the
+UTF-8 encoding exclusively. If these protocols and formats need to expose the
+encoding's name or label, they must expose it
+as "utf-8".

To
@@ -378,7 +377,7 @@ from a string label, run these steps:
in the table below, then return the corresponding encoding; otherwise return failure.

-This is a more basic and restrictive algorithm of mapping labels to
+This is a more basic and restrictive algorithm of mapping labels to
encodings than
section 1.4 of Unicode Technical Standard #22
prescribes, as that is necessary to be compatible with deployed content.
@@ -1039,9 +1038,9 @@ optional I/O queue of bytes output (default « »), return the result

Standards are strongly discouraged from using decode, BOM sniff, and encode, except as needed for compatibility. Standards needing these legacy hooks will
- most likely also need to use get an encoding (to turn a label into an
- encoding) and get an output encoding (to turn an encoding into
- another encoding that is suitable to pass into encode).
+ most likely also need to use get an encoding (to turn a label into an encoding)
+ and get an output encoding (to turn an encoding into another
+ encoding that is suitable to pass into encode).

For the extremely niche case of URL percent-encoding, custom encoder error handling is needed. The get an encoder and encode or fail algorithms are to be used for that. Other
@@ -1352,10 +1351,8 @@ initially false.

decoder = new TextDecoder([label = "utf-8" [, options]])

Returns a new {{TextDecoder}} object.

-If label is either not a label or is a
-label for replacement,
-throws a
-{{RangeError}}.
+If label is either not a label or is a label for
+replacement, throws a {{RangeError}}.

decoder . encoding

Returns encoding's name, lowercased.
@@ -1671,8 +1668,8 @@ TextDecoderStream includes GenericTransformStream;
"utf-8" [, options]])

Returns a new {{TextDecoderStream}} object.

-If label is either not a label or is a label for replacement,
-throws a {{RangeError}}.
+If label is either not a label or is a label for
+replacement, throws a {{RangeError}}.

decoder . encoding

Returns encoding's name, lowercased.
@@ -2028,9 +2025,9 @@ that are split between strings. [[!INFRA]]

UTF-8 decoder

-A byte order mark has priority over a label as it has been found
-to be more accurate in deployed content. Therefore it is not part of the UTF-8 decoder
-algorithm but rather the decode and UTF-8 decode algorithms.
+A byte order mark has priority over a label as it has been found to be more accurate
+in deployed content. Therefore it is not part of the UTF-8 decoder algorithm, but rather the
+decode and UTF-8 decode algorithms.

UTF-8's decoder has an associated UTF-8 code point, UTF-8 bytes seen, and
@@ -3251,9 +3248,9 @@ the server and the client.

shared UTF-16 decoder

-A byte order mark has priority over a label as it
-has been found to be more accurate in deployed content. Therefore it is not part of the
-shared UTF-16 decoder algorithm but rather the decode algorithm.
+A byte order mark has priority over a label as it has been found to be more accurate
+in deployed content. Therefore it is not part of the shared UTF-16 decoder algorithm, but
+rather the decode algorithm.

shared UTF-16 decoder has an associated UTF-16 lead byte and UTF-16 lead surrogate (both initially null), and
@@ -3329,7 +3326,7 @@ its is UTF-16BE decoder set to true.

UTF-16LE

-"utf-16" is a label for UTF-16LE to deal with
+"utf-16" is a label for UTF-16LE to deal with
deployed content.