|
1139 | 1139 | A UTF-8 character literal containing multiple \grammarterm{c-char}{s} is ill-formed.
|
1140 | 1140 |
|
1141 | 1141 | \pnum
|
1142 |
| -\indextext{literal!character!\tcode{char16_t}}% |
1143 |
| -\indextext{char16_t character@\tcode{char16_t} character}% |
| 1142 | +\indextext{literal!character!UTF-16}% |
1144 | 1143 | \indextext{type!\idxcode{char16_t}}%
|
1145 | 1144 | A character literal that
|
1146 | 1145 | begins with the letter \tcode{u}, such as \tcode{u'x'},
|
1147 | 1146 | \indextext{prefix!\idxcode{u}}%
|
1148 |
| -is a character literal of type \tcode{char16_t}. The value |
1149 |
| -of a \tcode{char16_t} character literal containing a single \grammarterm{c-char} is |
| 1147 | +is a character literal of type \tcode{char16_t}, |
| 1148 | +known as a \defn{UTF-16 character literal}. |
| 1149 | +The value |
| 1150 | +of a UTF-16 character literal containing a single \grammarterm{c-char} is |
1150 | 1151 | equal to its ISO/IEC 10646 code point value, provided that the code point value is
|
1151 | 1152 | representable with a single 16-bit code unit (that is, provided it is in the
|
1152 | 1153 | basic multi-lingual plane). If the value is not representable
|
1153 |
| -with a single 16-bit code unit, the program is ill-formed. A \tcode{char16_t} character literal |
| 1154 | +with a single 16-bit code unit, the program is ill-formed. |
| 1155 | +A UTF-16 character literal |
1154 | 1156 | containing multiple \grammarterm{c-char}{s} is ill-formed.
|
1155 | 1157 |
|
1156 | 1158 | \pnum
|
1157 |
| -\indextext{literal!character!\tcode{char32_t}}% |
1158 |
| -\indextext{char32_t character@\tcode{char32_t} character}% |
| 1159 | +\indextext{literal!character!UTF-32}% |
1159 | 1160 | \indextext{type!\idxcode{char32_t}}%
|
1160 | 1161 | A character literal that
|
1161 | 1162 | begins with the letter \tcode{U}, such as \tcode{U'y'},
|
1162 | 1163 | \indextext{prefix!\idxcode{U}}%
|
1163 |
| -is a character literal of type \tcode{char32_t}. The value of a |
1164 |
| -\tcode{char32_t} character literal containing a single \grammarterm{c-char} is equal |
1165 |
| -to its ISO/IEC 10646 code point value. A \tcode{char32_t} character literal containing |
| 1164 | +is a character literal of type \tcode{char32_t}, |
| 1165 | +known as a \defn{UTF-32 character literal}. |
| 1166 | +The value of a |
| 1167 | +UTF-32 character literal containing a single \grammarterm{c-char} is equal |
| 1168 | +to its ISO/IEC 10646 code point value. |
| 1169 | +A UTF-32 character literal containing |
1166 | 1170 | multiple \grammarterm{c-char}{s} is ill-formed.
|
1167 | 1171 |
|
1168 | 1172 | \pnum
|
|
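The renamed terms in the hunk above can be illustrated with a small sketch. It assumes a C++11 or later compiler; `in_bmp` and `check_character_literals` are hypothetical names introduced here for illustration and do not appear in the draft.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper (not from the draft): true if a code point fits in a
// single 16-bit code unit, i.e. lies in the basic multi-lingual plane.
constexpr bool in_bmp(char32_t code_point) { return code_point <= 0xFFFF; }

// UTF-16 character literal: type char16_t, value equal to the code point.
static_assert(std::is_same<decltype(u'x'), char16_t>::value,
              "u'x' has type char16_t");
static_assert(u'x' == 0x0078, "value is the code point U+0078");
static_assert(u'\u00E9' == 0x00E9, "U+00E9 fits in one 16-bit code unit");

// UTF-32 character literal: type char32_t, value equal to the code point.
static_assert(std::is_same<decltype(U'y'), char32_t>::value,
              "U'y' has type char32_t");
static_assert(U'y' == 0x0079, "value is the code point U+0079");
static_assert(U'\U0001F600' == 0x1F600, "code points beyond the BMP are fine");

// Ill-formed under the quoted wording, so left commented out:
//   char16_t a = u'\U0001F600';  // not representable in one 16-bit code unit
//   char16_t b = u'ab';          // multiple c-chars
//   char32_t c = U'ab';          // multiple c-chars

// Runtime spot checks of the hypothetical helper.
void check_character_literals() {
    assert(in_bmp(u'x'));
    assert(in_bmp(U'\u00E9'));
    assert(!in_bmp(U'\U0001F600'));
}
```

The `static_assert` lines are checked at compile time; calling `check_character_literals()` repeats the range check at run time.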
1543 | 1547 | also referred to as narrow string literals.
|
1544 | 1548 |
|
1545 | 1549 | \pnum
|
1546 |
| -\indextext{literal!string!\idxcode{char16_t}}% |
| 1550 | +\indextext{literal!string!UTF-16}% |
1547 | 1551 | \indextext{type!\idxcode{char16_t}}%
|
1548 | 1552 | A \grammarterm{string-literal} that begins with \tcode{u},
|
1549 | 1553 | \indextext{prefix!\idxcode{u}}%
|
1550 | 1554 | such as \tcode{u"asdf"}, is
|
1551 |
| -a \tcode{char16_t} string literal. A \tcode{char16_t} string literal has |
| 1555 | +a \defn{UTF-16 string literal}. |
| 1556 | +A UTF-16 string literal has |
1552 | 1557 | type ``array of \placeholder{n} \tcode{const char16_t}'', where \placeholder{n} is the
|
1553 |
| -size of the string as defined below; it |
1554 |
| -is initialized with the given characters. A single \grammarterm{c-char} may |
| 1558 | +size of the string as defined below; |
| 1559 | +each successive element of the array |
| 1560 | +has the value of the corresponding code unit of |
| 1561 | +the UTF-16 encoding of the string. |
| 1562 | +\begin{note} |
| 1563 | +A single \grammarterm{c-char} may |
1555 | 1564 | produce more than one \tcode{char16_t} character in the form of
|
1556 | 1565 | surrogate pairs.
|
| 1566 | +\end{note} |
1557 | 1567 |
|
1558 | 1568 | \pnum
|
1559 |
| -\indextext{literal!string!\idxcode{char32_t}}% |
| 1569 | +\indextext{literal!string!UTF-32}% |
1560 | 1570 | \indextext{type!\idxcode{char32_t}}%
|
1561 | 1571 | A \grammarterm{string-literal} that begins with \tcode{U},
|
1562 | 1572 | \indextext{prefix!\idxcode{U}}%
|
1563 | 1573 | such as \tcode{U"asdf"}, is
|
1564 |
| -a \tcode{char32_t} string literal. A \tcode{char32_t} string literal has |
| 1574 | +a \defn{UTF-32 string literal}. |
| 1575 | +A UTF-32 string literal has |
1565 | 1576 | type ``array of \placeholder{n} \tcode{const char32_t}'', where \placeholder{n} is the
|
1566 |
| -size of the string as defined below; it |
1567 |
| -is initialized with the given characters. |
| 1577 | +size of the string as defined below; |
| 1578 | +each successive element of the array |
| 1579 | +has the value of the corresponding code unit of |
| 1580 | +the UTF-32 encoding of the string. |
1568 | 1581 |
|
1569 | 1582 | \pnum
|
1570 | 1583 | \indextext{literal!string!wide}%
|
|
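The string-literal wording in the hunk above, including the note on surrogate pairs, can be sketched as follows. It assumes a C++11 or later compiler; `array_length`, `smile16`, `smile32`, and `check_string_literals` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical helper (not from the draft): number of elements in an array,
// counting the terminating null code unit of a string literal.
template <typename T, std::size_t N>
constexpr std::size_t array_length(const T (&)[N]) { return N; }

// A UTF-16 string literal is an lvalue of type "array of n const char16_t".
static_assert(std::is_same<decltype(u"asdf"), const char16_t (&)[5]>::value,
              "four code units plus the terminator");

// A UTF-32 string literal is an lvalue of type "array of n const char32_t".
static_assert(std::is_same<decltype(U"asdf"), const char32_t (&)[5]>::value,
              "four code units plus the terminator");

// U+1F600 lies outside the BMP, so its UTF-16 encoding is a surrogate pair.
constexpr char16_t smile16[] = u"\U0001F600";
static_assert(array_length(smile16) == 3, "two code units plus the terminator");
static_assert(smile16[0] == 0xD83D, "high surrogate");
static_assert(smile16[1] == 0xDE00, "low surrogate");
static_assert(smile16[2] == 0, "terminating null code unit");

// The UTF-32 encoding of the same code point is a single code unit.
constexpr char32_t smile32[] = U"\U0001F600";
static_assert(array_length(smile32) == 2, "one code unit plus the terminator");
static_assert(smile32[0] == 0x1F600, "code unit equals the code point");

// Runtime spot checks of the same sizes.
void check_string_literals() {
    assert(array_length(u"asdf") == 5);
    assert(array_length(u"\U0001F600") == 3);
    assert(array_length(U"\U0001F600") == 2);
}
```

Each element is compared against the corresponding code unit of the encoding, which is how the new wording describes the array contents.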