From e560bb6d121cd9d67054f862783bca5ddc8e2436 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Br=C3=A1ulio=20Bezerra?=
Date: Tue, 5 Sep 2017 13:37:37 -0300
Subject: [PATCH 1/2] Add integer literal's grammar

---
 src/tokens.md | 77 ++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 73 insertions(+), 4 deletions(-)

diff --git a/src/tokens.md b/src/tokens.md
index 3163a9de6..9fc3d58c4 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -216,16 +216,51 @@ literal_. The grammar for recognizing the two kinds of literals is mixed.
 
 #### Integer literals
 
+> **Lexer**
+> INTEGER_LITERAL :
+>    ( DEC_LITERAL | BIN_LITERAL | OCT_LITERAL | HEX_LITERAL )
+>    INTEGER_SUFFIX?
+>
+> DEC_LITERAL :
+>    DEC_DIGIT (DEC_DIGIT|`_`)\*
+>
+> BIN_LITERAL :
+>    `0b` (BIN_DIGIT|`_`)\* BIN_DIGIT (BIN_DIGIT|`_`)\*
+>
+> OCT_LITERAL :
+>    `0o` (OCT_DIGIT|`_`)\* OCT_DIGIT (OCT_DIGIT|`_`)\*
+>
+> HEX_LITERAL :
+>    `0x` (HEX_DIGIT|`_`)\* HEX_DIGIT (HEX_DIGIT|`_`)\*
+>
+> BIN_DIGIT : [`0`-`1` `_`]
+>
+> OCT_DIGIT : [`0`-`7` `_`]
+>
+> DEC_DIGIT : [`0`-`9` `_`]
+>
+> HEX_DIGIT : [`0`-`9` `a`-`f` `A`-`F` `_`]
+>
+> INTEGER_SUFFIX :
+>       `u8` | `u16` | `u32` | `u64` | `usize`
+>    | `i8` | `i16` | `i32` | `i64` | `isize`
+
+
+
+
 An _integer literal_ has one of four forms:
 
 * A _decimal literal_ starts with a *decimal digit* and continues with any
   mixture of *decimal digits* and _underscores_.
 * A _hex literal_ starts with the character sequence `U+0030` `U+0078`
-  (`0x`) and continues as any mixture of hex digits and underscores.
+  (`0x`) and continues as any mixture (with at least one digit) of hex digits
+  and underscores.
 * An _octal literal_ starts with the character sequence `U+0030` `U+006F`
-  (`0o`) and continues as any mixture of octal digits and underscores.
+  (`0o`) and continues as any mixture (with at least one digit) of octal digits
+  and underscores.
 * A _binary literal_ starts with the character sequence `U+0030` `U+0062`
-  (`0b`) and continues as any mixture of binary digits and underscores.
+  (`0b`) and continues as any mixture (with at least one digit) of binary digits
+  and underscores.
 
 Like any literal, an integer literal may be followed (immediately,
 without any spaces) by an _integer suffix_, which forcibly sets the
@@ -247,15 +282,49 @@ The type of an _unsuffixed_ integer literal is determined by type inference:
 Examples of integer literals of various forms:
 
 ```rust
+123;                               // type i32
 123i32;                            // type i32
 123u32;                            // type u32
 123_u32;                           // type u32
+let a: u64 = 123;                  // type u64
+
+0xff;                              // type i32
 0xff_u8;                           // type u8
+
+0o70;                              // type i32
 0o70_i16;                          // type i16
-0b1111_1111_1001_0000_i32;         // type i32
+
+0b1111_1111_1001_0000;             // type i32
+0b1111_1111_1001_0000i32;          // type i32
+0b________1;                       // type i32
+
 0usize;                            // type usize
 ```
 
+Examples of invalid integer literals:
+
+```rust,ignore
+// invalid suffixes
+
+0invalidSuffix;
+
+// uses numbers of the wrong base
+
+123AFB43;
+0b0102;
+0o0581;
+
+// integers too big for their type (they overflow)
+
+128_i8;
+256_u8;
+
+// bin, hex and octal literals must have at least one digit
+
+0b_;
+0b____;
+```
+
 Note that the Rust syntax considers `-1i8` as an application of the [unary minus
 operator] to an integer literal `1i8`, rather than
 a single integer literal.

From 4de83de46d846cbba1d05f7cf41295ad3e8c281a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Br=C3=A1ulio=20Bezerra?=
Date: Sat, 23 Sep 2017 09:31:30 -0300
Subject: [PATCH 2/2] Fix: digits should not include underscore

---
 src/tokens.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/tokens.md b/src/tokens.md
index 9fc3d58c4..848112219 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -233,13 +233,13 @@ literal_. The grammar for recognizing the two kinds of literals is mixed.
 > HEX_LITERAL :
 >    `0x` (HEX_DIGIT|`_`)\* HEX_DIGIT (HEX_DIGIT|`_`)\*
 >
-> BIN_DIGIT : [`0`-`1` `_`]
+> BIN_DIGIT : [`0`-`1`]
 >
-> OCT_DIGIT : [`0`-`7` `_`]
+> OCT_DIGIT : [`0`-`7`]
 >
-> DEC_DIGIT : [`0`-`9` `_`]
+> DEC_DIGIT : [`0`-`9`]
 >
-> HEX_DIGIT : [`0`-`9` `a`-`f` `A`-`F` `_`]
+> HEX_DIGIT : [`0`-`9` `a`-`f` `A`-`F`]
 >
 > INTEGER_SUFFIX :
 >       `u8` | `u16` | `u32` | `u64` | `usize`
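
Review note: to sanity-check the grammar as it stands after patch 2/2 (digit classes without `_`), the productions can be transcribed into a small recognizer. This is only a sketch, not how rustc's lexer is written. The names `SUFFIXES` and `is_integer_literal` are hypothetical, and the suffix table assumes the signed row of INTEGER_SUFFIX is `i8`, `i16`, `i32`, `i64`, `isize`. It checks the lexical shape only, so the "too big for their type" examples (`128_i8`, `256_u8`) are out of scope here.

```rust
/// Integer suffixes accepted by the sketch.
/// Assumption: the signed row is i8, i16, i32, i64, isize.
const SUFFIXES: [&str; 10] = [
    "u8", "u16", "u32", "u64", "usize",
    "i8", "i16", "i32", "i64", "isize",
];

/// Returns true if `s` has the shape of INTEGER_LITERAL:
/// ( DEC_LITERAL | BIN_LITERAL | OCT_LITERAL | HEX_LITERAL ) INTEGER_SUFFIX?
fn is_integer_literal(s: &str) -> bool {
    // Peel off an optional INTEGER_SUFFIX. Every suffix starts with `u` or `i`,
    // which is not a digit in any of the four bases, so this cannot eat digits.
    let body = SUFFIXES
        .iter()
        .find_map(|suffix| s.strip_suffix(*suffix))
        .unwrap_or(s);

    // Select the radix from the prefix; no prefix means a decimal literal.
    let (digits, radix, prefixed) = if let Some(rest) = body.strip_prefix("0b") {
        (rest, 2, true)
    } else if let Some(rest) = body.strip_prefix("0o") {
        (rest, 8, true)
    } else if let Some(rest) = body.strip_prefix("0x") {
        (rest, 16, true)
    } else {
        (body, 10, false)
    };

    // Every character must be a digit of the chosen base or an underscore.
    let all_valid = digits.chars().all(|c| c == '_' || c.is_digit(radix));

    if prefixed {
        // BIN/OCT/HEX: any mixture of digits and underscores, at least one digit.
        all_valid && digits.chars().any(|c| c != '_')
    } else {
        // DEC: must start with a decimal digit.
        all_valid && digits.chars().next().map_or(false, |c| c.is_ascii_digit())
    }
}
```

Run against the examples in patch 1/2, this sketch accepts `0b________1` and `0usize`, and rejects `0b_`, `0b____`, `0b0102`, `0o0581`, `123AFB43` and `0invalidSuffix`.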