From f2ee2103d15b86c50996b1f5826c1a9089f97e04 Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Mon, 22 Dec 2014 08:34:44 -0500 Subject: [PATCH 01/13] Macro future proofing draft (no impl details yet) --- text/0000-macro-future-proofing.md | 103 +++++++++++++++++++++++++++++ 1 file changed, 103 insertions(+) create mode 100644 text/0000-macro-future-proofing.md diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md new file mode 100644 index 00000000000..acfd52d931f --- /dev/null +++ b/text/0000-macro-future-proofing.md @@ -0,0 +1,103 @@ +- Start Date: 2014-12-21 +- RFC PR: (leave this empty) +- Rust Issue: (leave this empty) + +**NOTE**: Draft, not finalized. + +# Key Terminology + +- `macro`: anything invokable as `foo!(...)` in source code. +- `syntax extension`: a plugin to `rustc` that can provide macros or special + handling for certain attributes. +- `MBE`: macro-by-example, a macro defined by `macro_rules`. +- `matcher`: the left-hand-side of a rule in a `macro_rules` invocation. +- `macro parser`: the bit of code in the Rust parser that will parse the input + using a grammar derived from all of the matchers. +- `NT`: non-terminal, the various "meta-variables" that can appear in a matcher. +- `fragment`: The piece of Rust syntax that an NT can accept. +- `fragment specifier`: The identifier in an NT that specifies which fragment + the NT accepts. +- `language`: a context-free language. + +Example: + +```rust +macro_rules! i_am_an_mbe { + (start $foo:expr end) => ($foo), +} +``` + +`(start $foo:expr end)` is a matcher, `$foo` is an NT with `expr` as its +fragment specifier. + +# Summary + +Future-proof the allowed forms that input to an MBE can take by requiring +certain delimiters following meta variables in a matcher. + +# Motivation + +In current Rust, the `macro_rules` parser is very liberal in what it accepts +in a matcher. 
This can cause problems, because it is possible to write an +MBE which corresponds to an ambiguous grammar. When an MBE is invoked, if the +macro parser encounters an amibuity while parsing, it will bail out with a +"local ambiguity" error. As an example for this, take the following MBE: + +```rust +macro_rules! foo { + ($($foo:expr)* $bar:block) => (/*...*/) +}; +``` + +Attempts to invoke this MBE will never succeed, because the macro parser +will always emit an ambiguity error rather than make a choice when presented +an ambiguity. In particular, it needs to decide when to stop accepting +expressions for `foo` and look for a block for `bar` (noting that blocks are +valid expressions). Situations like this are inherent to the macro system. On +the other hand, it's possible to write an unambiguous matcher that becomes +ambiguous due to changes in the syntax for the various fragments. As a +concrete example: + +```rust +macro_rules! bar { + ($in:ty ( $($arg:ident)*, ) -> $out:ty;) => (/*...*/) +}; +``` + +When the type syntax was extended to include the unboxed closure traits, +an input such as `FnMut(i8, u8) -> i8;` became ambiguous. The goal of this +proposal is to prevent such scenarios in the future by requiring certain +"delimiter tokens" after an NT. When extending Rust's syntax in the future, +ambiguity need only be considered when combined with these sets of delimiters, +rather than any possible arbitrary matcher. + +# Detailed design + +This is the bulk of the RFC. Explain the design in enough detail for somebody familiar +with the language to understand, and for somebody familiar with the compiler to implement. +This should get into specifics and corner-cases, and include examples of how the feature is used. + +# Drawbacks + +It does restrict the input to a MBE, but the choice of delimiters provides +reasonable freedom. + +# Alternatives + +1. Fix the syntax that a fragment can parse. 
This would create a situation + where a future MBE might not be able to accept certain inputs because the + input uses newer features than the fragment that was fixed at 1.0. For + example, in the `bar` MBE above, if the `ty` fragment was fixed before the + unboxed closure sugar was introduced, the MBE would not be able to accept + such a type. While this approach is feasible, it would cause unnecessary + confusion for future users of MBEs when they can't put certain perfectly + valid Rust code in the input to an MBE. Versioned fragments could avoid + this problem but only for new code. +2. Keep `macro_rules` unstable. Given the great syntactical abstraction that + `macro_rules` provides, it would be a shame for it to be unusable in a + release version of Rust. If ever `macro_rules` were to be stabilized, this + same issue would come up. +3. Do nothing. This is very dangerous, and has the potential to essentially + freeze Rust's syntax for fear of accidentally breaking a macro. + +# Unresolved questions From cc7f0e1b8c94c64f5881439074c6f04733757886 Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Mon, 22 Dec 2014 08:39:02 -0500 Subject: [PATCH 02/13] Remove 'meta variable' reference --- text/0000-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index acfd52d931f..d00c3ff0c41 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -33,7 +33,7 @@ fragment specifier. # Summary Future-proof the allowed forms that input to an MBE can take by requiring -certain delimiters following meta variables in a matcher. +certain delimiters following NTs in a matcher. 
# Motivation From 2ebedee4c827d8945c0a30e58213584afdad1c6e Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Mon, 22 Dec 2014 08:42:51 -0500 Subject: [PATCH 03/13] Remove invalid macro_rules syntax --- text/0000-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index d00c3ff0c41..abf4ba62609 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -23,7 +23,7 @@ Example: ```rust macro_rules! i_am_an_mbe { - (start $foo:expr end) => ($foo), + (start $foo:expr end) => ($foo) } ``` From 34a3bd63387e2201d67ddc6760300b83296bfac1 Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Mon, 22 Dec 2014 11:14:03 -0500 Subject: [PATCH 04/13] Finish up the RFC with the algorithm --- text/0000-macro-future-proofing.md | 54 +++++++++++++++++++++++++++--- 1 file changed, 50 insertions(+), 4 deletions(-) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index abf4ba62609..e741bac1b82 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -33,7 +33,8 @@ fragment specifier. # Summary Future-proof the allowed forms that input to an MBE can take by requiring -certain delimiters following NTs in a matcher. +certain delimiters following NTs in a matcher. In the future, it will be +possible to lift these restrictions backwards compatibly if desired. # Motivation @@ -73,9 +74,52 @@ rather than any possible arbitrary matcher. # Detailed design -This is the bulk of the RFC. Explain the design in enough detail for somebody familiar -with the language to understand, and for somebody familiar with the compiler to implement. -This should get into specifics and corner-cases, and include examples of how the feature is used. +The algorithm for recognizing valid matchers `M` follows. Note that a matcher +is merely a token tree. A "simple NT" is an NT without repetitions. 
That is, +`$foo:ty` is a simple NT but `$($foo:ty)+` is not. `FOLLOW(NT)` is the set of +allowed tokens for the given NT's fragment specifier, and is defined below. +`F` is used for representing the separator in complex NTs. In `$($foo:ty),+`, +`F` would be `,`, and for `$($foo:ty)+`, `F` would be `EOF`. + +*input*: a token tree `M` representing a matcher, and optionally a token `F` +*output*: whether M is valid +1. If there are no tokens in `M`, accept. +2. For each token `T` in `M`: + 1. If `T` is not an NT, continue. + 2. If `T` is a simple NT, look ahead to the next token `T'` in `M`. If + `T'` is `EOF`, replace `T'` with `F` if present. If `T'` is in the set + `FOLLOW(NT)`, `T'` is EOF, `T'` is any NT, or `T'` is any identifier, + continue. + 3. Else, `T` is a complex NT. + 1. If `T` has the form `$(...)+` or `$(...)*`, run the algorithm on + the contents with `F` set to `EOF`. If it accepts, continue, else, + reject. + 2. If `T` has the form `$(...)U+` or $(...)U*` for some token `U`, run + the algorithm on the contents with `F` set to `U`. If it accepts, + continue, else, reject. + +This algorithm should be run on every matcher in every `macro_rules` +invocation. If it rejects a matcher, an error should be emitted and +compilation should not complete. + +The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, +`expr`, `ty`, `ident`, `path`, `meta`, and `tt`. 
+ +- `FOLLOW(item)` = `{}` +- `FOLLOW(block)` = `FOLLOW(expr)` +- `FOLLOW(stmt)` = `FOLLOW(expr)` +- `FOLLOW(pat)` = `{FatArrow, Comma}` +- `FOLLOW(expr)` = `{Comma, FatArrow, CloseBrace, CloseParen, Lit}` (where + `Lit` is any numeric literal) +- `FOLLOW(ty)` = `{Comma, Eq, Gt, Lt, RArrow, FatArrow, OpenBrace, OpenParen, + CloseBrace, CloseParen}` +- `FOLLOW(ident)` = any token +- `FOLLOW(path)` = any token +- `FOLLOW(meta)` = any token +- `FOLLOW(tt)` = any token + +**Note**: the `FOLLOW` sets as given are based on every MBE in the Rust +distribution, but should probably be tuned before the RFC is accepted. # Drawbacks @@ -101,3 +145,5 @@ reasonable freedom. freeze Rust's syntax for fear of accidentally breaking a macro. # Unresolved questions + +Are the given `FOLLOW` sets adequate? From 645f679226d32c4288daf7ab3a1a8ea41840cf79 Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Mon, 22 Dec 2014 11:18:18 -0500 Subject: [PATCH 05/13] Formatting --- text/0000-macro-future-proofing.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index e741bac1b82..29a755e2ff9 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -82,7 +82,9 @@ allowed tokens for the given NT's fragment specifier, and is defined below. `F` would be `,`, and for `$($foo:ty)+`, `F` would be `EOF`. *input*: a token tree `M` representing a matcher, and optionally a token `F` + *output*: whether M is valid + 1. If there are no tokens in `M`, accept. 2. For each token `T` in `M`: 1. If `T` is not an NT, continue. 
From 849c0ae5df803e9f02c6b2eccd23a5701156705e Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Mon, 22 Dec 2014 11:23:04 -0500 Subject: [PATCH 06/13] Remove optionality of F --- text/0000-macro-future-proofing.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index 29a755e2ff9..3d14a87a82f 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -81,7 +81,7 @@ allowed tokens for the given NT's fragment specifier, and is defined below. `F` is used for representing the separator in complex NTs. In `$($foo:ty),+`, `F` would be `,`, and for `$($foo:ty)+`, `F` would be `EOF`. -*input*: a token tree `M` representing a matcher, and optionally a token `F` +*input*: a token tree `M` representing a matcher and a token `F` *output*: whether M is valid @@ -89,9 +89,9 @@ allowed tokens for the given NT's fragment specifier, and is defined below. 2. For each token `T` in `M`: 1. If `T` is not an NT, continue. 2. If `T` is a simple NT, look ahead to the next token `T'` in `M`. If - `T'` is `EOF`, replace `T'` with `F` if present. If `T'` is in the set + `T'` is `EOF`, replace `T'` with `F`. If `T'` is in the set `FOLLOW(NT)`, `T'` is EOF, `T'` is any NT, or `T'` is any identifier, - continue. + continue. Else, reject. 3. Else, `T` is a complex NT. 1. If `T` has the form `$(...)+` or `$(...)*`, run the algorithm on the contents with `F` set to `EOF`. If it accepts, continue, else, @@ -101,8 +101,8 @@ allowed tokens for the given NT's fragment specifier, and is defined below. continue, else, reject. This algorithm should be run on every matcher in every `macro_rules` -invocation. If it rejects a matcher, an error should be emitted and -compilation should not complete. +invocation, with `F` as `EOF`. If it rejects a matcher, an error should be +emitted and compilation should not complete. 
The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, `expr`, `ty`, `ident`, `path`, `meta`, and `tt`. From 8b475b5c93aad83ae026f4876eb5d582abd4a744 Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Mon, 22 Dec 2014 11:24:04 -0500 Subject: [PATCH 07/13] Typo, unused glossary entry --- text/0000-macro-future-proofing.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index 3d14a87a82f..14912e8bffc 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -7,8 +7,6 @@ # Key Terminology - `macro`: anything invokable as `foo!(...)` in source code. -- `syntax extension`: a plugin to `rustc` that can provide macros or special - handling for certain attributes. - `MBE`: macro-by-example, a macro defined by `macro_rules`. - `matcher`: the left-hand-side of a rule in a `macro_rules` invocation. - `macro parser`: the bit of code in the Rust parser that will parse the input @@ -41,7 +39,7 @@ possible to lift these restrictions backwards compatibly if desired. In current Rust, the `macro_rules` parser is very liberal in what it accepts in a matcher. This can cause problems, because it is possible to write an MBE which corresponds to an ambiguous grammar. When an MBE is invoked, if the -macro parser encounters an amibuity while parsing, it will bail out with a +macro parser encounters an ambiguity while parsing, it will bail out with a "local ambiguity" error. 
As an example for this, take the following MBE: ```rust From bca17b465cb58cc293ab29cf280ad67cf5b5ceed Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Thu, 25 Dec 2014 21:29:01 -0500 Subject: [PATCH 08/13] Add missing backtick --- text/0000-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index 14912e8bffc..78be1fdf51c 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -94,7 +94,7 @@ allowed tokens for the given NT's fragment specifier, and is defined below. 1. If `T` has the form `$(...)+` or `$(...)*`, run the algorithm on the contents with `F` set to `EOF`. If it accepts, continue, else, reject. - 2. If `T` has the form `$(...)U+` or $(...)U*` for some token `U`, run + 2. If `T` has the form `$(...)U+` or `$(...)U*` for some token `U`, run the algorithm on the contents with `F` set to `U`. If it accepts, continue, else, reject. 
From b5223791493302d7096875413e017bbe51d9985b Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Thu, 25 Dec 2014 21:29:11 -0500 Subject: [PATCH 09/13] Add Pipe to FOLLOW(pat) --- text/0000-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index 78be1fdf51c..13e74791388 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -108,7 +108,7 @@ The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, - `FOLLOW(item)` = `{}` - `FOLLOW(block)` = `FOLLOW(expr)` - `FOLLOW(stmt)` = `FOLLOW(expr)` -- `FOLLOW(pat)` = `{FatArrow, Comma}` +- `FOLLOW(pat)` = `{FatArrow, Comma, Pipe}` - `FOLLOW(expr)` = `{Comma, FatArrow, CloseBrace, CloseParen, Lit}` (where `Lit` is any numeric literal) - `FOLLOW(ty)` = `{Comma, Eq, Gt, Lt, RArrow, FatArrow, OpenBrace, OpenParen, From 5ef6de51f48d8a226a340637717c2167c8d8ca5d Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Thu, 25 Dec 2014 21:29:21 -0500 Subject: [PATCH 10/13] Don't discriminate against string literals --- text/0000-macro-future-proofing.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index 13e74791388..c6edf0b1d42 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -110,7 +110,7 @@ The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, - `FOLLOW(stmt)` = `FOLLOW(expr)` - `FOLLOW(pat)` = `{FatArrow, Comma, Pipe}` - `FOLLOW(expr)` = `{Comma, FatArrow, CloseBrace, CloseParen, Lit}` (where - `Lit` is any numeric literal) + `Lit` is any literal, string or numeric) - `FOLLOW(ty)` = `{Comma, Eq, Gt, Lt, RArrow, FatArrow, OpenBrace, OpenParen, CloseBrace, CloseParen}` - `FOLLOW(ident)` = any token From 68ecb347d2c2a736478cb0367f9996a844570f36 Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Fri, 2 Jan 2015 
23:05:01 -0500 Subject: [PATCH 11/13] Minor fixes, adjustments to FOLLOW sets --- text/0000-macro-future-proofing.md | 32 ++++++++++++++---------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index c6edf0b1d42..5eeb17a1944 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -2,8 +2,6 @@ - RFC PR: (leave this empty) - Rust Issue: (leave this empty) -**NOTE**: Draft, not finalized. - # Key Terminology - `macro`: anything invokable as `foo!(...)` in source code. @@ -87,9 +85,9 @@ allowed tokens for the given NT's fragment specifier, and is defined below. 2. For each token `T` in `M`: 1. If `T` is not an NT, continue. 2. If `T` is a simple NT, look ahead to the next token `T'` in `M`. If - `T'` is `EOF`, replace `T'` with `F`. If `T'` is in the set - `FOLLOW(NT)`, `T'` is EOF, `T'` is any NT, or `T'` is any identifier, - continue. Else, reject. + `T'` is `EOF` or a close delimiter of a token tree, replace `T'` with + `F`. If `T'` is in the set `FOLLOW(NT)`, `T'` is EOF, `T'` is any NT, + or `T'` is any identifier, continue. Else, reject. 3. Else, `T` is a complex NT. 1. If `T` has the form `$(...)+` or `$(...)*`, run the algorithm on the contents with `F` set to `EOF`. If it accepts, continue, else, @@ -105,21 +103,16 @@ emitted and compilation should not complete. The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, `expr`, `ty`, `ident`, `path`, `meta`, and `tt`. 
-- `FOLLOW(item)` = `{}` -- `FOLLOW(block)` = `FOLLOW(expr)` - `FOLLOW(stmt)` = `FOLLOW(expr)` - `FOLLOW(pat)` = `{FatArrow, Comma, Pipe}` -- `FOLLOW(expr)` = `{Comma, FatArrow, CloseBrace, CloseParen, Lit}` (where - `Lit` is any literal, string or numeric) -- `FOLLOW(ty)` = `{Comma, Eq, Gt, Lt, RArrow, FatArrow, OpenBrace, OpenParen, - CloseBrace, CloseParen}` +- `FOLLOW(expr)` = `{Comma, FatArrow, CloseBrace, CloseParen, CloseBracket}` +- `FOLLOW(ty)` = `{Comma, CloseBrace, CloseParen, CloseBracket}` +- `FOLLOW(block)` = any token - `FOLLOW(ident)` = any token -- `FOLLOW(path)` = any token -- `FOLLOW(meta)` = any token - `FOLLOW(tt)` = any token - -**Note**: the `FOLLOW` sets as given are based on every MBE in the Rust -distribution, but should probably be tuned before the RFC is accepted. +- `FOLLOW(item)` = up for discussion +- `FOLLOW(path)` = up for discussion +- `FOLLOW(meta)` = up for discussion # Drawbacks @@ -146,4 +139,9 @@ reasonable freedom. # Unresolved questions -Are the given `FOLLOW` sets adequate? +1. What should the FOLLOW sets for `item`, `path`, and `meta` be? +2. Should the `FOLLOW` set for `ty` be extended? In practice, `RArrow`, + `Colon`, `as`, and `in` are also used. (See next item) +2. What, if any, identifiers should be allowed in the FOLLOW sets? The author + is concerned that allowing arbitrary identifiers would limit the future use + of "contextual keywords". 
From 5db222a8a2d15b5c34c894c7cc0f15b59c6c9777 Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Mon, 5 Jan 2015 13:11:19 -0500 Subject: [PATCH 12/13] Finalize FOLLOW sets --- text/0000-macro-future-proofing.md | 25 +++++++++---------------- 1 file changed, 9 insertions(+), 16 deletions(-) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index 5eeb17a1944..2f8791936e6 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -87,7 +87,7 @@ allowed tokens for the given NT's fragment specifier, and is defined below. 2. If `T` is a simple NT, look ahead to the next token `T'` in `M`. If `T'` is `EOF` or a close delimiter of a token tree, replace `T'` with `F`. If `T'` is in the set `FOLLOW(NT)`, `T'` is EOF, `T'` is any NT, - or `T'` is any identifier, continue. Else, reject. + or `T'` is any close delimiter, continue. Else, reject. 3. Else, `T` is a complex NT. 1. If `T` has the form `$(...)+` or `$(...)*`, run the algorithm on the contents with `F` set to `EOF`. If it accepts, continue, else, @@ -104,15 +104,17 @@ The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, `expr`, `ty`, `ident`, `path`, `meta`, and `tt`. - `FOLLOW(stmt)` = `FOLLOW(expr)` -- `FOLLOW(pat)` = `{FatArrow, Comma, Pipe}` -- `FOLLOW(expr)` = `{Comma, FatArrow, CloseBrace, CloseParen, CloseBracket}` -- `FOLLOW(ty)` = `{Comma, CloseBrace, CloseParen, CloseBracket}` +- `FOLLOW(pat)` = `{FatArrow, Comma, Eq}` +- `FOLLOW(expr)` = `{Comma, Semicolon}` +- `FOLLOW(path)` = `FOLLOW(ty)` +- `FOLLOW(ty)` = `{Comma, RArrow, Colon, Eq, Gt, Ident(as)}` - `FOLLOW(block)` = any token - `FOLLOW(ident)` = any token - `FOLLOW(tt)` = any token -- `FOLLOW(item)` = up for discussion -- `FOLLOW(path)` = up for discussion -- `FOLLOW(meta)` = up for discussion +- `FOLLOW(item)` = any token +- `FOLLOW(meta)` = any token + +(Note that close delimiters are valid following any NT.) 
# Drawbacks @@ -136,12 +138,3 @@ reasonable freedom. same issue would come up. 3. Do nothing. This is very dangerous, and has the potential to essentially freeze Rust's syntax for fear of accidentally breaking a macro. - -# Unresolved questions - -1. What should the FOLLOW sets for `item`, `path`, and `meta` be? -2. Should the `FOLLOW` set for `ty` be extended? In practice, `RArrow`, - `Colon`, `as`, and `in` are also used. (See next item) -2. What, if any, identifiers should be allowed in the FOLLOW sets? The author - is concerned that allowing arbitrary identifiers would limit the future use - of "contextual keywords". From b61c42a5956b9eeb6df1994b5c37691be31baeff Mon Sep 17 00:00:00 2001 From: Corey Richardson Date: Sun, 18 Jan 2015 10:34:42 -0500 Subject: [PATCH 13/13] Algorithm and follow tweaks to match what landed --- text/0000-macro-future-proofing.md | 37 +++++++++++++++--------------- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/text/0000-macro-future-proofing.md b/text/0000-macro-future-proofing.md index 2f8791936e6..84d238c9f1b 100644 --- a/text/0000-macro-future-proofing.md +++ b/text/0000-macro-future-proofing.md @@ -81,20 +81,21 @@ allowed tokens for the given NT's fragment specifier, and is defined below. *output*: whether M is valid -1. If there are no tokens in `M`, accept. -2. For each token `T` in `M`: - 1. If `T` is not an NT, continue. - 2. If `T` is a simple NT, look ahead to the next token `T'` in `M`. If - `T'` is `EOF` or a close delimiter of a token tree, replace `T'` with - `F`. If `T'` is in the set `FOLLOW(NT)`, `T'` is EOF, `T'` is any NT, - or `T'` is any close delimiter, continue. Else, reject. - 3. Else, `T` is a complex NT. - 1. If `T` has the form `$(...)+` or `$(...)*`, run the algorithm on - the contents with `F` set to `EOF`. If it accepts, continue, else, - reject. - 2. If `T` has the form `$(...)U+` or `$(...)U*` for some token `U`, run - the algorithm on the contents with `F` set to `U`. 
If it accepts, - continue, else, reject. +For each token `T` in `M`: + +1. If `T` is not an NT, continue. +2. If `T` is a simple NT, look ahead to the next token `T'` in `M`. If + `T'` is `EOF` or a close delimiter of a token tree, replace `T'` with + `F`. If `T'` is in the set `FOLLOW(NT)`, `T'` is EOF, or `T'` is any close + delimiter, continue. Otherwise, reject. +3. Else, `T` is a complex NT. + 1. If `T` has the form `$(...)+` or `$(...)*`, run the algorithm on the + contents with `F` set to the token following `T`. If it accepts, + continue, else, reject. + 2. If `T` has the form `$(...)U+` or `$(...)U*` for some token `U`, run + the algorithm on the contents with `F` set to `U`. If it accepts, + check that the last token in the sequence can be followed by `F`. If + so, accept. Otherwise, reject. This algorithm should be run on every matcher in every `macro_rules` invocation, with `F` as `EOF`. If it rejects a matcher, an error should be @@ -103,11 +104,11 @@ emitted and compilation should not complete. The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, `expr`, `ty`, `ident`, `path`, `meta`, and `tt`. -- `FOLLOW(stmt)` = `FOLLOW(expr)` - `FOLLOW(pat)` = `{FatArrow, Comma, Eq}` -- `FOLLOW(expr)` = `{Comma, Semicolon}` +- `FOLLOW(expr)` = `{FatArrow, Comma, Semicolon}` +- `FOLLOW(ty)` = `{Comma, FatArrow, Colon, Eq, Gt, Ident(as)}` +- `FOLLOW(stmt)` = `FOLLOW(expr)` - `FOLLOW(path)` = `FOLLOW(ty)` -- `FOLLOW(ty)` = `{Comma, RArrow, Colon, Eq, Gt, Ident(as)}` - `FOLLOW(block)` = any token - `FOLLOW(ident)` = any token - `FOLLOW(tt)` = any token @@ -119,7 +120,7 @@ The current legal fragment specifiers are: `item`, `block`, `stmt`, `pat`, # Drawbacks It does restrict the input to a MBE, but the choice of delimiters provides -reasonable freedom. +reasonable freedom and can be extended in the future. # Alternatives
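Reviewer sketch (not part of the patch series): the matcher check as worded after PATCH 13/13 can be modelled roughly as below. This is a minimal sketch under stated assumptions, not the code that landed in `rustc`. The types `Tok` and `Tree` and the functions `follow_allows`, `next_tok`, `check`, and `is_valid_matcher` are hypothetical names introduced only for illustration. Two readings of the RFC text are assumed: an NT followed by another NT, a repetition, or an open delimiter is treated as "not a single terminal" and therefore fails any restricted `FOLLOW` set; and in step 3.2 the last NT inside the repetition must also be followable by whatever comes after the repetition, after which checking continues.

```rust
/// Terminal tokens, reduced to the ones the FOLLOW sets mention.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum Tok {
    Comma,
    FatArrow,
    Semicolon,
    Colon,
    Eq,
    Gt,
    Ident(&'static str),
    Other(&'static str), // any other terminal, e.g. `->`
    CloseDelim,          // `)`, `]` or `}`
    Eof,
}

/// Hypothetical matcher representation; not rustc's real token tree.
#[allow(dead_code)]
enum Tree {
    Token(Tok),
    Nt(&'static str),               // simple NT, holding its fragment specifier
    Delimited(Vec<Tree>),           // `( ... )`, `[ ... ]` or `{ ... }`
    Repeat(Vec<Tree>, Option<Tok>), // `$( ... )` with optional separator, then `+` or `*`
}

/// May `next` follow an NT whose fragment specifier is `spec`?
/// `None` stands for "not a single terminal" (an assumption of this sketch).
fn follow_allows(spec: &str, next: Option<&Tok>) -> bool {
    // EOF and close delimiters are valid after any NT.
    if matches!(next, Some(Tok::Eof | Tok::CloseDelim)) {
        return true;
    }
    match spec {
        "block" | "ident" | "tt" | "item" | "meta" => true,
        "pat" => matches!(next, Some(Tok::FatArrow | Tok::Comma | Tok::Eq)),
        "expr" | "stmt" => {
            matches!(next, Some(Tok::FatArrow | Tok::Comma | Tok::Semicolon))
        }
        "ty" | "path" => matches!(
            next,
            Some(
                Tok::Comma
                    | Tok::FatArrow
                    | Tok::Colon
                    | Tok::Eq
                    | Tok::Gt
                    | Tok::Ident("as")
            )
        ),
        _ => false, // unknown fragment specifier
    }
}

/// The terminal right after position `i` in `m`; at the end of the
/// sequence this is `f`, mirroring "replace `T'` with `F`".
fn next_tok<'a>(m: &'a [Tree], i: usize, f: Option<&'a Tok>) -> Option<&'a Tok> {
    match m.get(i + 1) {
        None => f,
        Some(Tree::Token(t)) => Some(t),
        Some(_) => None,
    }
}

/// Steps 1 to 3 of the algorithm for matcher `m` with end token `f`.
fn check(m: &[Tree], f: Option<&Tok>) -> bool {
    for (i, t) in m.iter().enumerate() {
        let next = next_tok(m, i, f);
        let ok = match t {
            // Step 1: not an NT.
            Tree::Token(_) => true,
            // Step 2: simple NT.
            Tree::Nt(spec) => follow_allows(spec, next),
            // Nested token tree: its contents end at a close delimiter.
            Tree::Delimited(inner) => check(inner, Some(&Tok::CloseDelim)),
            // Step 3.1: no separator, `F` is the token following `T`.
            Tree::Repeat(inner, None) => check(inner, next),
            // Step 3.2: separator `U` becomes `F` for the contents.
            Tree::Repeat(inner, Some(sep)) => {
                check(inner, Some(sep))
                    && match inner.last() {
                        Some(Tree::Nt(spec)) => follow_allows(spec, next),
                        _ => true,
                    }
            }
        };
        if !ok {
            return false;
        }
    }
    true
}

/// Entry point: run the check on a whole matcher with `F` as `EOF`.
fn is_valid_matcher(m: &[Tree]) -> bool {
    check(m, Some(&Tok::Eof))
}
```

Under this model, a matcher shaped like `($e:expr, $t:ty)` is accepted, while the `foo` matcher from the Motivation section, `($($foo:expr)* $bar:block)`, is rejected because the `expr` NT inside the repetition is followed by another NT rather than a token in `FOLLOW(expr)`.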