Parser: recover on missing items in tuple expression #13352

auduchinok · 2022-06-22T15:28:33Z

Adds the parser recovery on missing first and intermediate items in a tuple expression, e.g. when adding a new item in the front of existing ones or between them, or when replacing an existing item.

The error is placed on the comma range:

The recovered tree is available for analysis and can be used in features like Parameter Info popup:

The change makes the parser produce the correct tree structure for cases like (,1,1) , (1,,1), and even cases with multiple missing items like (,1,,2,3,,4,).

dsyme · 2022-06-22T15:48:38Z

Looks sort of ok.

However... I wonder if we could do this much more generically across all the parser rules.

Currently

> (, 3);;

  (, 3);;
  -^

stdin(1,2): error FS0010: Unexpected symbol ',' in interaction

I'm wondering why this doesn't at least say Unexpected symbol ',' in expression - and whether there is enough information to say Unexpected symbol ',' - expected expression.

This message is produced by this code for processing any syntax error. It has access to the parser stack at the point of failure - it then looks down this stack for something it recognises, and in this case the first thing recognised is an "interaction" that is an input to FSI. That indicates that the parser stack has been unwound more than we would like. However there may be a way to avoid that.

Doing the work to adjust the generic grammar process code would have the advantage that it might improve all syntax errors, and mean we don't have to write specific rules in the grammar for cases like this. It would also cover cases like

    ( , )  // unexpected COMMA, expected expression

     let f (x: List< , int>) =  // unexpected COMMA, expected type

I think the underlying problem with doing this generically is that we don't know which subset of unexpected syntax-error tokens (COMMA, SEMICOLON, ARROW etc) should get this treatment (keeping the stack as "declExpr" or similar), and which unexpected tokens should result in an unwind of the stack (e.g. TYPE, MODULE etc). In this case you're hard-wiring one of these tokens.

auduchinok · 2022-06-22T15:54:26Z

In addition to having the parser recovered, we're also interested in getting some node for the missing code. Even if we change the generic syntax error recovery code to produce better errors, how do we get a SynExpr, SynPat, or SynType to be used in the corresponding grammar rules? It seems we'd need to modify all such rules to include the recovered productions anyway.

Or am I misunderstanding how a more generic recovery could work here? Would it somehow produce an empty/error expression (or, say, a pattern/type) whenever an expression is expected but got a comma instead?

dsyme · 2022-06-22T16:04:55Z

In addition to having parser recovered, we're also interested in getting some node for the missing code. Even if we change the generic syntax error recovery code to produce better errors, how do we get a SynExpr, SynPat, or SynType to be used in the corresponding grammar rules? It seems we'd need to modify all such cases to include the recovered case anyway.

True. I guess COMMA declExpr and COMMA and so on have a special role here as it's possible to do good repair in these cases

However looking at fantomas preview tooling it seems we're getting a good node in this case already? https://fsprojects.github.io/fantomas-tools/#/ast?data=N4KABGBEAmCmBmBLAdrAzpAXFSAacUiaAYmolmPAIYA2as%2BEkAxgPZwWQAUYuYAjAEoAOgCdhySCAC%2BQA

Which makes me wonder why we are saying "in declaration" or "in interaction" instead of "in expression"....

Note that the SyntaxError processing relies on floating the syntax error up to some construct that has error or recover recovery. It looks like it's triggering this one:

fsharp/src/Compiler/pars.fsy

Line 4708 in d91b6c5

| LPAREN recover %prec prec_atomexpr_lparen_error

.

The case

( , )

seems to also give an OK AST

I'm not saying these are perfect, and it would be wonderful if our grammar didn't have any rules like

fsharp/src/Compiler/pars.fsy

Line 4708 in d91b6c5

| LPAREN recover %prec prec_atomexpr_lparen_error

and we could somehow do the recovery processing more generically while still getting good repair.

dsyme · 2022-06-22T16:09:23Z

For comparison:

The final pattern one looks a bit whacky, the tuple contains the right-hand-side expression

auduchinok · 2022-06-22T16:12:16Z

However looking at fantomas preview tooling it seems we're getting a good node in this case already?

The parens are lost and the first item range is wrong, though. An app expression f( , 1) has a completely different structure and produces an error about an unmatched paren (so it sounds like it has something to do with the precedence of the comma?).

With the change from this PR both ( , 1) and f( , 1) produce expected results with correct expression structures, e.g. f( , 1) parses to this:

IPrefixAppExpr
  IReferenceExpr
    FSharpIdentifierToken(type:IDENTIFIER, text:f)
  IParenExpr
    FSharpTokenType+LparenTokenElement(type:LPAREN, text:()
    Whitespace(type:WHITE_SPACE, text: ) spaces:" "
    ITupleExpr
      IFromErrorExpr
      FSharpTokenType+CommaTokenElement(type:COMMA, text:,)
      Whitespace(type:WHITE_SPACE, text: ) spaces:" "
      ILiteralExpr
        FSharpToken(type:INT32, text:1)
    FSharpTokenType+RparenTokenElement(type:RPAREN, text:))

dsyme · 2022-06-22T16:30:59Z

OK, given the existence of other error recovery rules in tupleExpr I can see this makes sense to handle as a specific case

e.g.

fsharp/src/Compiler/pars.fsy

Line 4379 in d91b6c5

| declExpr COMMA ends_coming_soon_or_recover

Do you think you could write out a test matrix for all combinations up to size 3 pinning down the error recovery please?

 (,)
 (,a)
 (a,)
 (,b,c)
 (a,,c)
 (a,b,)
 (,,c)
 (,b,)
 (a,,)
 (,,)

Also could you look at similar matrix for type arguments <,int> and tuple patterns?

Thanks

auduchinok · 2022-06-22T16:33:39Z

Do you think you could write out a test matrix for all combinations up to size 3 pinning down the error recovery please?

Yes, I was going to look into adding new parser tests once the existing CI is green.

dsyme · 2022-06-22T16:37:04Z

Looking at the pattern rules for tuples, it would be kind of nice if they were exactly symmetric in structure with the ones for expressions - they certainly aren't at the moment

There are two cases really - "simple patterns" that always have parens (e.g. used as constructore arguments) and tuple patterns e.g. let x, y = pair

fsharp/src/Compiler/pars.fsy

Line 3300 in d91b6c5

| tuplePatternElements COMMA headBindingPattern

https://github.com/dotnet/fsharp/blob/main/src/Compiler/pars.fsy#L3256

auduchinok · 2022-06-23T10:25:21Z

@dsyme Thanks for suggesting to test it on tuples of size 3! It uncovered that the initial implementation didn't work properly on them and produced inner tuple expressions at recovery points. The only way I found to recognize that the parser recovered from such a situation was to check the result in the parent rule. What do you think about this approach?

dsyme · 2022-06-23T11:16:46Z

I'd be happy to take this as is, but while you're in context I'm wondering if it's worth adding tests for "(1," and other incomplete tuples?

If they don't pass it's OK, we can still accept, it would just be good to get these cases pinned down if they pass.

dsyme · 2022-06-23T11:17:13Z

Code needs formatting :)

auduchinok · 2022-06-23T11:21:11Z

I'd be happy to take this as is, but while you're in context I'm wondering if it's worth adding tests for "(1," and other incomplete tuples?

You mean incomplete tuples inside incomplete paren expressions specifically? The incomplete tuple cases are already present in this PR.

* add an extra line break in a line that has less than 120 chars * add an extra space after an capitalized pattern

auduchinok · 2022-06-23T11:35:07Z

@dsyme Done.

dsyme · 2022-06-23T21:34:18Z

@auduchinok I meant the same systematic test matrix but each without the closing parens please :) May as well get it all pinned down.

auduchinok · 2022-06-24T15:12:47Z

@dsyme I think unfinished parens aren't that interesting: IDEs insert the closing parens automatically. A much more interesting thing to look into is more cases with unfinished tuples (i.e. where the last element is missing) but I'd rather work on it separately, keeping this PR smaller and focused on missing elements before commas. After working on that it'd be much more interesting to pin more parens examples.

auduchinok · 2022-06-24T21:34:15Z

OK, missing last items are also recovered now, let's see if anything fails.

dsyme · 2022-06-28T01:35:11Z

@dsyme I think unfinished parens aren't that interesting: IDEs insert the closing parens automatically.

I understand, but regardless I feel we should pin down any positive recovery we have for these, so we don't regress in the future. It's common enough that people delete a )

I will merge this - if you do get a chance to pin down these cases I'd appreciate it, thanks

dsyme · 2022-06-28T01:35:42Z

Great work btw!

Parser: recover on missing first item in tuple expression

7c57789

auduchinok force-pushed the parser-tupleExpr branch from 0109f5d to e62f004 Compare June 23, 2022 10:21

auduchinok force-pushed the parser-tupleExpr branch from e62f004 to 4b78bdc Compare June 23, 2022 10:29

auduchinok added 2 commits June 23, 2022 13:02

Fix longer tuples, add test

2177e8d

Add another test

16d7d58

auduchinok force-pushed the parser-tupleExpr branch from 94b8454 to 16d7d58 Compare June 23, 2022 11:02

auduchinok added 2 commits June 23, 2022 13:33

Fix perfect formatting

8b25d3f

* add an extra line break in a line that has less than 120 chars * add an extra space after an capitalized pattern

Add another test

b7122c6

auduchinok changed the title ~~Parser: recover on missing first item in tuple expression~~ Parser: recover on missing items in tuple expression Jun 23, 2022

vzarytovskii added Area-Compiler-Syntax lexfilter, indentation and parsing Feature Improvement labels Jun 23, 2022

vzarytovskii added this to the Backlog milestone Jun 23, 2022

Another test

5650f08

Recover on missing last item

dab9fb3

dsyme approved these changes Jun 28, 2022

View reviewed changes

dsyme merged commit a7617f9 into dotnet:main Jun 28, 2022

auduchinok deleted the parser-tupleExpr branch September 23, 2022 13:04

auduchinok mentioned this pull request Sep 26, 2022

Parser: recover on missing items in tuple patterns #13985

Merged

Parser: recover on missing items in tuple expression #13352

Parser: recover on missing items in tuple expression #13352

Uh oh!

Conversation

auduchinok commented Jun 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dsyme commented Jun 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

auduchinok commented Jun 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dsyme commented Jun 22, 2022

Uh oh!

dsyme commented Jun 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

auduchinok commented Jun 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dsyme commented Jun 22, 2022

Uh oh!

auduchinok commented Jun 22, 2022

Uh oh!

dsyme commented Jun 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

auduchinok commented Jun 23, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dsyme commented Jun 23, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dsyme commented Jun 23, 2022

Uh oh!

auduchinok commented Jun 23, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

auduchinok commented Jun 23, 2022

Uh oh!

dsyme commented Jun 23, 2022

Uh oh!

auduchinok commented Jun 24, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

auduchinok commented Jun 24, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dsyme commented Jun 28, 2022

Uh oh!

dsyme commented Jun 28, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

auduchinok commented Jun 22, 2022 •

edited

Loading

dsyme commented Jun 22, 2022 •

edited

Loading

auduchinok commented Jun 22, 2022 •

edited

Loading

dsyme commented Jun 22, 2022 •

edited

Loading

auduchinok commented Jun 22, 2022 •

edited

Loading

dsyme commented Jun 22, 2022 •

edited

Loading

auduchinok commented Jun 23, 2022 •

edited

Loading

dsyme commented Jun 23, 2022 •

edited

Loading

auduchinok commented Jun 23, 2022 •

edited

Loading

auduchinok commented Jun 24, 2022 •

edited

Loading

auduchinok commented Jun 24, 2022 •

edited

Loading