Optional chaining #137
Hey there, @maxbrunsfeld and team. I'm back with my attempts to add optional chaining to javascript (and typescript). With some trial and error, I managed to pass my tests for property access ( Those are all small changes and I hope we can implement the optional function calls without changing too much, but I'm not sure what to try next. I had a brief look at the official typescript parser but it's huge and handwritten in a somewhat procedural style. My next best thought is to find another javascript parser that may be written in a more declarative style, and try to imitate what they do. I'll take any suggestions.
My colleague @nbrahms added support for |
maxbrunsfeld
left a comment
This is looking good; thank you both for fixing this! I suggested a couple of small changes.
One slightly larger change I'd like to discuss: in member_expression and subscript_expression, the optional operator ?. is just added to the existing rule, instead of adding a new named rule (like optional_member_expression etc). But in the case of call_expression, you've added a separate named opt_arguments node.
If possible (and if it works for your use case), I'd like to handle it the same way across all three cases (member, subscript, and call).
What do you think about removing opt_arguments and just adding ?. as an optional token inside call expression? Or conversely, adding separate optional_* variations of subscript and member expression? I think I would prefer the first option (and it will result in a smaller binary size), but I am open to the second as well.
grammar.js
Outdated
)),

opt_arguments: $ => prec(PREC.CALL, seq(
'?.(',
Can you split the ?. into a separate token? I think whitespace is most likely allowed between ?. and (.
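For reference, a quick check in plain JavaScript (separate from the grammar work) is consistent with this: `?.` is a single punctuator, so an engine that supports optional chaining should accept whitespace between it and the `(` that follows. The names `greet` and `missing` are made up for this sketch.

```javascript
// Illustrative names only; nothing here comes from the PR itself.
const greet = (name) => `hello ${name}`;
const missing = undefined;

// Optional call written without and with whitespace around `?.`
const compact = greet?.("max");
const spaced = greet ?. ("max");

// Calling through an undefined callee short-circuits to undefined
const skipped = missing ?. ("max");

console.log(compact, spaced, skipped);
```

Both spellings should produce the same value, which is why a grammar that only knows a fused `?.(` token would reject valid input.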
I think @mjambon mentioned that but this led to parsing errors when applying it on a concrete file.
Exactly right. Using seq('?.', '(', ...) did not parse.
What specifically was failing to parse? I'm trying this locally, with the following added unit test, and everything seems to parse ok:
============================================
Optional function calls
============================================
a[b]?.(c);
d.e?.(f);
---
(program
(expression_statement
(call_expression
(subscript_expression (identifier) (identifier))
(opt_arguments (identifier))))
(expression_statement
(call_expression
(member_expression (identifier) (property_identifier))
(opt_arguments (identifier)))))
This is my grammar diff:
--- a/grammar.js
+++ b/grammar.js
@@ -685,7 +685,8 @@ module.exports = grammar({
$.super,
alias($._reserved_identifier, $.identifier)
)),
- choice('[', seq('?.', '[')),
+ optional('?.'),
+ '[',
field('index', $._expressions),
']'
)),
@@ -952,7 +953,8 @@ module.exports = grammar({
)),
opt_arguments: $ => prec(PREC.CALL, seq(
- '?.(',
+ '?.',
+ '(',
commaSep(optional(choice($._expression, $.spread_element))),
')'
)),
Using your grammar.js diff, and a test file of:
x?.(y);
x ?. (y);
x?.y;
x?.[0];
new function bob (a, b) {} ?.(1, 2);
I got:
➜ tree-sitter-javascript git:(optional-chains) ✗ npx tree-sitter parse opt.js
(program [0, 0] - [9, 0]
(expression_statement [0, 0] - [0, 7]
(call_expression [0, 0] - [0, 6]
function: (identifier [0, 0] - [0, 1])
(ERROR [0, 1] - [0, 3])
arguments: (arguments [0, 3] - [0, 6]
(identifier [0, 4] - [0, 5]))))
(expression_statement [2, 0] - [2, 9]
(call_expression [2, 0] - [2, 8]
function: (identifier [2, 0] - [2, 1])
(ERROR [2, 2] - [2, 4])
arguments: (arguments [2, 5] - [2, 8]
(identifier [2, 6] - [2, 7]))))
(expression_statement [4, 0] - [4, 5]
(member_expression [4, 0] - [4, 4]
object: (identifier [4, 0] - [4, 1])
property: (property_identifier [4, 3] - [4, 4])))
(expression_statement [6, 0] - [6, 7]
(subscript_expression [6, 0] - [6, 6]
object: (identifier [6, 0] - [6, 1])
index: (number [6, 4] - [6, 5])))
(expression_statement [8, 0] - [8, 36]
(call_expression [8, 0] - [8, 35]
function: (new_expression [8, 0] - [8, 26]
constructor: (function [8, 4] - [8, 26]
name: (identifier [8, 13] - [8, 16])
parameters: (formal_parameters [8, 17] - [8, 23]
(identifier [8, 18] - [8, 19])
(identifier [8, 21] - [8, 22]))
body: (statement_block [8, 24] - [8, 26])))
arguments: (opt_arguments [8, 27] - [8, 35]
(number [8, 30] - [8, 31])
(number [8, 33] - [8, 34])))))
opt.js 0 ms (ERROR [0, 1] - [0, 3])
Note that tree-sitter seems to be preferring the $.arguments rule instead of $.opt_arguments here. I played with various prec settings, but that didn't help me out any.
@maxbrunsfeld now if you run the tests, you should see an ERROR node where the ?. is:
1 failure:
expected / actual
1. Optional function calls:
(program (expression_statement (call_expression (identifier) (ERROR) (arguments (identifier)))) (expression_statement (call_expression (subscript_expression (identifier) (identifier)) (opt_arguments (identifier)))) (expression_statement (call_expression (member_expression (identifier) (property_identifier)) (opt_arguments (identifier)))))
Looks like @mjambon beat me to it, but you can reference semgrep@7633d91 for my test case
(thanks for your help here @maxbrunsfeld , definitely new to tree-sitter here and still learning the ropes!)
grammar.js
Outdated
$.super,
alias($._reserved_identifier, $.identifier)
)),
choice('[', seq('?.', '[')),
I think it would be slightly clearer to write this as:
optional('?.'),
'[',
instead of
choice('[', seq('?.', '[')),
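The two spellings describe the same token sequences, which a small stand-alone sketch can confirm. The `seq`, `choice`, and `optional` helpers below are simplified stand-ins written for this illustration, not tree-sitter's real DSL functions: each one simply returns the list of token sequences a rule can match.

```javascript
// Simplified stand-ins for the grammar DSL (assumption: NOT tree-sitter's
// implementation). A rule is the list of token sequences it accepts.
const token = (rule) => (typeof rule === 'string' ? [[rule]] : rule);

// seq: concatenate one sequence from each member, in order
const seq = (...rules) =>
  rules.map((r) => token(r)).reduce(
    (acc, alts) => acc.flatMap((prefix) => alts.map((alt) => [...prefix, ...alt])),
    [[]]
  );

// choice: accept any sequence of any alternative
const choice = (...rules) => rules.flatMap((r) => token(r));

// optional: either the rule or nothing at all
const optional = (rule) => [...token(rule), []];

// Render a rule as a sorted list of strings so two rules can be compared
const language = (rule) => token(rule).map((s) => s.join(' ')).sort();

const original = language(choice('[', seq('?.', '[')));
const suggested = language(seq(optional('?.'), '['));

console.log(original);
console.log(suggested);
```

Under these stand-in definitions both rules accept exactly `[` and `?. [`, so the suggestion is about readability rather than about what gets parsed.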
Yeah, this ended up requiring a somewhat larger change than I expected. The problem was some complexity that we added in #89, to deal with the inherent ambiguity between My change (a PR targeted at this branch) is at semgrep#1.
Split `?.` and `(` into separate tokens by removing `new_expression` complexity
Fantastic. Thx a lot, Max. Is there any chance we can get a glimpse of the debugging process you used to fix this issue? How did you get the intuition that the problem was related to #89, and why all those seemingly unrelated changes?
Yes, absolutely. This is a good example to share, because it was a bit tricky for me to get it working. I've pushed a branch to illustrate my thought process step by step: https://github.com/tree-sitter/tree-sitter-javascript/commits/optional-chain-debugging-process. In each commit, I have a message that describes the problem. I'll reiterate some of this explanation here, discussing each commit in sequence. For some commits, I'll include screenshots of the "debug graphs" that
@maxbrunsfeld This is awesome! Thank you for providing so much detail.
Thank you @maxbrunsfeld, this is the best GitHub comment I have ever seen! I believe it will boost our abilities a lot.
javascript. See tree-sitter/tree-sitter-javascript#137 (comment) for full details. Some conflicts remain and should be solved in a later commit, since they look different than the issue we had with javascript.
This adds partial (incorrect) support for optional chaining as discussed in #134.
It kind of works because I specified creative tokens `?.[` and `?.(` in the grammar, and the test cases don't have whitespace after the `?.`. The problem is that if I specify a proper `?.` token in front of the `[` of a subscript (or in front of the `(` of call arguments), I obtain an ERROR node in the CST in place of the `?.`. I don't know what to touch in terms of precedence or other tricks to make this work. I'll resume tomorrow.
The current implementation also allows some optional chains as left-hand side expressions, which is illegal, but I know how to fix it (requires creating separate rules optional_{member|subscript|call}, as already suggested by @maxbrunsfeld in #134).
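The left-hand side restriction can be observed in a JavaScript engine directly. The sketch below is only a sanity check, not part of the grammar: it uses the `Function` constructor to compile small snippets and reports whether the engine accepts them. The helper name `parses` is made up here.

```javascript
// Hypothetical helper: true if the engine can compile `source` as a
// function body, false if compiling raises a SyntaxError.
function parses(source) {
  try {
    new Function(source); // compiles only; the body is never executed
    return true;
  } catch (err) {
    if (err && err.name === 'SyntaxError') return false;
    throw err;
  }
}

const results = {
  plainAssignment: parses('a.b = 1;'),        // ordinary member assignment
  optionalRead: parses('a?.b;'),              // optional chain as an expression
  optionalAssignment: parses('a?.b = 1;'),    // optional chain as assignment target
  optionalSubscriptAssignment: parses('a?.[0] = 1;'),
};

console.log(results);
```

On an engine that supports optional chaining, the first two snippets should compile and the last two should be rejected, which is the distinction a grammar with separate optional_* rules would be able to express.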