@mlechu mlechu commented Nov 18, 2025

I set out to fix a stdlib compilation failure c42f/JuliaLowering.jl#98, but ran
into a limitation in the JuliaSyntax AST. Instead of hacking in a fix, this PR
attempts to address c42f/JuliaLowering.jl#77 too.

kw

The kw form isn't produced in SyntaxNode/SyntaxTree like it is in Expr,
which simplifies the parser but causes problems once we start caring
about the semantics of these trees. Parsing "f(a=1)" will always produce
Expr(:call, :f, Expr(:kw, :a, 1)), but the equivalent SyntaxNode
representation (call f (= a 1)) makes Expr(:call, :f, Expr(:(=), :a, 1))
unrepresentable, even though that's valid syntax a macro could produce.
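
To make the overlap concrete, here is a minimal sketch that uses only Base (`Meta.parse` and `Expr`), not JuliaSyntax or JuliaLowering internals. The variable names `parsed` and `handmade` are just for illustration.

```julia
# What the parser produces for a keyword argument in a call.
parsed = Meta.parse("f(a=1)")

# A tree a macro could build by hand: a call whose argument is a real assignment.
handmade = Expr(:call, :f, Expr(:(=), :a, 1))

println(parsed.args[2].head)    # head of the parsed argument
println(handmade.args[2].head)  # head of the hand-built argument
println(parsed == handmade)     # whether the two trees compare equal
```

In Expr these are two different trees. In a representation where the parsed form is already (call f (= a 1)), there is no second spelling left for the hand-built one.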

To fix this, we need to convert = to kw within certain forms before
macro expansion.
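
As a rough sketch of what such a conversion could look like, the following uses a toy tree type rather than the real SyntaxNode/SyntaxTree. `Node`, `node`, `KW_PARENTS`, `eq_to_kw` and `sexpr` are all hypothetical names, and the set of parent forms is an assumption chosen for illustration, not the list this PR uses.

```julia
# Toy stand-in for a syntax tree node; NOT the real SyntaxNode/SyntaxTree type.
struct Node
    kind::Symbol
    children::Vector{Any}   # child Nodes, or leaves such as Symbols and literals
end
node(kind::Symbol, children...) = Node(kind, Any[children...])

# Assumed set of forms whose direct `=` children are read as keyword arguments.
const KW_PARENTS = (:call, :dotcall, :parameters)

function eq_to_kw(x)
    x isa Node || return x                      # leaves are returned unchanged
    children = Any[eq_to_kw(c) for c in x.children]
    if x.kind in KW_PARENTS
        # Only direct children are rewritten; a nested `=` deeper down is kept.
        children = Any[(c isa Node && c.kind === :(=)) ? Node(:kw, c.children) : c
                       for c in children]
    end
    return Node(x.kind, children)
end

# S-expression printer, mimicking the notation used in this description.
sexpr(x) = string(x)
sexpr(x::Node) = "(" * join([string(x.kind); map(sexpr, x.children)], " ") * ")"

tree = node(:call, :f, node(:(=), :a, 1), node(:parameters, node(:(=), :b, 2)))
println(sexpr(tree))
println(sexpr(eq_to_kw(tree)))
```

Running the pass before macro expansion means any `=` a macro later places in a call can keep its assignment meaning, because keyword arguments are already marked as kw by then.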

One problem

There is currently no good place to put this conversion. Here's the path
from source to lowering:

    (source text)
         |
       1 |
         v
[JS] RawGreenNode
         |      \ 5a
      2* |       \
         v        ----> Expr
[JS] SyntaxNode  /      /
         |      / 5b   /
      3* |     /      /
         v    /      /
[JL] SyntaxTree0 <--- 6
         |
       4 |
         v
[JL] SyntaxTree1
         |
         |
         v
 (desugaring and beyond)

*(2) and (3) are pretty trivial, and data structure names are used as
shorthand for what actually matters to this issue (the tree represented by the
data structure).

  1. Parsing, which is the most complex of these steps.
  2. JS._to_SyntaxNode, which just deletes trivia.
  3. JL._convert_nodes, which is more or less a one-to-one conversion, and
    which I think is an artifact of JuliaSyntax and JuliaLowering being in
    separate repositories. @c42f has talked about wanting to replace
    SyntaxNode with SyntaxTree eventually.
  4. Random fixups in JuliaLowering's macro expansion step to make desugaring
    easier.
  5. JS._node_to_expr, which is used for femtolisp lowering (5a) and for
    JuliaLowering's compatibility with existing macros (5b).
  6. JL.expr_to_syntaxtree, which is also used for Expr-macro compatibility.

Expr gets the chance to swap out = within (call (parameters ...)) in
_node_to_expr, but to change SyntaxTree we can only change parsing. (The
converse also holds: if someone wants to change what the parser emits as
RawGreenNode, JuliaLowering will have to eat it.)

This PR implements the following instead, where _green_to_ast happens at (2)
and may fix up the AST before lowering. Ideally this means more room for parser
cleanup and eventually eliminating our "pre-desugaring" mixed in with macro
expansion.

(source text)
     |
   1 |
     v        3
RawGreenNode ---> Expr
     |
   2 |   4 -----> Expr
     v    /       /
SyntaxTree0 <----- 5
     |
     |      <- TODO delete
     v
SyntaxTree1
     |
     |
     v
(desugaring and beyond)

The catch is that if we want SyntaxTree to be different from RawGreenNode, we do
need to convert them to Expr differently. There are a few things we could do:

  • Separate the RawGreenNode->Expr and SyntaxTree->Expr transformations
    (pictured, and implemented).
  • Make RawGreenNode->SyntaxTree a necessary prerequisite for creating an Expr.
    This would add a step to our current femtolisp-lowered parsing (why)
  • Give up on making SyntaxTree structurally different from Expr (I wouldn't
    mind this) and try to feed them through the same generic code (not so sure
    about that).

For now I've done the lazy thing by using a boolean parameter to implement the
first option.
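
The idea of one converter with a boolean switch can be sketched as follows. This is a toy under stated assumptions, not the code in this PR: `Node`, `node`, `to_expr` and the `fix_kw` flag are hypothetical names, and the toy ignores details such as where Expr places a parameters block.

```julia
# Toy stand-in for a tree node; NOT the real RawGreenNode or SyntaxTree type.
struct Node
    kind::Symbol
    children::Vector{Any}
end
node(kind::Symbol, children...) = Node(kind, Any[children...])

# Leaves (Symbols, literals) convert to themselves.
to_expr(x, fix_kw::Bool, parent::Symbol=:none) = x

# fix_kw = true  : input still spells keyword arguments as `=` (green-tree-like),
#                  so `=` directly under a call or parameters block becomes `kw`.
# fix_kw = false : input already carries explicit `kw` nodes (SyntaxTree-like),
#                  so every kind is copied through untouched.
function to_expr(x::Node, fix_kw::Bool, parent::Symbol=:none)
    head = x.kind
    if fix_kw && head === :(=) && parent in (:call, :parameters)
        head = :kw
    end
    return Expr(head, Any[to_expr(c, fix_kw, x.kind) for c in x.children]...)
end

green_like = node(:call, :f, node(:(=), :a, 1))   # `=` means keyword argument
tree_like  = node(:call, :f, node(:kw, :a, 1))    # keyword argument is explicit
macro_made = node(:call, :f, node(:(=), :a, 1))   # `=` means real assignment

println(to_expr(green_like, true))
println(to_expr(tree_like, false))
println(to_expr(macro_made, false).args[2].head)
```

The first two calls yield the same Expr from differently shaped inputs, while the third keeps a real assignment intact. A flag like this is quick to add, at the cost of two behaviours living in one function.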

I also took this opportunity to put the green tree into the syntax graph. This
doesn't really change anything given how little we look at the green tree, but
wasn't difficult, and it's a neat representation.

kw notes

This change uses K"kw" in SyntaxTree wherever a textual = behaves more like
a =>, or, where = is a parse error, wherever it would be more consistent to
treat it like =>.

Note that this is slightly different from Expr. However, each new place we
produce kw is either a place where = already had kw semantics (only tuple) or
an invalid location for =, so everything is still representable. This should
fix the overlap problem, and could allow us to permit real assignment in each
kw cell if we really wanted that.

Kind       Source          Expr                 SyntaxNode (proposed)
call*      x(a=1,;b=1)     (kw a 1) (kw b 1)    (kw a 1) (kw b 1)
dotcall    x.(a=1,;b=1)    (kw a 1) (kw b 1)    (kw a 1) (kw b 1)
ref        x[a=1,;b=1]     (kw a 1) (= b 1)     (kw a 1) (kw b 1)
curly      x{a=1,;b=1}     (= a 1) (= b 1)      (kw a 1) (kw b 1)
tuple      (a=1,;b=1)      (= a 1) (kw b 1)     (kw a 1) (kw b 1)
vect       [a=1,;b=1]      (= a 1) (= b 1)      (= a 1) (kw b 1)
braces     {a=1,;b=1}      (= a 1) (= b 1)      (= a 1) (kw b 1)
macrocall  @x(a=1,;b=1)    (= a 1) (kw b 1)     (= a 1) (kw b 1)
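
The Expr column can be spot-checked with Base alone. The sketch below uses a hypothetical helper, `assignment_heads`, and simplified sources that split the positional and `;` parts into separate strings instead of the combined `a=1,;b=1` spelling from the table. It only inspects Expr, so it says nothing about the proposed column.

```julia
# Collect `name => head` for every `=`/`kw` node directly inside `ex`,
# looking one level into a `parameters` block (the part after `;`).
function assignment_heads(ex)
    found = Pair{Symbol,Symbol}[]
    ex isa Expr || return found
    for arg in ex.args
        arg isa Expr || continue
        if arg.head === :parameters
            append!(found, assignment_heads(arg))
        elseif arg.head in (:kw, :(=)) && arg.args[1] isa Symbol
            push!(found, arg.args[1] => arg.head)
        end
    end
    return found
end

# Print which head each `name=value` cell gets in Expr for a few forms.
for src in ["x(a=1; b=1)", "x[a=1]", "x{a=1}", "(a=1,)", "(; b=1)", "[a=1]"]
    println(rpad(src, 14), assignment_heads(Meta.parse(src)))
end
```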

@mlechu mlechu requested review from Keno and c42f November 18, 2025 00:04