From dc8f32d43bb7952a625677b38c4126179b1c3f3e Mon Sep 17 00:00:00 2001 From: Christoph Knittel Date: Tue, 30 Sep 2025 09:31:16 +0200 Subject: [PATCH 1/4] [RFC] Early return --- docs/early-return-design.md | 76 +++++++++++++++++++++++++++++++++++++ 1 file changed, 76 insertions(+) create mode 100644 docs/early-return-design.md diff --git a/docs/early-return-design.md b/docs/early-return-design.md new file mode 100644 index 0000000000..8a60498fdf --- /dev/null +++ b/docs/early-return-design.md @@ -0,0 +1,76 @@ +# Design: Early Return in Functions + +## Context & Motivation +Developers regularly ask for a lightweight way to exit a function before its final expression. Today they must emulate early exits using nested conditionals, exceptions, or helper functions, which obscures intent and bloats JavaScript output. Supporting a first-class `return` keyword improves readability, enables more idiomatic interop with JavaScript, and narrows the ergonomics gap with other languages while preserving ReScript's expression-oriented style. + +## Goals +- Introduce a `return` expression that exits the innermost function, optionally carrying a value (`return expr` or `return;`). +- Type-check `return` so that subsequent code is treated as unreachable, avoiding spurious exhaustiveness warnings. +- Emit direct JavaScript `return` statements to make async and `try` interactions behave exactly like plain JS. +- Preserve backward compatibility for existing code that does not use `return`. + +## Non-goals +- Adding multi-value returns or early exit for non-function constructs (loops, switches without functions, etc.). +- Introducing new runtime constructs beyond the emitted JavaScript `return`. +- Changing module-level or top-level behaviour; `return` remains illegal outside function bodies. + +## Semantics Overview +- `return` is an expression with the bottom-like type `never`. The payload, when present, must unify with the enclosing function's declared result. 
+- `return` targets only the innermost function scope, including anonymous functions and closures. +- `return;` is syntactic sugar for `return ();` but keeps type `never` so the function's result type must be `unit`. +- Once a `return` is evaluated, control flow stops at that point; subsequent expressions in the same block are unreachable. + +## Syntax Layer Changes (`compiler/syntax/`) +- Extend the grammar in `parser.mly` to parse `return` as a `simple_expr` with an optional trailing expression (`return`, `return expr`). +- Add `Pexp_return of expression option` to `parsetree.ml`, and update related helpers (`ast_iterator`, printers, etc.). +- Mirror the changes in `ast_mapper_from0.ml` and `ast_mapper_to0.ml` to maintain compatibility with `parsetree0.ml` (which must stay frozen). +- Update syntax error recovery to produce messages such as “`return` is only allowed inside function bodies" when seen in invalid positions. +- Add new parser fixtures under `tests/syntax_tests/` (positive and negative cases). + +## Typed Tree & Type Checking (`compiler/ml/`) +- Introduce `Texp_return` in `Typedtree.expression_desc` and a corresponding record in `Typedtree_helper` utilities. +- Extend `typecore.ml` to: + - Ensure we are inside a function context (reusing or extending `env.in_function`). + - Type-check the optional payload against the enclosing function's result type. + - Assign the new `never` type to the expression so downstream phases treat it as non-returning. + - Register an error when used outside functions or when the payload type mismatches. +- Introduce a dedicated bottom type constructor (`never`) if one does not already exist: + - Extend `Types.type_desc` and helpers in `btype.ml` / `ctype.ml` with `Tnever` (or similar) plus `Predef` registration, including printer support in `printtyp.ml`. + - Update utility predicates (`Types.maybe_bottom`, dead-code checks) to understand the new type. +- Ensure exhaustiveness and dead code analysis (e.g. 
`parmatch.ml`, `clflags.warn_error`) treat `never` as non-fallthrough so we avoid double warnings. +- Update typed tree iterators and printers (`TypedtreeIter`, `Printtyped`) to handle `Texp_return`. + +## Lambda IR Translation (`compiler/core/`) +- Extend `lam.ml` with an `Lreturn of lambda option` constructor (or reuse existing exit nodes if we can adapt them). +- Modify `translcore.ml` (and related helpers) to translate `Texp_return` into the new lambda form, marking the generated continuation as finished. +- Adjust passes that manipulate control flow: + - Ensure `lam_pass_exits`, `lam_dce`, and similar optimizations treat `Lreturn` as terminating. + - Update `lam_print.ml` and analysis utilities to print and traverse the new node. + +## JavaScript Backend (`compiler/core/js_*`) +- Update JS lowering (`lam_compile.ml`, `js_output.ml`) so lambda outputs marked as “finished” get converted to `return_stmt payload` and no additional implicit return is appended. +- Ensure `switch`/`if` lowering avoids emitting duplicate `return` statements when a branch already ends with `return`. This likely relies on `output_finished = True` plumbing already used by `throw` and existing returns. +- Adjust `js_stmt_make` / `js_exp_make` to expose helper constructors where needed, and audit passes like `js_pass_flatten.ml` to respect terminating statements. +- Validate async helpers and promise sugar to confirm the generated functions contain direct `return` statements, ensuring semantics match JavaScript. + +## Tooling & Diagnostics +- Update AST printers (`pprintast.ml`, `js_dump_*`) to display `return` expressions. +- Extend the language server (`analysis/`) to surface the new node in hover/type info and to provide quick-fix diagnostics. +- Document the feature in `docs/Syntax.md`, including examples and restrictions. + +## Migration & Compatibility +- Existing code continues to compile; no change to default behaviour. 
+- PPX compatibility: because `parsetree0.ml` remains frozen, PPXs continue to receive the v0 AST without `return`. We maintain compatibility by mapping `Pexp_return` to/from the v0 representation through `ast_mapper_from0` / `ast_mapper_to0`. +- JavaScript output remains stable aside from functions that now contain explicit `return` statements when developers opt in to the new feature. + +## Testing Strategy +- **Syntax tests**: new fixtures for valid/invalid `return` usages, nested functions, and top-level errors. +- **Typechecker tests** (`tests/ounit_tests/` or similar): ensure payload type mismatches raise errors, unreachable code warnings are produced, and nested function scoping works. +- **Lambda / JS IR tests**: add golden-print tests verifying `Lreturn` in `lam_print` and generated JS blocks for representative cases (`if`, `switch`, `try/finally`, async wrappers). +- **Integration tests** (`tests/build_tests/`): demonstrate runtime behaviour, including interaction with promise helpers and exceptions. + +## Open Questions & Follow-ups +- Does the compiler already expose a notion of bottom in other phases? If so, integrate rather than re-invent; otherwise, ensure `never` is threaded consistently (e.g. into `Predef` and external tooling). +- Should `return` be allowed inside `fun { } =>` pipelines or only as a standalone statement expression? Current proposal allows it anywhere expressions are permitted, but we should validate editor ergonomics and readability. +- Consider introducing linting guidance to discourage overuse in expression-heavy code while still allowing pragmatic escapes. 
+ From 79e20cb41f685347c10241538bd7a7a88fe52406 Mon Sep 17 00:00:00 2001 From: Christoph Knittel Date: Tue, 30 Sep 2025 13:53:45 +0200 Subject: [PATCH 2/4] Feedback --- docs/early-return-design.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/docs/early-return-design.md b/docs/early-return-design.md index 8a60498fdf..8ccba41a48 100644 --- a/docs/early-return-design.md +++ b/docs/early-return-design.md @@ -4,7 +4,7 @@ Developers regularly ask for a lightweight way to exit a function before its final expression. Today they must emulate early exits using nested conditionals, exceptions, or helper functions, which obscures intent and bloats JavaScript output. Supporting a first-class `return` keyword improves readability, enables more idiomatic interop with JavaScript, and narrows the ergonomics gap with other languages while preserving ReScript's expression-oriented style. ## Goals -- Introduce a `return` expression that exits the innermost function, optionally carrying a value (`return expr` or `return;`). +- Introduce a `return` expression that exits the innermost function, optionally carrying a value (`return expr` or bare `return`). - Type-check `return` so that subsequent code is treated as unreachable, avoiding spurious exhaustiveness warnings. - Emit direct JavaScript `return` statements to make async and `try` interactions behave exactly like plain JS. - Preserve backward compatibility for existing code that does not use `return`. @@ -17,11 +17,11 @@ Developers regularly ask for a lightweight way to exit a function before its fin ## Semantics Overview - `return` is an expression with the bottom-like type `never`. The payload, when present, must unify with the enclosing function's declared result. - `return` targets only the innermost function scope, including anonymous functions and closures. -- `return;` is syntactic sugar for `return ();` but keeps type `never` so the function's result type must be `unit`. 
+- A bare `return` is sugar for returning `unit`, still typed as `never`. - Once a `return` is evaluated, control flow stops at that point; subsequent expressions in the same block are unreachable. ## Syntax Layer Changes (`compiler/syntax/`) -- Extend the grammar in `parser.mly` to parse `return` as a `simple_expr` with an optional trailing expression (`return`, `return expr`). +- Extend the grammar handled in `compiler/syntax/src/res_parser.ml` (and related helpers such as `res_grammar.ml`) to parse `return` as an expression with an optional trailing payload (`return` or `return expr`). - Add `Pexp_return of expression option` to `parsetree.ml`, and update related helpers (`ast_iterator`, printers, etc.). - Mirror the changes in `ast_mapper_from0.ml` and `ast_mapper_to0.ml` to maintain compatibility with `parsetree0.ml` (which must stay frozen). - Update syntax error recovery to produce messages such as “`return` is only allowed inside function bodies" when seen in invalid positions. @@ -71,6 +71,5 @@ Developers regularly ask for a lightweight way to exit a function before its fin ## Open Questions & Follow-ups - Does the compiler already expose a notion of bottom in other phases? If so, integrate rather than re-invent; otherwise, ensure `never` is threaded consistently (e.g. into `Predef` and external tooling). -- Should `return` be allowed inside `fun { } =>` pipelines or only as a standalone statement expression? Current proposal allows it anywhere expressions are permitted, but we should validate editor ergonomics and readability. +- Validate how `return` reads inside pipeline-heavy expressions; current proposal allows it everywhere, but we should document guidance if certain patterns feel awkward. - Consider introducing linting guidance to discourage overuse in expression-heavy code while still allowing pragmatic escapes. 
- From 65e9a595d028afdff6616f42ffa3e5830fa04c1a Mon Sep 17 00:00:00 2001 From: Christoph Knittel Date: Wed, 1 Oct 2025 16:25:32 +0200 Subject: [PATCH 3/4] Some updates after Codex investigating unreachable code detection in more detail --- docs/early-return-design.md | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/docs/early-return-design.md b/docs/early-return-design.md index 8ccba41a48..a59bb0596a 100644 --- a/docs/early-return-design.md +++ b/docs/early-return-design.md @@ -28,16 +28,15 @@ Developers regularly ask for a lightweight way to exit a function before its fin - Add new parser fixtures under `tests/syntax_tests/` (positive and negative cases). ## Typed Tree & Type Checking (`compiler/ml/`) -- Introduce `Texp_return` in `Typedtree.expression_desc` and a corresponding record in `Typedtree_helper` utilities. +- Introduce `Texp_return` in `Typedtree.expression_desc` (update `compiler/ml/typedtree.mli` and `typedtree.ml`) and thread it through the existing iterators/printers. - Extend `typecore.ml` to: - - Ensure we are inside a function context (reusing or extending `env.in_function`). + - Reject uses outside functions by reusing the existing optional `in_function` plumbing that `type_function` already threads through. - Type-check the optional payload against the enclosing function's result type. - - Assign the new `never` type to the expression so downstream phases treat it as non-returning. - - Register an error when used outside functions or when the payload type mismatches. -- Introduce a dedicated bottom type constructor (`never`) if one does not already exist: - - Extend `Types.type_desc` and helpers in `btype.ml` / `ctype.ml` with `Tnever` (or similar) plus `Predef` registration, including printer support in `printtyp.ml`. - - Update utility predicates (`Types.maybe_bottom`, dead-code checks) to understand the new type. -- Ensure exhaustiveness and dead code analysis (e.g. 
`parmatch.ml`, `clflags.warn_error`) treat `never` as non-fallthrough so we avoid double warnings. + - Populate the new node with a freshly created type variable (mirroring how `%raise` is typed today) so downstream phases treat it as non-returning without introducing a bespoke primitive type. + - Emit appropriate errors on context or payload mismatches. + - Keep `type_statement` warning behaviour intact so `return` inherits the existing `Warnings.Nonreturning_statement` flow (`compiler/ml/typecore.ml:3884-3894`). +- If the type-variable approach proves insufficient, adding an explicit bottom constructor would require touching `Types.type_desc` plus `btype.ml`, `ctype.ml`, `predef.ml`, and the printers in `printtyp.ml`, but the current pipeline already models non-returning code via `Tvar`. +- Ensure exhaustiveness and dead code analysis (e.g. `compiler/ml/parmatch.ml`, `compiler/ext/warnings.ml`) treat `return` as non-fallthrough so we avoid double warnings. - Update typed tree iterators and printers (`TypedtreeIter`, `Printtyped`) to handle `Texp_return`. ## Lambda IR Translation (`compiler/core/`) @@ -50,7 +49,7 @@ Developers regularly ask for a lightweight way to exit a function before its fin ## JavaScript Backend (`compiler/core/js_*`) - Update JS lowering (`lam_compile.ml`, `js_output.ml`) so lambda outputs marked as “finished” get converted to `return_stmt payload` and no additional implicit return is appended. - Ensure `switch`/`if` lowering avoids emitting duplicate `return` statements when a branch already ends with `return`. This likely relies on `output_finished = True` plumbing already used by `throw` and existing returns. -- Adjust `js_stmt_make` / `js_exp_make` to expose helper constructors where needed, and audit passes like `js_pass_flatten.ml` to respect terminating statements. 
+- Adjust `js_stmt_make` / `js_exp_make` to expose helper constructors where needed, and audit passes like `js_pass_flatten_and_mark_dead.ml` to respect terminating statements. - Validate async helpers and promise sugar to confirm the generated functions contain direct `return` statements, ensuring semantics match JavaScript. ## Tooling & Diagnostics @@ -69,7 +68,12 @@ Developers regularly ask for a lightweight way to exit a function before its fin - **Lambda / JS IR tests**: add golden-print tests verifying `Lreturn` in `lam_print` and generated JS blocks for representative cases (`if`, `switch`, `try/finally`, async wrappers). - **Integration tests** (`tests/build_tests/`): demonstrate runtime behaviour, including interaction with promise helpers and exceptions. +## Existing Unreachable Code Handling +- **Typechecker warnings**: `type_statement` warns with `Warnings.Nonreturning_statement` whenever an expression typed as a bare `Tvar` is discarded (`compiler/ml/typecore.ml:3884-3894`), which is how `%raise` communicates non-returning behaviour today. +- **Pattern reachability**: `Parmatch.check_unused` emits `Warnings.Unreachable_case` for dead match arms and already runs for every `Texp_match`/`Texp_function` (`compiler/ml/parmatch.ml:2158-2201`). +- **Backend pruning**: `%raise` lowers to `Lprim (Praise, …)` in `translcore` (`compiler/ml/translcore.ml:738-745`). The JS backend recognises that primitive and marks the output as finished (`compiler/core/lam_compile.ml:1540-1560`), and `Js_output.append_output` drops any subsequent statements when `output_finished = True` (`compiler/core/js_output.ml:82-138`). A future `return` node should reuse this plumbing so dead statements are automatically discarded without a new bottom type. + ## Open Questions & Follow-ups -- Does the compiler already expose a notion of bottom in other phases? If so, integrate rather than re-invent; otherwise, ensure `never` is threaded consistently (e.g. 
into `Predef` and external tooling). +- The compiler already models non-returning expressions via fresh type variables plus warning logic (`compiler/ml/typecore.ml:3884-3894`) and by marking backend outputs as finished (`compiler/core/lam_compile.ml:1540-1560`, `compiler/core/js_output.ml:117-138`). Reuse that machinery for `return` before introducing a dedicated `never` constructor. - Validate how `return` reads inside pipeline-heavy expressions; current proposal allows it everywhere, but we should document guidance if certain patterns feel awkward. - Consider introducing linting guidance to discourage overuse in expression-heavy code while still allowing pragmatic escapes. From 14875120d223eebe722841d63f1fa8c389549009 Mon Sep 17 00:00:00 2001 From: Christoph Knittel Date: Thu, 2 Oct 2025 09:10:34 +0200 Subject: [PATCH 4/4] Some updates after Codex investigating prior art --- docs/early-return-design.md | 25 ++++++++++++++++++++++++- 1 file changed, 24 insertions(+), 1 deletion(-) diff --git a/docs/early-return-design.md b/docs/early-return-design.md index a59bb0596a..1414b68a80 100644 --- a/docs/early-return-design.md +++ b/docs/early-return-design.md @@ -73,7 +73,30 @@ Developers regularly ask for a lightweight way to exit a function before its fin - **Pattern reachability**: `Parmatch.check_unused` emits `Warnings.Unreachable_case` for dead match arms and already runs for every `Texp_match`/`Texp_function` (`compiler/ml/parmatch.ml:2158-2201`). - **Backend pruning**: `%raise` lowers to `Lprim (Praise, …)` in `translcore` (`compiler/ml/translcore.ml:738-745`). The JS backend recognises that primitive and marks the output as finished (`compiler/core/lam_compile.ml:1540-1560`), and `Js_output.append_output` drops any subsequent statements when `output_finished = True` (`compiler/core/js_output.ml:82-138`). A future `return` node should reuse this plumbing so dead statements are automatically discarded without a new bottom type. 
+## Prior Art +- **Rust** + - **Syntax**: supports `return expr;` and bare `return;` alongside the idiomatic “last expression wins” rule. + - **Semantics**: the `return` expression has the bottom type `!`, so type inference and control-flow analysis mark all following code as unreachable. The same machinery covers other diverging constructs (`loop {}`, `panic!`), letting borrow checking and MIR optimisations short-circuit safely. + - **Interoperability**: because Rust targets native back-ends, early return is compiled to direct jumps, which shows the pattern fits expression-oriented languages that still value low-level control. +- **Kotlin** + - **Syntax**: `return expr` exits the current function; `return@label expr` returns from a labeled lambda, preserving expression-based APIs such as `run { … }` and collection pipelines. + - **Semantics**: `return` yields the bottom type `Nothing`. Smart casts and exhaustiveness checking treat `Nothing` as terminating, so unreachable code is flagged and type inference remains precise. + - **Diagnostics**: Kotlin’s flow analysis reports unreachable-code warnings for statements that follow a `return`, demonstrating the value of threading bottom types through the checker. +- **Scala** + - **Syntax**: `return expr` returns from the nearest named method; it is legal inside expression bodies but not idiomatic. + - **Semantics**: the return expression has type `Nothing`, Scala’s bottom type. Inside anonymous functions the compiler lowers `return` to throwing `NonLocalReturnControl`, which highlights surprising control flow when the target is not obvious. + - **Lesson**: ReScript should explicitly specify whether `return` is allowed in closures and, if so, how it interacts with captured continuations to avoid Scala’s non-local return pitfalls. +- **Swift** + - **Syntax**: `return expr` is required unless the function consists of a single expression; `guard … else { return }` is a first-class use case. 
+ - **Semantics**: Swift’s `Never` bottom type represents non-returning code. The compiler diagnoses unreachable statements immediately after `return`, and type checking propagates `Never` through `guard`/`switch` constructs. + - **Interop**: Swift targets multiple back-ends (including WebAssembly through SwiftWasm), which suggests early return does not depend on a particular code generator. +- **TypeScript / JavaScript** + - **Syntax**: `return expr;` is a statement. TypeScript adds the bottom type `never` for functions that never return normally (for example, ones that always throw), feeding its exhaustiveness checking and control-flow narrowing. + - **Semantics**: even without expression syntax, TypeScript’s `never` demonstrates the benefit of a bottom type for tooling and editor diagnostics, something ReScript can leverage while preserving JS parity. +- **Rust/Kotlin-style languages vs ML lineage** + - OCaml, Standard ML, Elm, Haskell, and F# avoid early return altogether, relying on expression composition or exceptions. This contrast underlines that adopting `return` aligns ReScript with Rust/Kotlin ergonomics rather than traditional ML style, but also that we can reuse ML-derived analyses if we thread a bottom type carefully. + ## Open Questions & Follow-ups -- The compiler already models non-returning expressions via fresh type variables plus warning logic (`compiler/ml/typecore.ml:3884-3894`) and by marking backend outputs as finished (`compiler/core/lam_compile.ml:1540-1560`, `compiler/core/js_output.ml:117-138`). Reuse that machinery for `return` before introducing a dedicated `never` constructor. +- The compiler already models non-returning expressions via fresh type variables plus warning logic (`compiler/ml/typecore.ml:3884-3894`) and by marking backend outputs as finished (`compiler/core/lam_compile.ml:1540-1560`, `compiler/core/js_output.ml:117-138`). 
Reuse that machinery for `return` before introducing a dedicated `never` constructor—but note that every language we surveyed leans on an explicit bottom type (`!`, `Nothing`, `Never`, `never`) to make control-flow reasoning robust. - Validate how `return` reads inside pipeline-heavy expressions; current proposal allows it everywhere, but we should document guidance if certain patterns feel awkward. - Consider introducing linting guidance to discourage overuse in expression-heavy code while still allowing pragmatic escapes.
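
Review note: the goal of emitting direct JavaScript `return` statements so that `try` interactions behave like plain JS can be illustrated with a small TypeScript sketch. This is not compiler output; the names `findFirstNegative` and `log` are hypothetical and only show the plain-JS behaviour the design wants to match, namely that a `return` inside `try` still runs the `finally` block before the function exits.

```typescript
// Illustrative only: hand-written code, not output of the ReScript compiler.
// `findFirstNegative` and `log` are hypothetical names for this sketch.
const log: string[] = [];

function findFirstNegative(xs: number[]): number | undefined {
  try {
    for (const x of xs) {
      if (x < 0) {
        // Direct early return: leaves the function immediately,
        // but the `finally` block below still runs first.
        return x;
      }
    }
    log.push("no negative found");
    return undefined;
  } finally {
    log.push("cleanup");
  }
}

const first = findFirstNegative([3, -1, 4]);
```

If the backend lowers a ReScript `return` to a JavaScript `return` in this position, the `finally` semantics come for free from the JavaScript runtime, which is the parity the Goals section asks for.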