
Conversation

@nathanmarks
Contributor

@josephsavona this was briefly discussed in an old thread, lmk your thoughts on the approach. I have some fixes ready as well but wanted to get this test case in first... there are some things I don't love about this approach, but at the end of the day it's just a tool for the test suite rather than something for end users, so even if it only does a 70% good-enough job, that's fine.

refresher on the problem

When we generate coverage reports with jest (istanbul), our coverage ends up completely out of whack because the generated AST is missing a ton of (let's call them "important") source locations after the compiler pipeline has run.

At the moment, to get around this, we've been doing something a bit unorthodox: also running our test suite with istanbul instrumenting the code before the compiler -- which results in its own set of issues (e.g. things being memoized differently, or the compiler completely bailing out on the instrumented code).

Before getting into fixes, I wanted to set up a test case to start chipping away at, as you had recommended.

how it works

The validator basically:

  1. Traverses the original AST and collects the source locations for some "important" node types
    • (excludes useMemo/useCallback calls, as those are stripped out by the compiler)
  2. Traverses the generated AST and looks for nodes with matching source locations.
  3. Generates errors for source locations that have no matching node in the generated AST (see the sketch below)
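
To make the flow concrete, here is a rough sketch of that three-step validation, assuming @babel/traverse and @babel/types. The helper names and the specific set of "important" node types below are illustrative assumptions, not the actual implementation in this PR:

```ts
import traverse from '@babel/traverse';
import * as t from '@babel/types';

// Illustrative only: the real validator picks its own set of "important"
// node types and also skips useMemo/useCallback calls (omitted here).
const IMPORTANT_TYPES = new Set(['ReturnStatement', 'IfStatement', 'CallExpression']);

function locKey(loc: t.SourceLocation): string {
  return `${loc.start.line}:${loc.start.column}-${loc.end.line}:${loc.end.column}`;
}

// Step 1: walk the original AST and record locations of "important" nodes.
function collectImportantLocations(ast: t.File): Map<string, string> {
  const locations = new Map<string, string>(); // locKey -> node type
  traverse(ast, {
    enter(path) {
      if (IMPORTANT_TYPES.has(path.node.type) && path.node.loc != null) {
        locations.set(locKey(path.node.loc), path.node.type);
      }
    },
  });
  return locations;
}

// Steps 2 and 3: walk the generated AST, record which locations survived,
// and report every original location that no generated node carries.
function validateLocations(original: t.File, generated: t.File): Array<string> {
  const expected = collectImportantLocations(original);
  const seen = new Set<string>();
  traverse(generated, {
    enter(path) {
      if (path.node.loc != null) {
        seen.add(locKey(path.node.loc));
      }
    },
  });
  const errors: Array<string> = [];
  for (const [key, type] of expected) {
    if (!seen.has(key)) {
      errors.push(`Missing ${type} source location at ${key} in generated AST`);
    }
  }
  return errors;
}
```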

caveats/drawbacks

There are some things that don't work super well with this approach. A more natural fit, I think, would be explicit assertions against an AST in a test file, since you can bake in all of the assumptions/nuance that are difficult to handle in a generic manner. However, this is maybe "good enough" for now.

  1. You have to be careful what you put into the test fixture. If you put in code that the compiler simply removes (e.g. an unused variable assignment), you're creating a failure case that's impossible to fix. I added a skip for useMemo/useCallback.
  2. "Important" locations must exactly match for validation to pass.
    • It might get tricky to make sure things are mapped correctly when a node type changes completely, e.g. when a block-statement arrow function body gets turned into an implicit return, with the body becoming just an expression/identifier (illustrated just after this list).
    • This could result in scenarios where more changes are needed to shuttle the locations through, since HIR doesn't have a 1:1 mapping to all the Babel nuances, even if some combination of other data might be good enough without being 100% accurate. That might be the right thing anyway, so we don't end up with edge cases carrying incorrect source locations.
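
To illustrate the node-type change from the first bullet (a hand-written example, not compiler output): once a block-bodied arrow function is collapsed to an implicit return, there is no BlockStatement or ReturnStatement left in the output to carry the original locations:

```ts
// Original source: the BlockStatement and ReturnStatement each carry a loc.
const double = (x: number) => {
  return x * 2;
};

// After a transform that collapses the body to an implicit return, only the
// expression node remains, so the BlockStatement/ReturnStatement locations
// have nothing to map onto unless they are shuttled through explicitly.
const doubleCollapsed = (x: number) => x * 2;
```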

@meta-cla meta-cla bot added the CLA Signed label Nov 11, 2025
if (t.isIdentifier(callee)) {
  return callee.name === 'useMemo' || callee.name === 'useCallback';
}
if (t.isMemberExpression(callee) && t.isIdentifier(callee.property)) {
let's check that the callee is 'React'
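
A sketch of what that check could look like, assuming the member-expression branch is meant to catch React.useMemo / React.useCallback (the exact shape of the real check may differ):

```ts
if (t.isMemberExpression(callee) && t.isIdentifier(callee.property)) {
  // Only treat React.useMemo / React.useCallback as memoization calls:
  // require the object to be the `React` identifier, as suggested above.
  return (
    t.isIdentifier(callee.object) &&
    callee.object.name === 'React' &&
    (callee.property.name === 'useMemo' || callee.property.name === 'useCallback')
  );
}
```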


// Use Babel's VISITOR_KEYS to traverse only actual node properties
const keys = t.VISITOR_KEYS[node.type as keyof typeof t.VISITOR_KEYS];
if (!keys) return;

if (keys == null) {
  return
}
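
For context, this is the kind of manual walk VISITOR_KEYS enables (a minimal sketch, not the PR's actual traversal code):

```ts
import * as t from '@babel/types';

// Minimal recursive walk driven by VISITOR_KEYS: for each node, only the
// properties Babel itself considers child-node slots are visited, which
// skips metadata such as `loc` or attached comments.
function walk(node: t.Node, visit: (node: t.Node) => void): void {
  visit(node);
  const keys = t.VISITOR_KEYS[node.type as keyof typeof t.VISITOR_KEYS];
  if (keys == null) {
    return;
  }
  for (const key of keys) {
    const child = (node as unknown as Record<string, unknown>)[key];
    const children = Array.isArray(child) ? child : [child];
    for (const item of children) {
      if (item != null && typeof (item as t.Node).type === 'string') {
        walk(item as t.Node, visit);
      }
    }
  }
}
```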

@josephsavona josephsavona left a comment
Member

This makes sense to me as something we can use internally to improve and test source map coverage. See a couple minor comments, but otherwise looks good. Thanks for the contribution!

@nathanmarks
Contributor Author

Cool, I'll fix those things then.

I have some follow-up PRs mostly ready to address a few of the lower-hanging-fruit gaps; a couple of them are a little hairy, but we can dive into that later.

@josephsavona josephsavona marked this pull request as ready for review November 13, 2025 03:02
@josephsavona josephsavona merged commit 3a495ae into facebook:main Nov 13, 2025
21 checks passed
github-actions bot pushed a commit that referenced this pull request Nov 13, 2025
DiffTrain build for 3a495ae
github-actions bot pushed a commit that referenced this pull request Nov 13, 2025
DiffTrain build for 3a495ae
manNomi pushed a commit to manNomi/react that referenced this pull request Nov 15, 2025
github-actions bot pushed a commit to code/lib-react that referenced this pull request Nov 16, 2025
DiffTrain build for 3a495ae
github-actions bot pushed a commit to code/lib-react that referenced this pull request Nov 16, 2025
DiffTrain build for 3a495ae