@willstott101 willstott101 commented Nov 17, 2022

My motivations here are to learn about, and ideally improve the performance of, rust-based frontend tooling.

Behaviour Changes

The regex crate is used in two separate parts of this lib. In both cases I have ended up making behavioural changes, which I can undo to be more compatible if desired.

  • filename: &str -> index: usize parsing in code managing compressed source bundles
    • Files in the bundle directory are no longer ignored; they must all be understood and used by the bundler.
    • I explored a few different options here: willstott101@c6d1deb
  • Finding and introspecting JavaScript identifiers
    • is_valid_javascript_identifier no longer returns true for invalid identifiers with valid prefixes (see commented test cases)
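
As a rough illustration of the first bullet, regex-free filename-to-index parsing could look like the sketch below. The `<index>.js` naming scheme, the function name `parse_module_index`, and the sample filenames are assumptions made for illustration; they are not taken from this PR's code.

```rust
/// Hypothetical helper: turn a bundle filename such as "12.js" into its
/// numeric index. Assumes files are named `<decimal index>.js`.
fn parse_module_index(filename: &str) -> Option<usize> {
    // Require the ".js" extension and keep only the stem.
    let stem = filename.strip_suffix(".js")?;

    // Accept plain ASCII digits only. `str::parse::<usize>` would also
    // accept a leading '+', so that case is rejected explicitly here.
    if stem.is_empty() || !stem.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }

    // Parsing can still fail on overflow, which also maps to `None`.
    stem.parse().ok()
}
```

Returning `None` for anything that does not match leaves the decision to the caller, who can turn it into an error instead of silently skipping the file.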

unicode-id

I went ahead and used unicode-id because I stole the is_valid_start and is_valid_continue functions from swc (swc@5a23949f, swc_ecma_ast/src/ident.rs#L180), and unicode-id is what swc uses.
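
For readers unfamiliar with the start/continue split, here is a minimal std-only sketch of how such a check can reject identifiers that merely have a valid prefix. It is an approximation, not the code in this PR: `char::is_alphabetic` and `char::is_alphanumeric` stand in for the Unicode ID_Start and ID_Continue lookups that unicode-id provides, and reserved words are not checked.

```rust
// Approximation: std has no ID_Start / ID_Continue tables, so the non-ASCII
// branches below use `is_alphabetic` / `is_alphanumeric` as stand-ins.
fn is_valid_start(c: char) -> bool {
    c == '$' || c == '_' || c.is_ascii_alphabetic() || (!c.is_ascii() && c.is_alphabetic())
}

fn is_valid_continue(c: char) -> bool {
    c == '$'
        || c == '_'
        || c == '\u{200c}' // zero-width non-joiner
        || c == '\u{200d}' // zero-width joiner
        || c.is_ascii_alphanumeric()
        || (!c.is_ascii() && c.is_alphanumeric())
}

/// Every character must be valid, not just a leading run of them, so an
/// input like "foo bar" is rejected even though "foo" alone is fine.
fn is_valid_javascript_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) if is_valid_start(first) => chars.all(is_valid_continue),
        // Empty string or an invalid first character.
        _ => false,
    }
}
```

Walking the whole string, rather than matching a pattern that is only anchored at the start, is what rules out the "valid prefix" false positives described under Behaviour Changes.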

Measurements

| Metric | Before | After | Notes |
| --- | --- | --- | --- |
| Test run (cargo test stdout report) | ~1.0s | ~0.7s | Bear in mind that I added more tests |
| Compilation from scratch | ~14.5s | ~11.5s | cargo build --release --examples --offline |
| read example size (bytes) | 4770336 | 4761136 | -9200 bytes |
| rewrite example size (bytes) | 4878184 | 4864792 | -13392 bytes |

@Swatinem Swatinem left a comment

I believe the tokenization was used in our old "find function name" heuristic, which we are replacing by properly parsing the minified source in https://github.com/getsentry/js-source-scopes.

I would be very happy to just remove all that logic in a new breaking release if @mitsuhiko agrees.

@mitsuhiko
Contributor

If we end up killing the old function name heuristic (which for what it’s worth I’m absolutely fine with) we probably at least want to point people towards an example of how this is best done instead. We could potentially even have a highly inefficient way here to optionally (behind a feature flag) use the scope crate.

@willstott101
Contributor Author

Yeah the use of this identifier stuff is purely within get_original_function_name. Trying to use js-source-scopes within here might be a bit confusing though, as that crate appears to depend on this one...

@Swatinem
Member

Yes, IMO this crate should be limited in scope to only sourcemaps. Extracting function names (scope information) is a different concern.

I think we can eventually remove that stuff in a breaking release.

@Swatinem Swatinem merged commit 5187edf into getsentry:master Nov 21, 2022