samyron commented Nov 16, 2025

This PR optimizes json_string_unescape.

Two commits:

  1. Use ARM NEON to scan for the backslash character (\). While scanning, copy the current chunk to the output.
  2. Add a fast path when unescaping a single character.
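
The scan-and-copy idea in the first commit can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the helper name copy_until_backslash is hypothetical, it assumes the output buffer holds at least len bytes, and it falls back to a plain byte loop when AArch64 NEON is unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#endif

/*
 * Hypothetical helper (not from the PR): copy bytes from src to dst until
 * the first backslash or until len bytes have been consumed.
 * Returns the number of bytes before the first backslash, or len if the
 * input contains none. dst must have room for at least len bytes.
 */
static size_t copy_until_backslash(const char *src, size_t len, char *dst)
{
    size_t i = 0;

    assert(src != NULL && dst != NULL);

#ifdef SKETCH_HAVE_NEON
    const uint8x16_t backslash = vdupq_n_u8('\\');

    while (i + 16 <= len) {
        uint8x16_t chunk = vld1q_u8((const uint8_t *)src + i);

        /* Copy the whole chunk first; the store stays within len bytes. */
        vst1q_u8((uint8_t *)dst + i, chunk);

        /* Each lane becomes 0xFF where the byte equals '\\', else 0x00. */
        uint8x16_t eq = vceqq_u8(chunk, backslash);

        /* A non-zero horizontal maximum means this chunk has a backslash. */
        if (vmaxvq_u8(eq) != 0) {
            break; /* let the scalar loop below find the exact position */
        }
        i += 16;
    }
#endif

    /* Scalar tail: handles the final partial chunk and pinpoints the match. */
    while (i < len && src[i] != '\\') {
        dst[i] = src[i];
        i++;
    }
    return i;
}
```

Storing each chunk before testing it keeps the loop body short: when a chunk does contain a backslash, the bytes after it are simply overwritten once the escape has been decoded. Whether the PR locates the match position with a scalar loop or with a bitmask is not stated here, so treat that part as an assumption.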

If this PR is accepted, I will follow up with an SSE2 implementation.

Benchmarks

Run on a MacBook Air M1.

twitterescaped.json is from simdjson-data.

== Parsing activitypub.json (58160 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after     1.103k i/100ms
Calculating -------------------------------------
               after     11.143k (± 0.8%) i/s   (89.74 μs/i) -     56.253k in   5.048516s

Comparison:
              before:    10366.8 i/s
               after:    11143.2 i/s - 1.07x  faster

== Parsing twitterescaped.json (562408 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
Warming up --------------------------------------
               after    73.000 i/100ms
Calculating -------------------------------------
               after    737.341 (± 0.9%) i/s    (1.36 ms/i) -      3.723k in   5.049667s

Comparison:
              before:      712.1 i/s
               after:      737.3 i/s - 1.04x  faster

I should note that the fast path for unescaping a single character accounts for about 1% of the speed increase in activitypub.json. It's pretty minor.
