https://github.com/servo/html5ever/pull/601 added a fast path for x86 with SSE2. It would be great to have the same optimization on aarch64 with NEON.
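
To make the request concrete, here is a rough sketch of what a NEON version could look like. It is not taken from the linked PR: the function names `find_special` and `find_special_scalar` are hypothetical, and the set of "special" bytes (`<`, `&`, `\r`, NUL) is an assumption about which bytes end a plain-text run. The idea is to check 16 bytes at a time and only fall back to a byte-by-byte scan once a chunk is known to contain a hit.

```rust
/// Scalar reference (hypothetical helper): index of the first byte that
/// needs special handling. The byte set is an assumption for illustration.
fn find_special_scalar(input: &[u8]) -> Option<usize> {
    input
        .iter()
        .position(|&b| matches!(b, b'<' | b'&' | b'\r' | b'\0'))
}

/// NEON variant: tests 16 bytes per iteration, then locates the exact
/// position with the scalar scan inside the first chunk that has a hit.
#[cfg(all(target_arch = "aarch64", target_feature = "neon"))]
fn find_special(input: &[u8]) -> Option<usize> {
    use std::arch::aarch64::{vceqq_u8, vdupq_n_u8, vld1q_u8, vmaxvq_u8, vorrq_u8};

    let chunks = input.chunks_exact(16);
    let tail = chunks.remainder();
    let tail_start = input.len() - tail.len();

    for (i, chunk) in chunks.enumerate() {
        // SAFETY: `chunk` is exactly 16 readable bytes, and the cfg above
        // guarantees that NEON is available at compile time.
        let any_hit = unsafe {
            let v = vld1q_u8(chunk.as_ptr());
            // Each comparison yields 0xFF in lanes that match, 0x00 otherwise.
            let hits = vorrq_u8(
                vorrq_u8(vceqq_u8(v, vdupq_n_u8(b'<')), vceqq_u8(v, vdupq_n_u8(b'&'))),
                vorrq_u8(vceqq_u8(v, vdupq_n_u8(b'\r')), vceqq_u8(v, vdupq_n_u8(0))),
            );
            // Horizontal max is non-zero if any lane matched.
            vmaxvq_u8(hits) != 0
        };
        if any_hit {
            return find_special_scalar(chunk).map(|p| i * 16 + p);
        }
    }
    // Fewer than 16 bytes left: finish with the scalar scan.
    find_special_scalar(tail).map(|p| tail_start + p)
}

/// Fallback for every other target: plain scalar scan.
#[cfg(not(all(target_arch = "aarch64", target_feature = "neon")))]
fn find_special(input: &[u8]) -> Option<usize> {
    find_special_scalar(input)
}
```

Gating on `cfg` keeps the scalar path as the default everywhere else, so both variants can be checked against `find_special_scalar` in tests. A real implementation would probably want to compute the hit position directly from the comparison mask instead of rescanning the chunk.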