@daurnimator
Contributor

Bug fixes, cleaner and faster :)

In my tests, whatever ran first was getting much better numbers.
Additionally, this adds alignment requirements so that the comparison is fair.
Also tested (but not as fast):
```zig
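pub fn utf8ByteSequenceLength(first_byte: u8) !u3 {
    // The number of leading 1 bits in the first byte encodes the sequence length.
    const len = @clz(u8, ~first_byte);
    if (len == 0) return 1; // 0xxxxxxx: ASCII
    // 110xxxxx, 1110xxxx, 11110xxx: 2-, 3- and 4-byte sequences.
    // len == 1 (10xxxxxx) is a continuation byte, not a start byte.
    if (len >= 2 and len <= 4) return @intCast(u3, len);
    return error.Utf8InvalidStartByte;
}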
}
```
@daurnimator added the standard library label on Dec 28, 2019
@data-man
Contributor

Proposal:

pub const CodePoint = u21;

@daurnimator
Contributor Author

> Proposal:
>
> pub const CodePoint = u21;

I thought about that but decided I'd like to see how #3806 panned out first.

It's also faster: on my machine, unicode/throughput_test.zig now gives, e.g.:
> original utf8ToUtf16Le: elapsed: 1048 ns (0 ms)
> new utf8ToUtf16Le: elapsed: 971 ns (0 ms)
@andrewrk merged commit cb02125 into ziglang:master on Dec 28, 2019
@andrewrk
Member

Nice!

@daurnimator deleted the std.unicode-fixes branch on December 28, 2019 23:16
