-
-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Add float parsing support to std #1958
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Allows addition/subtraction of f128 and narrowing casts to f16 from larger float types.
This is not intended to be the long-term implementation as it doesn't provide various properties that we eventually will want (e.g. round-tripping, denormal support). It also uses f64 internally so the wider f128 will be inaccurate.
std/special/compiler_rt/addXf3.zig
Outdated
| @@ -0,0 +1,191 @@ | |||
| // Ported from: | |||
| // | |||
| // https://github.com/llvm-mirror/compiler-rt/blob/92f7768ce940f6437b32ecc0985a1446cd040f7a/lib/builtins/fp_add_impl.inc | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've referenced the git mirror here regarding #1504 since it was easier. Can switch to the svn-id however if there is a preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an official git repository now: https://github.com/llvm/llvm-project
std/fmt/parse_float.zig
Outdated
| d.d2 = @truncate(u32, w); | ||
| } | ||
|
|
||
| fn dump(d: Z96) void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot I left this in. Can remove this tomorrow.
| MinusInf, | ||
| }; | ||
|
|
||
| inline fn isDigit(c: u8) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would an ascii module be valuable? Similar to <ctype.h>. There are some implementations under os/path.zig that perform similar functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that sounds quite reasonable.
std/json.zig
Outdated
| Value{ .Integer = try std.fmt.parseInt(i64, token.slice(input, i), 10) } | ||
| else | ||
| @panic("TODO: fmt.parseFloat not yet implemented"); | ||
| Value{ .Float = std.fmt.parseFloat(f64, token.slice(input, i)) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a parseFloatExact or something similar would be valuable and would not allow trailing characters?
| .mantissa = 0, | ||
| }; | ||
|
|
||
| if (caseInEql(s, "nan")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be worthwhile to add an extra condition here to avoid 4 compares in the non-special case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my experience doing anything smart around string comparisons tend to not be worth it as the compiler is pretty good in this area. https://github.com/Hejsil/fun-with-zig/blob/e8f857a401151dbb716acce1775e83c4d1656132/bench/match.zig#L9-L18
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Let's measure it!
https://zig.godbolt.org/z/9zC4yU
The slow-case is actually far worse than the 4 I said since there are a lot of branches in the slice equality check. Without the check we have to go through most of the branches for any value 3 or 4 digits in length. Definitely think this is worth it given the float-parsing code is often the bottleneck in high-performance parsing.
std/fmt/parse_float.zig
Outdated
| } | ||
|
|
||
| test "fmt.parseFloat" { | ||
| const assert = std.debug.assert; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should use the new testing.expect here instead.
|
I've modified this so |
std/fmt/parse_float.zig
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have testing.expectError for this
std/fmt/parse_float.zig
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have testing.expectEqual for this
std/fmt/parse_float.zig
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we compare against 2 instead of 3? All our cases are >=3 long
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And we probably want to measure this optimizations impact and leave a comment here explaining why it is worth it do do this check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've ended up removing this optimization for the moment. While it definitely is better assembly it doesn't really factor in to the runtime performance due to the current parsing/conversion dominating the runtime anyway.
I'll leave this for down the track when this is optimized specifically for performance instead.
andrewrk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exciting! I think it's a great goal to one day have non-allocating float parsing that works for all values. Eliminate that entire class of nondeterminism.
|
🎉 should we have an open issue for the particular values that are not parsable yet? Seems like that would be a loose end to tie up before 1.0.0, yeah? |
|
from zig 0.4.0. std.json.Parser on '{"float": 0.7062146892655368}' results in integer overflow. (note not a contrived value, though it looks that way. its from a mastodon public feed json which includes picture aspect ratios as overly specific floats). |
This is derived from http://krashan.ppa.pl/articles/stringtofloat/. it is a fairly simple implementation and it doesn't handle some edge cases but I figure something 95% of the way is better than nothing right now. There is also the question down the line of how easily we can get away without using an allocator since most implementations use a big number implementation for various edge cases but we forgo these cases now so don't need to worry about it just yet.
This also adds some more
f16andf128support to std to support the float parsing requirements in generating specific output types.I haven't thought about the error behavior at all and will look over that before merging this. It does what C does right now and fails with a zero value if nothing could be parsed. We can do better than that.
A discussion over the expected allowed values would be good too. For example, supporting hex-floats may be valuable.
Closes #375.