-
Notifications
You must be signed in to change notification settings - Fork 1.2k
JSONSerialization: Use NSDecimalNumber for parsing json numbers. #1655
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Replace strol()/strtod() parsing with Scanner.scanDecimal(). This allows parsing of Integers > Int64.max (upto UInt64.max) without losing information due to the fallback parsing provided by parsing as a Double.
|
@swift-ci please test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about three main things here:
-
The cost of parsing as
NSDecimalNumberevery time, especially by spinning up a newScannerfor every single read number -
Potential new differences in precision and representation. I haven't followed the implementation of
NSDecimalNumberon swift-corelibs-foundation, but is its implementation significantly different fromNSDecimalNumberon Darwin? Because on Darwin,NSDecimalNumberhas a significantly lower possible magnitude thanDoubledoes (IIRC 10^±127 vs. 10^±308), with the tradeoff being that it allows for higher precision. On Darwin, we prefer to parse as aDoubleif the value fits both within the magnitude and precision of aDouble— the C standard guarantees this to be true if the string representation of the mantissa is less thanDBL_DECIMAL_DIGin length, so this is what we check. If we cannot fit the number in aDoublewithout loss of precision, then we preferNSDecimalNumberif the value fits within its magnitude; otherwise we're willing to lose precision to fit in aDouble. If the number is too large to fit in either, we bail. Here we'd be giving up on the more efficient representation in anInt/Double, along with the greater range thatDoubleprovides (unless I'm missing something andNSDecimalNumberis more capable here than it is on Darwin) -
@millenomi — this one is more for you. With all of the bridging work you did, do we have any additional testing/coverage for
NSDecimalNumberand how that works out? Folks probably have a lot of code which currently checksas? Intandas? Doublesince for a long time those were the only two supported options here — I'd hate for anNSDecimalNumberto accidentally slip through with loss of precision or representation or something. Even so, the representation inNSDecimalNumberis not good enough forInts. On Darwin, the following code printsnildue to loss of precision making the number no longer "cleanly" representable in anInt:let dec = Decimal(1123123123) let num = NSDecimalNumber(decimal: dec) let i = (num as NSNumber) as? Int print(i)
Overall I think the changes here are good in spirit but we'll likely need to maintain some of the existing representations we already have unless my concerns are unfounded due to my lack of experience here (which I would happily accept!). What we might be able to do is transition over to more of what we have on Darwin — prefer Double unless we can get away with an NSDecimalNumber.
| // Validate the first few characters look like a JSON encoded number: | ||
| // Optional '-' sign at start only 1 leading zero if followed by a decimal point. | ||
| var index = input | ||
| func nextDigit() -> UInt8? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this more clearly be called nextASCII()/nextCharacter()/similar?
| let MINUS = UInt8(ascii: "-") | ||
|
|
||
| func parseTypedNumber(_ string: String) throws -> (Any, IndexDistance)? { | ||
| let scan = Scanner(string: string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth testing the performance of creating a new Scanner for every single number we try to parse — I suspect that this is a potentially prohibitively expensive operation.
|
Note also that this conflicts with the work in #1654 so we might need to coordinate something here. |
|
@itaiferber Thanks for your feedback. I hadn't actually considered the cost of the As to the difference between As to the solution in #1654, I did originally implement that same solution and there are a number of issues with it: Firstly it can involve upto 3 separate string parses. Also care must be taken with As for the bridging issues, my tests showed up some problems but these should be resolved in #1653 . However thinking about this problem further, I realised the real issue is that we parse the string before knowing how we want it use it. So I came up with the following solution which is to store the number as a This would ensure that full JSON number validation is performed, which currently doesn't happen, and the value is only parsed once using the correct parser. Does this seem reasonable or am missing anything here? |
|
@spevans I was quickly putting together an approach similar to #1654 — I agree that it can involve up to 3 parses but they should be relatively short and quick as the input is limited in length. Otherwise, though: how do we know how we want to use a parsed number? Aren't these values just returned to the developer directly? Or are you proposing a further deferral that will only parse the value once it's read and cast out (e.g. a number that totally defers evaluation until you try to access it)? (If so, it doesn't necessarily improve things as it's totally valid to write |
|
I was thinking out not parsing until its accessed and with Also, if it is a subclass of |
|
FYI, here is the version I had using the |
|
Closed in favour of #1657 |
Replace strol()/strtod() parsing with Scanner.scanDecimal().
This allows parsing of Integers > Int64.max (upto UInt64.max)
without losing information due to the fallback parsing provided
by parsing as a Double.
Also allows parsing larger integers into a Decimal.
Requires PR #1653 for some
NSDecimalNumberandNSNumberfixes