[SPARK-45786][SQL][3.3] Fix inaccurate Decimal multiplication and division results #43705
What changes were proposed in this pull request?
This PR fixes inaccurate Decimal multiplication and division results.
Why are the changes needed?
Decimal multiplication and division results may be inaccurate due to rounding issues.
Multiplication:
```
scala> sql("select -14120025096157587712113961295153.858047 * -0.4652").show(truncate=false)
+----------------------------------------------------+
|(-14120025096157587712113961295153.858047 * -0.4652)|
+----------------------------------------------------+
|6568635674732509803675414794505.574764              |
+----------------------------------------------------+
```
The correct answer is `6568635674732509803675414794505.574763`.
Note that the last digit should be `3`, not `4`, as the exact product shows:
```
scala> new java.math.BigDecimal("-14120025096157587712113961295153.858047").multiply(new java.math.BigDecimal("-0.4652"))
val res21: java.math.BigDecimal = 6568635674732509803675414794505.5747634644
```
Since the fractional part `.574763` is followed by `4644`, it should not be rounded up.
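The wrong last digit is consistent with rounding twice. The sketch below reproduces both results with plain `java.math.BigDecimal`. It assumes the old path first trimmed the product to 38 significant digits with half-up rounding before rounding to the result scale of 6, and that the fixed path keeps 39 digits and truncates instead. The object and value names are illustrative and are not taken from the Spark code base.

```scala
import java.math.{BigDecimal, MathContext, RoundingMode}

object MultiplyRounding {
  // Exact product of the two operands from the example (10 fractional digits).
  val exact: BigDecimal =
    new BigDecimal("-14120025096157587712113961295153.858047")
      .multiply(new BigDecimal("-0.4652"))

  // Assumed old behaviour: trim to 38 significant digits with HALF_UP,
  // then round a second time to the result scale of 6.
  val roundedTwice: BigDecimal =
    exact.round(new MathContext(38, RoundingMode.HALF_UP))
      .setScale(6, RoundingMode.HALF_UP)

  // Assumed fixed behaviour: keep 39 digits and truncate (DOWN), so the
  // final HALF_UP step still sees the original first discarded digit.
  val truncatedThenRounded: BigDecimal =
    exact.round(new MathContext(39, RoundingMode.DOWN))
      .setScale(6, RoundingMode.HALF_UP)

  def main(args: Array[String]): Unit = {
    println(exact.toPlainString)                // 6568635674732509803675414794505.5747634644
    println(roundedTwice.toPlainString)         // 6568635674732509803675414794505.574764
    println(truncatedThenRounded.toPlainString) // 6568635674732509803675414794505.574763
  }
}
```

The first half-up step turns `.5747634644` into `.5747635`, and the second step then rounds that `5` up to `.574764`. Truncating with one spare digit avoids this because half-up rounding only inspects the first discarded digit, which truncation leaves unchanged.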
Division:
```
scala> sql("select -0.172787979 / 533704665545018957788294905796.5").show(truncate=false)
+-------------------------------------------------+
|(-0.172787979 / 533704665545018957788294905796.5)|
+-------------------------------------------------+
|-3.237521E-31                                    |
+-------------------------------------------------+
```
The correct answer is `-3.237520E-31`.
Note that the last digit should be `0`, not `1`, as the quotient truncated at 100 decimal places shows:
```
scala> new java.math.BigDecimal("-0.172787979").divide(new java.math.BigDecimal("533704665545018957788294905796.5"), 100, java.math.RoundingMode.DOWN)
val res22: java.math.BigDecimal = -3.237520489418037889998826491401059986665344697406144511563561222578738E-31
```
Since the fractional part `.237520` is followed by `4894...`, it should not be rounded up.
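The division result can be reproduced the same way. This sketch assumes the old path divided to scale 38 with half-up rounding and then rounded again to the result scale, and that the fixed path divides to scale 39 with truncation. The result scale of 37 is inferred from the displayed value `-3.237521E-31` (seven significant digits ending at the 37th decimal place). The object and value names are illustrative only.

```scala
import java.math.{BigDecimal, RoundingMode}

object DivideRounding {
  val dividend = new BigDecimal("-0.172787979")
  val divisor  = new BigDecimal("533704665545018957788294905796.5")

  // Assumed old behaviour: quotient at scale 38 with HALF_UP,
  // then a second HALF_UP rounding to the result scale of 37.
  val roundedTwice: BigDecimal =
    dividend.divide(divisor, 38, RoundingMode.HALF_UP)
      .setScale(37, RoundingMode.HALF_UP)

  // Assumed fixed behaviour: quotient at scale 39 truncated (DOWN),
  // then a single HALF_UP rounding to scale 37.
  val truncatedThenRounded: BigDecimal =
    dividend.divide(divisor, 39, RoundingMode.DOWN)
      .setScale(37, RoundingMode.HALF_UP)

  def main(args: Array[String]): Unit = {
    println(roundedTwice)         // -3.237521E-31
    println(truncatedThenRounded) // -3.237520E-31
  }
}
```

At scale 38 the digits `32375204|89...` round up to `32375205`, and the trailing `5` then pushes the scale-37 result to `3237521`. With truncation at scale 39 the digits stay `323752048`, so the final rounding yields `3237520`.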
Does this PR introduce any user-facing change?
Yes, users will see correct Decimal multiplication and division results.
Directly multiplying or dividing with `org.apache.spark.sql.types.Decimal()` (not via SQL) now returns at most 39 digits instead of at most 38, and rounds down instead of rounding half-up.
How was this patch tested?
Tests added.
Was this patch authored or co-authored using generative AI tooling?
No
Closes apache#43678 from kazuyukitanimura/SPARK-45786.
Authored-by: Kazuyuki Tanimura
Signed-off-by: Dongjoon Hyun
Thank you, @kazuyukitanimura!
Should type coercion consider rounding? I think there might be confusion on what
@kazuyukitanimura, would you get the result you want if you forced the result through an explicit cast to the default DecimalType?
Thanks @jcdang
Did you mean like
This PR stops rounding with MathContext. Hopefully this clarifies. Some comments are here: https://github.com/apache/spark/pull/43705/files#diff-87807d437248d04876eac9e116a527577f1a8e53e28337bf26423e5bf94630e1R567-R571
This is a backport of #43678, but Decimal in Spark 3.3 works differently from 3.4, so the tests need further updates.
dongjoon-hyun left a comment:
How do you want to proceed with this, @kazuyukitanimura?
In any case, Apache Spark 3.3 will reach end of life in two weeks. If you don't have enough time, we can close this anyway, @kazuyukitanimura.
Thanks @dongjoon-hyun, closing.
Thank you for the decision.