@kazuyukitanimura
Contributor

What changes were proposed in this pull request?

This PR fixes inaccurate Decimal multiplication and division results.

Why are the changes needed?

Decimal multiplication and division results may be inaccurate due to rounding issues.

Multiplication:

scala> sql("select  -14120025096157587712113961295153.858047 * -0.4652").show(truncate=false)
+----------------------------------------------------+                          
|(-14120025096157587712113961295153.858047 * -0.4652)|
+----------------------------------------------------+
|6568635674732509803675414794505.574764              |
+----------------------------------------------------+

The correct answer is 6568635674732509803675414794505.574763

Note that the last digit should be 3, not 4, as the exact product shows:

scala> java.math.BigDecimal("-14120025096157587712113961295153.858047").multiply(java.math.BigDecimal("-0.4652"))
val res21: java.math.BigDecimal = 6568635674732509803675414794505.5747634644

Since the fractional part .574763 is followed by 4644, it should not be rounded up.
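
One way an off-by-one last digit like this can arise is double rounding: the exact product is rounded to a fixed number of significant digits and then rounded again to the result scale. The sketch below reproduces both values with plain `java.math.BigDecimal`. The 38-digit HALF_UP intermediate step is an assumption made for illustration, and the `MultiplyRounding` object and its member names are hypothetical; none of this is taken from Spark's source.

```scala
import java.math.{BigDecimal => JBigDecimal, MathContext, RoundingMode}

object MultiplyRounding {
  private val a = new JBigDecimal("-14120025096157587712113961295153.858047")
  private val b = new JBigDecimal("-0.4652")

  // Exact product; no rounding has happened yet.
  val exact: JBigDecimal = a.multiply(b)

  // Assumed intermediate step: round to 38 significant digits (HALF_UP),
  // then round a second time to the result scale of 6.
  val roundedTwice: JBigDecimal =
    exact.round(new MathContext(38, RoundingMode.HALF_UP))
      .setScale(6, RoundingMode.HALF_UP)

  // Round the exact product to scale 6 in a single step.
  val roundedOnce: JBigDecimal = exact.setScale(6, RoundingMode.HALF_UP)

  def main(args: Array[String]): Unit = {
    println(roundedTwice.toPlainString) // 6568635674732509803675414794505.574764
    println(roundedOnce.toPlainString)  // 6568635674732509803675414794505.574763
  }
}
```

The first rounding turns `.5747634644` into `.5747635`, and the second rounding then sees a trailing 5 and rounds up, which is how the 4 appears.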

Division:

scala> sql("select -0.172787979 / 533704665545018957788294905796.5").show(truncate=false)
+-------------------------------------------------+
|(-0.172787979 / 533704665545018957788294905796.5)|
+-------------------------------------------------+
|-3.237521E-31                                    |
+-------------------------------------------------+

The correct answer is -3.237520E-31

Note that the last digit should be 0, not 1, as a higher-precision quotient shows:

scala> java.math.BigDecimal("-0.172787979").divide(java.math.BigDecimal("533704665545018957788294905796.5"), 100, java.math.RoundingMode.DOWN)
val res22: java.math.BigDecimal = -3.237520489418037889998826491401059986665344697406144511563561222578738E-31

Since the fractional part .237520 is followed by 4894..., it should not be rounded up.
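
The division case can be reproduced the same way. The sketch assumes, purely for illustration, that the quotient is first computed at scale 38 with HALF_UP and then rounded to scale 37 (the scale implied by the seven-digit result `-3.237521E-31`). The `DivideRounding` object and its member names are hypothetical and not part of Spark.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object DivideRounding {
  private val dividend = new JBigDecimal("-0.172787979")
  private val divisor = new JBigDecimal("533704665545018957788294905796.5")

  // Assumed intermediate step: quotient at scale 38 with HALF_UP,
  // followed by a second HALF_UP rounding to scale 37.
  val roundedTwice: JBigDecimal =
    dividend.divide(divisor, 38, RoundingMode.HALF_UP)
      .setScale(37, RoundingMode.HALF_UP)

  // Quotient rounded to scale 37 in a single step.
  val roundedOnce: JBigDecimal =
    dividend.divide(divisor, 37, RoundingMode.HALF_UP)

  def main(args: Array[String]): Unit = {
    println(roundedTwice) // -3.237521E-31
    println(roundedOnce)  // -3.237520E-31
  }
}
```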

Does this PR introduce any user-facing change?

Yes, users will see correct Decimal multiplication and division results.
Directly multiplying and dividing with org.apache.spark.sql.types.Decimal() (not via SQL) will now return at most 39 digits instead of at most 38, and will round down instead of rounding half-up.
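
The behavior described above (keeping up to 39 digits, rounded down, before any final rounding) can be sketched with `java.math.BigDecimal`. Truncating to one extra guard digit leaves untouched the digit that decides a later HALF_UP rounding to 38 or fewer digits, so the final value matches a single rounding of the exact result. The `GuardDigitRounding` object, its method names, and the choice of a 39-digit `MathContext` for both operations are assumptions for illustration; this only mirrors the description, not Spark's actual implementation.

```scala
import java.math.{BigDecimal => JBigDecimal, MathContext, RoundingMode}

object GuardDigitRounding {
  // Hypothetical intermediate context: 39 significant digits, truncated.
  private val guard = new MathContext(39, RoundingMode.DOWN)

  // Multiply with a truncated 39-digit intermediate, then round once.
  def multiply(a: JBigDecimal, b: JBigDecimal, scale: Int): JBigDecimal =
    a.multiply(b, guard).setScale(scale, RoundingMode.HALF_UP)

  // Divide with a truncated 39-digit intermediate, then round once.
  def divide(a: JBigDecimal, b: JBigDecimal, scale: Int): JBigDecimal =
    a.divide(b, guard).setScale(scale, RoundingMode.HALF_UP)

  def main(args: Array[String]): Unit = {
    val product = multiply(
      new JBigDecimal("-14120025096157587712113961295153.858047"),
      new JBigDecimal("-0.4652"),
      6)
    val quotient = divide(
      new JBigDecimal("-0.172787979"),
      new JBigDecimal("533704665545018957788294905796.5"),
      37)
    println(product.toPlainString) // 6568635674732509803675414794505.574763
    println(quotient)              // -3.237520E-31
  }
}
```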

How was this patch tested?

Test added

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Nov 6, 2023
@HyukjinKwon
Member

Member

@dongjoon-hyun dongjoon-hyun left a comment

Thank you for making a PR, @kazuyukitanimura .

Member

@dongjoon-hyun dongjoon-hyun left a comment

Could you fix the UT failure, @kazuyukitanimura ?

[info] *** 1 TEST FAILED ***
[error] Failed: Total 3196, Failed 1, Errors 0, Passed 3195, Ignored 3
[error] Failed tests:
[error] 	org.apache.spark.sql.SQLQueryTestSuite

@kazuyukitanimura kazuyukitanimura marked this pull request as ready for review November 7, 2023 13:29
Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM.

cc @cloud-fan , too

dongjoon-hyun pushed a commit that referenced this pull request Nov 7, 2023
… results

Closes #43678 from kazuyukitanimura/SPARK-45786.

Authored-by: Kazuyuki Tanimura <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 5ef3a84)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun
Member

Merged to master/3.5/3.4. Thank you, @kazuyukitanimura .

Could you make a backporting PR to branch-3.3 too?

@kazuyukitanimura
Contributor Author

Thank you all

Could you make a backporting PR to branch-3.3 too?

I will @dongjoon-hyun

kazuyukitanimura added a commit to kazuyukitanimura/spark that referenced this pull request Nov 7, 2023
… results

Closes apache#43678 from kazuyukitanimura/SPARK-45786.

Authored-by: Kazuyuki Tanimura <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
}
}

test("SPARK-45786: Decimal multiply, divide, remainder, quot") {
Contributor

@LuciferYang LuciferYang Nov 16, 2023

This test fails when spark.sql.ansi.enabled is true:

https://github.com/apache/spark/actions/runs/6885072758/job/18728675619

You can reproduce the issue locally by executing SPARK_ANSI_SQL_MODE=true build/sbt clean "catalyst/testOnly org.apache.spark.sql.catalyst.expressions.ArithmeticExpressionSuite"

@kazuyukitanimura Can you take a look at this issue?

Also cc @dongjoon-hyun: since this patch has been backported to branch-3.4, I'm not sure whether this will affect the Spark 3.4.2 release.

Contributor Author

Thanks @LuciferYang
Yes, this test assumes the default spark.sql.ansi.enabled=false. The default behavior does not throw an exception on overflow, but ANSI mode does. Since this is a random-value test, some combinations may overflow.

Cause: org.apache.spark.SparkArithmeticException: [NUMERIC_VALUE_OUT_OF_RANGE] 431393072276642444045219979063553045.571 cannot be represented as Decimal(38, 4). If necessary set "spark.sql.ansi.enabled" to "false" to bypass this error, and return NULL instead. SQLSTATE: 22003

Sorry, I wasn't aware that there is a GHA job for spark.sql.ansi.enabled=true. I can modify the test to ignore those cases.
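
The overflow in the error above can be checked without Spark. As a rough sketch, a value fits Decimal(precision, scale) when, after rounding to the target scale, its total digit count does not exceed the precision. The `fits` helper and the `DecimalRange` object below are hypothetical; they only illustrate why this value is rejected and are not how the test itself handles such cases.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

object DecimalRange {
  // Hypothetical helper: true if `v`, rounded to `scale`,
  // needs at most `precision` digits in total.
  def fits(v: JBigDecimal, precision: Int, scale: Int): Boolean =
    v.setScale(scale, RoundingMode.HALF_UP).precision() <= precision

  def main(args: Array[String]): Unit = {
    val v = new JBigDecimal("431393072276642444045219979063553045.571")
    // 36 integer digits plus 4 fractional digits is 40 digits, more than 38.
    println(fits(v, 38, 4))                      // false
    println(fits(new JBigDecimal("1.5"), 38, 4)) // true
  }
}
```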

Contributor Author

Addressed in #43853.

dongjoon-hyun pushed a commit that referenced this pull request Nov 17, 2023
…th ANSI enabled

### What changes were proposed in this pull request?
This follow-up PR fixes the test for SPARK-45786 that is failing in GHA with SPARK_ANSI_SQL_MODE=true

### Why are the changes needed?
The issue was discovered in #43678 (comment)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Test updated

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #43853 from kazuyukitanimura/SPARK-45786-FollowUp.

Authored-by: Kazuyuki Tanimura <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun pushed a commit that referenced this pull request Nov 17, 2023
…th ANSI enabled

Closes #43853 from kazuyukitanimura/SPARK-45786-FollowUp.

Authored-by: Kazuyuki Tanimura <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 949de34)
Signed-off-by: Dongjoon Hyun <[email protected]>
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Feb 7, 2024
… results

Closes apache#43678 from kazuyukitanimura/SPARK-45786.

Authored-by: Kazuyuki Tanimura <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 5ef3a84)
Signed-off-by: Dongjoon Hyun <[email protected]>
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Feb 7, 2024
…th ANSI enabled

Closes apache#43853 from kazuyukitanimura/SPARK-45786-FollowUp.

Authored-by: Kazuyuki Tanimura <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 949de34)
Signed-off-by: Dongjoon Hyun <[email protected]>
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
… results (apache#358)

* [SPARK-45786][SQL] Fix inaccurate Decimal multiplication and division results

Closes apache#43678 from kazuyukitanimura/SPARK-45786.

Authored-by: Kazuyuki Tanimura <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 5ef3a84)
Signed-off-by: Dongjoon Hyun <[email protected]>

* [SPARK-45786][SQL][FOLLOWUP][TEST] Fix Decimal random number tests with ANSI enabled

Closes apache#43853 from kazuyukitanimura/SPARK-45786-FollowUp.

Authored-by: Kazuyuki Tanimura <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 949de34)
Signed-off-by: Dongjoon Hyun <[email protected]>

---------

Signed-off-by: Dongjoon Hyun <[email protected]>
Co-authored-by: Kazuyuki Tanimura <[email protected]>