
Conversation

@mgaido91 (Contributor) commented Aug 3, 2019

What changes were proposed in this pull request?

If spark.sql.arithmeticOperations.failOnOverFlow is true, temporary overflows in a sum of long values cause an exception. This is against what other RDBMSs do. The root cause is that we use a long attribute to store the intermediate result of a sum of longs: if an intermediate value falls outside the range representable by a long, an exception is raised.
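A minimal spark-shell sketch of the failure mode (illustrative only, not part of the PR's code; it assumes an active SparkSession named spark):

// Illustrative sketch only: the final sum equals Long.MaxValue and fits in a long,
// but a partial sum of the first two values exceeds Long.MaxValue, so the query
// may fail with an ArithmeticException depending on how the rows are aggregated.
spark.conf.set("spark.sql.arithmeticOperations.failOnOverFlow", "true")

import spark.implicits._

Seq(Long.MaxValue, 1L, -1L).toDF("l")
  .selectExpr("sum(l)")
  .show()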

The PR introduces a flag which controls the datatype of the intermediate attribute used for the sum of longs. When the flag is set to true, the intermediate buffer is a decimal, so temporary overflows don't cause any issue. The flag has been introduced on @cloud-fan's suggestion because, as he pointed out, always using a decimal as the intermediate buffer would cause a performance regression.
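Conceptually, the decimal buffer removes the problem because the accumulator is unbounded; only the final conversion back to a long could still overflow. A plain-Scala sketch of the idea (an illustration, not the PR's implementation):

// Accumulate in BigDecimal: partial sums may exceed Long.MaxValue without error.
val values = Seq(Long.MaxValue, 1L, -1L)
val buffer = values.map(BigDecimal(_)).sum

// Only the final conversion back to long can overflow; here the result fits.
val result = buffer.toLongExact   // == Long.MaxValue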

How was this patch tested?

Added UT

@SparkQA commented Aug 3, 2019

Test build #108605 has finished for PR 25347 at commit af78b62.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@xuanyuanking (Member) left a comment

+1 for me, cc @cloud-fan for a double-check.
How about linking this comment from you and Wenchen in the PR description?

@mgaido91 (Contributor, Author) commented Aug 4, 2019

Thanks for the suggestion @xuanyuanking, I added the link!

@SparkQA commented Aug 4, 2019

Test build #108620 has finished for PR 25347 at commit e81e8fa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk (Member) commented Aug 4, 2019

Do we need similar handling for average?

@mgaido91 (Contributor, Author) commented Aug 4, 2019

@kiszk no: the buffer value for average of longs is a double, so the same issue does not arise.
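For reference, an illustrative spark-shell sketch (not from the PR, assumes an active SparkSession named spark): avg over longs accumulates in a double and returns a double, so it silently loses precision instead of raising an overflow error.

import spark.implicits._

// avg of two Long.MaxValue values returns a double (roughly 9.22e18)
// instead of throwing, which is why sum and avg behave differently here.
Seq(Long.MaxValue, Long.MaxValue).toDF("l")
  .selectExpr("avg(l)")
  .show()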

@kiszk (Member) commented Aug 4, 2019

@mgaido91 I see. It looks like truncation during the double sum for average is expected, as described here. #21599 throws on overflow only for integral arithmetic operations.

@HyukjinKwon (Member) left a comment

Looks good to me too.

@maropu (Member) commented Aug 5, 2019

Is this the common handling for long sums in DBMSs? In PostgreSQL and MySQL, it seems that the output type is decimal and the answer is correct:

// PostgreSQL
postgres=# \d t
                Table "public.t"
 Column |  Type  | Collation | Nullable | Default 
--------+--------+-----------+----------+---------
 l      | bigint |           |          | 

postgres=# select * from t;
          l          
---------------------
 9223372036854775807
                   1
(2 rows)

postgres=# select sum(l) from t;
         sum         
---------------------
 9223372036854775808
(1 row)

postgres=# create temporary table v as select sum(l) from t;
postgres=# \d v
               Table "pg_temp_3.v"
 Column |  Type   | Collation | Nullable | Default 
--------+---------+-----------+----------+---------
 sum    | numeric |           |          | 


// MySQL
mysql> explain t;
+-------+------------+------+-----+---------+-------+
| Field | Type       | Null | Key | Default | Extra |
+-------+------------+------+-----+---------+-------+
| l     | bigint(20) | YES  |     | NULL    |       |
+-------+------------+------+-----+---------+-------+
1 row in set (0.00 sec)

mysql> select * from t;
+---------------------+
| l                   |
+---------------------+
| 9223372036854775807 |
|                   1 |
+---------------------+
2 rows in set (0.00 sec)

mysql> select sum(l) from t;
+---------------------+
| sum(l)              |
+---------------------+
| 9223372036854775808 |
+---------------------+
1 row in set (0.00 sec)

mysql> create temporary table v as select sum(l) from t;
mysql> explain v;
+--------+---------------+------+-----+---------+-------+
| Field  | Type          | Null | Key | Default | Extra |
+--------+---------------+------+-----+---------+-------+
| sum(l) | decimal(41,0) | YES  |     | NULL    | NULL  |
+--------+---------------+------+-----+---------+-------+
1 row in set (0.01 sec)

@mgaido91 (Contributor, Author) commented Aug 6, 2019

Yes @maropu, you're right. The reason I didn't change the output attribute was to avoid a breaking change. But since we are introducing a flag for it, it may be OK to do so. What do you think? cc @cloud-fan, what do you think about this?

@maropu (Member) commented Aug 7, 2019

Yeah, I think so. If we have a new flag, IMO it'd be better for Spark to have the same behaviour as the other databases.

@SparkQA commented Aug 8, 2019

Test build #108807 has finished for PR 25347 at commit 5738511.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor)

@maropu good point. I think it's about more than overflow now. Maybe we should create a config to specify the dialect and provide 2 options: legacy and ansi. For the ansi dialect, we try our best to make the behavior of Spark consistent with the ANSI SQL standard, including the parser, overflow, the sum data type, etc.

@mgaido91 (Contributor, Author)

Thanks for your comments @cloud-fan and @maropu.

I agree that, long term, we should have coarser flags controlling all these things, like ansi strict, legacy, and so on. So far we have been creating a new flag for each case, as is done in this PR.

I am not sure whether your comments mean that we should hold this until we create such flags, or that we can go ahead with this as it is and create those flags later. Could you please clarify?

Thanks.

@cloud-fan (Contributor)

We can create the coarser flag later.

For this PR, in general LGTM, but let's complete the story about sum. Is it true that, in the mainstream databases, a sum of longs returns decimal and a sum of other integral types returns long? Or do they always return decimal for sum?

@mgaido91 (Contributor, Author)

Well, actually, mainstream DBs behave differently from each other:

  • SQL Server returns bigint for a sum of bigint, int for a sum of int, and decimal(28,s) for a sum of decimal(p,s);
  • MySQL returns decimal(38,0) for a sum of int and decimal(41,0) for a sum of long;
  • Postgres behaves like this PR with the flag on, i.e. a sum of integers returns a long, while a sum of longs returns a decimal.

@maropu (Member) commented Aug 20, 2019

Does SQL Server throw an exception when the result overflows?
@wangyum Do you know Oracle's behaviour for that?

@wangyum (Member) commented Aug 21, 2019

PostgreSQL:

postgres=# create table t1(c1 bigint);
CREATE TABLE
postgres=# insert into t1 values(9223372036854775807), (9223372036854775807);
INSERT 0 2
postgres=# select sum(c1) from t1;
         sum
----------------------
 18446744073709551614
(1 row)

db2:

[db2inst1@2f3c821d36b7 ~]$ db2 "create table t1(c1 bigint)"
DB20000I  The SQL command completed successfully.
[db2inst1@2f3c821d36b7 ~]$ db2 "insert into t1 values(9223372036854775807), (9223372036854775807)"
DB20000I  The SQL command completed successfully.
[db2inst1@2f3c821d36b7 ~]$ db2 "select sum(c1) from t1"

1
--------------------
SQL0802N  Arithmetic overflow or other arithmetic exception occurred.
SQLSTATE=22003

SQL Server (2019-CTP3.0):

1> create table t1(c1 bigint)
2> go
1> insert into t1 values(9223372036854775807), (9223372036854775807)
2> go

(2 rows affected)
1>
1> select sum(c1) from t1
2> go
Msg 8115, Level 16, State 2, Server 8c82b3c03354, Line 1
Arithmetic overflow error converting expression to data type bigint.

Vertica:

dbadmin=> create table t1(c1 bigint);
CREATE TABLE
dbadmin=>  insert into t1 values(9223372036854775807);
 OUTPUT
--------
      1
(1 row)

dbadmin=>  insert into t1 values(9223372036854775807);
 OUTPUT
--------
      1
(1 row)

dbadmin=> select sum(c1) from t1;
ERROR 4845:  Sum() overflowed
HINT:  Try sum_float() instead
dbadmin=> select sum_float(c1) from t1;
      sum_float
----------------------
 1.84467440737096e+19
(1 row)

Oracle:

SQL> -- BIGINT -> NUMBER(19) : https://docs.oracle.com/cd/B19306_01/gateways.102/b14270/apa.htm
SQL> create table t1(c1 NUMBER(19));

Table created.

SQL> insert into t1 values(9223372036854775807);

1 row created.

SQL> insert into t1 values(9223372036854775807);

1 row created.

SQL> select sum(c1) from t1;

   SUM(C1)
----------
1.8447E+19

SQL> create table t2 as select sum(c1) as s from t1;

Table created.

SQL> desc t2;
 Name					   Null?    Type
 ----------------------------------------- -------- ----------------------------
 S						    NUMBER

@maropu (Member) commented Aug 21, 2019

Many thanks for the exhaustive survey!! I'm a bit surprised that they have totally different default behaviours... Does Oracle truncate the result?

SQL> select sum(c1) from t1;

   SUM(C1)
----------
1.8447E+19

@mgaido91 (Contributor, Author)

Thanks @wangyum!

Actually, I ran the very same commands and Oracle returned the exact result to me. Anyway, I don't think it is relevant here: Oracle always uses DECIMAL, it doesn't really have a LONG datatype...

@cloud-fan (Contributor)

If different databases have different behaviors, then I don't see a strong reason to change the return type of sum in Spark. How about we only use decimal as the sum buffer?

@mgaido91 (Contributor, Author)

I think the reason is to have feature parity with Postgres. Using decimal as the return type too also supports cases which would otherwise overflow. Since we are guarding it with a flag, I think it is fine to return a decimal and have wider support.

@cloud-fan (Contributor)

We need to be careful when adding configs to control query semantics. We have a lot of such configs now, but most of them are legacy configs. Legacy configs are OK, since they are not mainly for controlling query semantics, but for providing a way to restore the legacy behavior.

That said, if there is a strong reason to change the return type of sum from long to decimal, let's go for it and add a legacy config. But it seems that's not the case here. Spark needs to follow the SQL standard, not Postgres, so making Spark behave the same as Postgres is not a strong reason to me.

If the goal is to leverage the ported Postgres tests, I think that's reasonable. But the config should be internal and clearly state that it's only used in the ported Postgres tests.

@mgaido91 (Contributor, Author)

Spark needs to follow the SQL standard, not Postgres, so making Spark behave the same as Postgres is not a strong reason to me.

I 100% agree on that.

@maropu what do you think?

@maropu (Member) commented Aug 22, 2019

Spark needs to follow the SQL standard, not Postgres, so making Spark behave the same as Postgres is not a strong reason to me.
I 100% agree on that.

Yeah, that opinion looks reasonable to me, too.

How about we only use decimal as the sum buffer?

But I feel a bit uncomfortable adding a new flag only for this sum behaviour change, and I personally think most users might not notice this option. The change to support a decimal sum buffer itself looks nice to me, but I think we'd want a coarser flag for that.

@mgaido91 (Contributor, Author)

The change to support a decimal sum buffer itself looks nice to me, but I think we'd want a coarser flag for that.

@maropu @cloud-fan any suggestion on what this flag should be/look like?

Meanwhile I'll update the PR to limit the scope only to the buffer, thanks!

This reverts commit 5738511.
@SparkQA commented Aug 22, 2019

Test build #109565 has finished for PR 25347 at commit aae4642.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA commented Aug 22, 2019

Test build #109573 has finished for PR 25347 at commit 7606b5c.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Aug 22, 2019

Test build #109585 has finished for PR 25347 at commit bd27f58.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan (Contributor)

@mgaido91 can you do a simple microbenchmark? If the performance overhead is not significant, we can use decimal as the sum buffer and provide a legacy config to use long as the buffer.

@mgaido91 (Contributor, Author)

I'll do that @cloud-fan, thanks.

@mgaido91 (Contributor, Author) commented Aug 27, 2019

@cloud-fan unfortunately the performance overhead is very significant. I ran the benchmarks in both modes.

Here you can see the code:

runBenchmark("aggregate without grouping") {
      val N = 500L << 22
      withSQLConf(SQLConf.SUM_DECIMAL_BUFFER_FOR_LONG.key -> "false") {
        codegenBenchmark("agg w/o group long buffer", N) {
          spark.range(N).selectExpr("sum(id)").collect()
        }
      }
      withSQLConf(SQLConf.SUM_DECIMAL_BUFFER_FOR_LONG.key -> "true") {
        codegenBenchmark("agg w/o group decimal buffer", N) {
          spark.range(N).selectExpr("sum(id)").collect()
        }
      }
    }

and here is the output (as you can see, the overhead is more than 10x on a simple sum of longs):

[info] Running benchmark: agg w/o group long buffer
[info]   Running case: agg w/o group long buffer wholestage off
[info]   Stopped after 2 iterations, 105407 ms
[info]   Running case: agg w/o group long buffer wholestage on
[info]   Stopped after 5 iterations, 6282 ms
[info] 
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
[info] Intel(R) Core(TM) i7-4558U CPU @ 2.80GHz
[info] agg w/o group long buffer:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] agg w/o group long buffer wholestage off          48538          52704         NaN         43,2          23,1       1,0X
[info] agg w/o group long buffer wholestage on            1231           1257          28       1703,6           0,6      39,4X
[info] 
[info] Running benchmark: agg w/o group decimal buffer
[info]   Running case: agg w/o group decimal buffer wholestage off
[info]   Stopped after 2 iterations, 1276890 ms
[info]   Running case: agg w/o group decimal buffer wholestage on
[info]   Stopped after 5 iterations, 496100 ms
[info] 
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
[info] Intel(R) Core(TM) i7-4558U CPU @ 2.80GHz
[info] agg w/o group decimal buffer:             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] agg w/o group decimal buffer wholestage off         633585         638445         NaN          3,3         302,1       1,0X
[info] agg w/o group decimal buffer wholestage on          92037          99220         NaN         22,8          43,9       6,9X

@cloud-fan (Contributor)

Now I'm wondering if it's worthwhile to sacrifice 10x performance to handle intermediate overflow. It's not about correctness but usability: without this change, queries are more likely to fail with overflow when summing up long values.

So it's a tradeoff between usability and performance. Given it's a 10x slowdown, I don't think any users would turn it on by default. If they hit an overflow exception later, I think they will just cast the data to decimal before the sum. That said, I think this feature is not very useful.
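That manual workaround would look roughly like this (illustrative only; df is a hypothetical DataFrame with a long column l):

// Explicitly widen to decimal before aggregating, paying the decimal cost only
// in the queries that actually need headroom beyond Long.MaxValue.
df.selectExpr("sum(cast(l as decimal(38, 0)))").show()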

Anyway, many thanks to @mgaido91 for investigating and providing the perf numbers!

@mgaido91 (Contributor, Author)

Yes, I agree the perf issue is very significant. Let me close this one, thank you all for the reviews!
