Skip to content

Conversation

@Sephiroth-Lin
Copy link
Contributor

@Sephiroth-Lin Sephiroth-Lin commented Jun 6, 2016

What changes were proposed in this pull request?

CREATE TABLE cdr (
  debet_dt          int      ,
  srv_typ_cd        string   ,
  b_brnd_cd         smallint ,
  call_dur          int
)
ROW FORMAT delimited fields terminated by ','
STORED AS TEXTFILE;
SELECT debet_dt,
       SUM(CASE WHEN srv_typ_cd LIKE '0%' THEN call_dur / 60 ELSE 0 END)
FROM cdr
GROUP BY debet_dt
ORDER BY debet_dt;
== Analyzed Logical Plan ==
debet_dt: int, sum(CASE WHEN srv_typ_cd LIKE 0% THEN (call_dur / 60) ELSE 0 END): bigint
Project [debet_dt#16, sum(CASE WHEN srv_typ_cd LIKE 0% THEN (call_dur / 60) ELSE 0 END)#27L]
+- Sort [debet_dt#16 ASC], true
   +- Aggregate [debet_dt#16], [debet_dt#16, sum(cast(CASE WHEN srv_typ_cd#18 LIKE 0% THEN (cast(call_dur#21 as double) / cast(60 as double)) ELSE cast(0 as double) END as bigint)) AS sum(CASE WHEN srv_typ_cd LIKE 0% THEN (call_dur / 60) ELSE 0 END)#27L]
      +- MetastoreRelation default, cdr
SELECT debet_dt,
       SUM(CASE WHEN b_brnd_cd IN(1) THEN call_dur / 60 ELSE 0 END)
FROM cdr
GROUP BY debet_dt
ORDER BY debet_dt;
== Analyzed Logical Plan ==
debet_dt: int, sum(CASE WHEN (CAST(b_brnd_cd AS INT) IN (CAST(1 AS INT))) THEN (CAST(call_dur AS DOUBLE) / CAST(60 AS DOUBLE)) ELSE CAST(0 AS DOUBLE) END): double
Project [debet_dt#76, sum(CASE WHEN (CAST(b_brnd_cd AS INT) IN (CAST(1 AS INT))) THEN (CAST(call_dur AS DOUBLE) / CAST(60 AS DOUBLE)) ELSE CAST(0 AS DOUBLE) END)#87]
+- Sort [debet_dt#76 ASC], true
   +- Aggregate [debet_dt#76], [debet_dt#76, sum(CASE WHEN cast(b_brnd_cd#80 as int) IN (cast(1 as int)) THEN (cast(call_dur#81 as double) / cast(60 as double)) ELSE cast(0 as double) END) AS sum(CASE WHEN (CAST(b_brnd_cd AS INT) IN (CAST(1 AS INT))) THEN (CAST(call_dur AS DOUBLE) / CAST(60 AS DOUBLE)) ELSE CAST(0 AS DOUBLE) END)#87]
      +- MetastoreRelation default, cdr

The only difference is WHEN condition, but will result different output column type(one is bigint, one is double) , so update type coercion order.

How was this patch tested?

Add new UT

@SparkQA
Copy link

SparkQA commented Jun 6, 2016

Test build #60051 has finished for PR 13524 at commit 10b906a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@Sephiroth-Lin Sephiroth-Lin changed the title [SPARK-15776] Type coercion incorrect [SPARK-15776][SQL] Type coercion incorrect Jun 7, 2016
@SparkQA
Copy link

SparkQA commented Jun 10, 2016

Test build #60295 has finished for PR 13524 at commit e396735.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@watermen
Copy link
Contributor

@davies @rxin Can you review the code for this?

@rxin
Copy link
Contributor

rxin commented Jun 12, 2016

Can you put the description in the pull request description? it becomes part of the commit log and would be great to avoid requiring looking up jira.

@Sephiroth-Lin
Copy link
Contributor Author

@rxin Done. Pleas help review, thank you.

DecimalPrecision ::
BooleanEquality ::
StringToIntegralCasts ::
Division ::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be sure: are you moving the Division rule above the FunctionArgumentConversion rule in order to pass correct double arguments to functions? If so, could you add a test for this?

@clockfly
Copy link
Contributor

clockfly commented Jun 13, 2016

@Sephiroth-Lin

I think you can use a simpler case to reproduce this in the description of this PR.

Such as:

select sum(4/3)

The expected result is:

1.33333..

The actual result is:

1

@clockfly
Copy link
Contributor

clockfly commented Jun 14, 2016

This seems is a blocking issue.
I have created a PR at #13651, please help to review.

@rxin
Copy link
Contributor

rxin commented Jun 15, 2016

@Sephiroth-Lin can you close this pull request? Thanks.

asfgit pushed a commit that referenced this pull request Jun 15, 2016
…asted to wrong type

## What changes were proposed in this pull request?

This PR fixes the problem that Divide Expression inside Aggregation function is casted to wrong type, which cause `select 1/2` and `select sum(1/2)`returning different result.

**Before the change:**

```
scala> sql("select 1/2 as a").show()
+---+
|  a|
+---+
|0.5|
+---+

scala> sql("select sum(1/2) as a").show()
+---+
|  a|
+---+
|0  |
+---+

scala> sql("select sum(1 / 2) as a").schema
res4: org.apache.spark.sql.types.StructType = StructType(StructField(a,LongType,true))
```

**After the change:**

```
scala> sql("select 1/2 as a").show()
+---+
|  a|
+---+
|0.5|
+---+

scala> sql("select sum(1/2) as a").show()
+---+
|  a|
+---+
|0.5|
+---+

scala> sql("select sum(1/2) as a").schema
res4: org.apache.spark.sql.types.StructType = StructType(StructField(a,DoubleType,true))
```

## How was this patch tested?

Unit test.

This PR is based on #13524 by Sephiroth-Lin

Author: Sean Zhong <[email protected]>

Closes #13651 from clockfly/SPARK-15776.
asfgit pushed a commit that referenced this pull request Jun 15, 2016
…asted to wrong type

## What changes were proposed in this pull request?

This PR fixes the problem that Divide Expression inside Aggregation function is casted to wrong type, which cause `select 1/2` and `select sum(1/2)`returning different result.

**Before the change:**

```
scala> sql("select 1/2 as a").show()
+---+
|  a|
+---+
|0.5|
+---+

scala> sql("select sum(1/2) as a").show()
+---+
|  a|
+---+
|0  |
+---+

scala> sql("select sum(1 / 2) as a").schema
res4: org.apache.spark.sql.types.StructType = StructType(StructField(a,LongType,true))
```

**After the change:**

```
scala> sql("select 1/2 as a").show()
+---+
|  a|
+---+
|0.5|
+---+

scala> sql("select sum(1/2) as a").show()
+---+
|  a|
+---+
|0.5|
+---+

scala> sql("select sum(1/2) as a").schema
res4: org.apache.spark.sql.types.StructType = StructType(StructField(a,DoubleType,true))
```

## How was this patch tested?

Unit test.

This PR is based on #13524 by Sephiroth-Lin

Author: Sean Zhong <[email protected]>

Closes #13651 from clockfly/SPARK-15776.

(cherry picked from commit 9bd80ad)
Signed-off-by: Wenchen Fan <[email protected]>
@asfgit asfgit closed this in 1a33f2e Jun 15, 2016
@Sephiroth-Lin Sephiroth-Lin deleted the SPARK-15776 branch June 25, 2016 11:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants