-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-8010][SQL]Promote types to StringType as implicit conversion in non-binary expression of HiveTypeCoercion #6551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@chenghao-intel Could u please review this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
numericPrecedence doesn't contain StringType, you need to write a separate case like case (t1 @ StringType(), t2 @ NumericType()).
|
|
|
It's better to update the rule for Beside, can you also add a unit test to compare the hive answer? |
|
Thanks, @cloud-fan and @chenghao-intel. I will update it tomorrow night. |
|
@cloud-fan and @chenghao-intel. Updated, Any more comments? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you compare the result with Hive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And probably it's better to move the test into https://github.com/OopsOutOfMemory/spark/blob/pnts/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
|
@cloud-fan and @chenghao-intel |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about ArrayType, StructType, etc.? Does hive has a document for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, @cloud-fan
you can refer the chart at the bottom of page in https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types
This is only for primitive type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I think we should add t2.isInstanceOf[AtomicType] to make sure it's primitive type.
|
@cloud-fan That's make sense, thanks. Updated. |
|
ping.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not make this a function instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and make it private
|
@cloud-fan and @yhuai can you both take another look at this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @rxin , findTightestCommonTypeOfTwo is a function and I think we should make them consistent...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cloud-fan I'm sorry, I don't quite understand when to use a function or when to use a method. And what's the purpose here to use a function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we don't care about the parameter names(i.e. t1 and t2). Using functions, we can just write the partially applied function(i.e. {case ...}) and don't need to add (t1, t2) match before it.
However, it's not a big deal, using method is OK, but we should make it consistent with existing code.
|
ping... |
|
LGTM, but we need a more general rule to handle these promote-string stuff... |
|
ok to test |
|
Test build #34894 has finished for PR 6551 at commit
|
|
Thanks! Merging to master. |
… in non-binary expression of HiveTypeCoercion 1. Given a query `select coalesce(null, 1, '1') from dual` will cause exception: java.lang.RuntimeException: Could not determine return type of Coalesce for IntegerType,StringType 2. Given a query: `select case when true then 1 else '1' end from dual` will cause exception: java.lang.RuntimeException: Types in CASE WHEN must be the same or coercible to a common type: StringType != IntegerType I checked the code, the main cause is the HiveTypeCoercion doesn't do implicit convert when there is a IntegerType and StringType. Numeric types can be promoted to string type Hive will always do this implicit conversion. Author: OopsOutOfMemory <[email protected]> Closes apache#6551 from OopsOutOfMemory/pnts and squashes the following commits: 7a209d7 [OopsOutOfMemory] rebase master 6018613 [OopsOutOfMemory] convert function to method 4cd5618 [OopsOutOfMemory] limit the data type to primitive type df365d2 [OopsOutOfMemory] refine 95cbd58 [OopsOutOfMemory] fix style 403809c [OopsOutOfMemory] promote non-string to string when can not found tighestCommonTypeOfTwo
select coalesce(null, 1, '1') from dualwill cause exception:java.lang.RuntimeException: Could not determine return type of Coalesce for IntegerType,StringType
select case when true then 1 else '1' end from dualwill cause exception:java.lang.RuntimeException: Types in CASE WHEN must be the same or coercible to a common type: StringType != IntegerType
I checked the code, the main cause is the HiveTypeCoercion doesn't do implicit convert when there is a IntegerType and StringType.
Numeric types can be promoted to string type
Hive will always do this implicit conversion.