[SPARK-31146][SQL] Leverage the helper method for aliasing in built-in SQL expressions #27901
Retest this please.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
select ln(1.2345678e-28)
-- !query schema
struct<LOG(1.2345678E-28):double>
struct<log(1.2345678E-28):double>
Can we have ln instead of log?
I actually realised that there are some more instances, such as `ToDegrees`, `ToRadians`, `UnaryMinus` and `UnaryPositive`. Let me exclude these to keep the scope of this PR smaller.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
Oh, the R failure is a relevant one.
Looks nice. For easy trackability in commit logs, could you add a list of the functions (that this PR adds alias flags for) in the PR description?
sql/core/src/test/resources/sql-tests/results/postgreSQL/float8.sql.out
cc @cloud-fan too - while searching the commit history, I realised that you have touched a lot of the code related to this.
retest this please
Test build #119813 has finished for PR 27901 at commit
...alyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala
viirya left a comment
If user code refers to the output column name, will this change break it? Should we update the migration guide too?
I don't think we have made any guarantee on the output column name. Also, I would consider this rather a bug fix. We have already made many such changes in the past.
...alyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CallMethodViaReflection.scala
dongjoon-hyun left a comment
+1, LGTM. Thank you, @HyukjinKwon and all.
Merged to master/3.0.
…n SQL expressions

### What changes were proposed in this pull request?

This PR is kind of a followup of #26808. It leverages the helper method for aliasing in built-in SQL expressions to use the alias as its output column name where it's applicable.

- `Expression`, `UnaryMathExpression` and `BinaryMathExpression` search the alias in the tags by default.
- When the naming is different in its implementation, it has to be overwritten for the expression specifically. E.g., `CallMethodViaReflection`, `Remainder`, `CurrentTimestamp`, `FormatString` and `XPathDouble`.

This PR fixes the aliases of the functions below:

| class                     | alias              |
|---------------------------|--------------------|
| `Rand`                    | `random`           |
| `Ceil`                    | `ceiling`          |
| `Remainder`               | `mod`              |
| `Pow`                     | `pow`              |
| `Signum`                  | `sign`             |
| `Chr`                     | `char`             |
| `Length`                  | `char_length`      |
| `Length`                  | `character_length` |
| `FormatString`            | `printf`           |
| `Substring`               | `substr`           |
| `Upper`                   | `ucase`            |
| `XPathDouble`             | `xpath_number`     |
| `DayOfMonth`              | `day`              |
| `CurrentTimestamp`        | `now`              |
| `Size`                    | `cardinality`      |
| `Sha1`                    | `sha`              |
| `CallMethodViaReflection` | `java_method`      |

Note: `EqualTo`, `=` and `==` aliases were excluded because it's unable to leverage this helper method. It should fix the parser.

Note: this PR also excludes some instances such as `ToDegrees`, `ToRadians`, `UnaryMinus` and `UnaryPositive` that needs an explicit name overwritten to make the scope of this PR smaller.

### Why are the changes needed?

To respect expression name.

### Does this PR introduce any user-facing change?

Yes, it will change the output column name.

### How was this patch tested?

Manually tested, and unittests were added.

Closes #27901 from HyukjinKwon/31146.

Authored-by: HyukjinKwon <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 6704103)
Signed-off-by: Dongjoon Hyun <[email protected]>
Thank you guys!
We need to update the migration guide. This impacts the output schema of SQL queries.
And I think we also don't guarantee output column names (#27901 (comment)).
### What changes were proposed in this pull request?

This PR is kind of a followup of #26808. It leverages the helper method for aliasing in built-in SQL expressions to use the alias as its output column name where it's applicable.

- `Expression`, `UnaryMathExpression` and `BinaryMathExpression` search the alias in the tags by default.
- When the naming is different in its implementation, it has to be overwritten for the expression specifically. E.g., `CallMethodViaReflection`, `Remainder`, `CurrentTimestamp`, `FormatString` and `XPathDouble`.

This PR fixes the aliases of the functions below:

| class                     | alias              |
|---------------------------|--------------------|
| `Rand`                    | `random`           |
| `Ceil`                    | `ceiling`          |
| `Remainder`               | `mod`              |
| `Pow`                     | `pow`              |
| `Signum`                  | `sign`             |
| `Chr`                     | `char`             |
| `Length`                  | `char_length`      |
| `Length`                  | `character_length` |
| `FormatString`            | `printf`           |
| `Substring`               | `substr`           |
| `Upper`                   | `ucase`            |
| `XPathDouble`             | `xpath_number`     |
| `DayOfMonth`              | `day`              |
| `CurrentTimestamp`        | `now`              |
| `Size`                    | `cardinality`      |
| `Sha1`                    | `sha`              |
| `CallMethodViaReflection` | `java_method`      |

Note: `EqualTo`, `=` and `==` aliases were excluded because it's unable to leverage this helper method. It should fix the parser.

Note: this PR also excludes some instances such as `ToDegrees`, `ToRadians`, `UnaryMinus` and `UnaryPositive` that needs an explicit name overwritten to make the scope of this PR smaller.

### Why are the changes needed?

To respect expression name.

### Does this PR introduce any user-facing change?

Yes, it will change the output column name.

### How was this patch tested?

Manually tested, and unittests were added.
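
For readers who want a concrete picture of the two rules in the description (look the alias up in the tags by default, and override the naming where an expression renders itself differently), here is a minimal, self-contained sketch. It is not the Spark implementation: `AliasSketch`, `Expr`, `FuncAlias`, `Lit` and `lookup` are hypothetical names invented for illustration, and the toy `Ceil` and `Remainder` classes only mimic the shape of the Catalyst expressions of the same name.

```scala
// A toy model of alias-aware naming. All names here are illustrative
// assumptions, not Spark's actual classes or methods.
object AliasSketch {
  // Hypothetical tag key under which the registry records the function
  // name the user actually typed.
  val FuncAlias: String = "FUNC_ALIAS"

  abstract class Expr {
    private val tags = scala.collection.mutable.Map.empty[String, String]

    def setTag(key: String, value: String): Unit = { tags.update(key, value) }
    def getTag(key: String): Option[String] = tags.get(key)

    // Canonical name used when no alias was recorded.
    def name: String
    def children: Seq[Expr]

    // Default rule: prefer the alias stored in the tags, else the canonical name.
    def prettyName: String = getTag(FuncAlias).getOrElse(name)

    def sql: String = {
      val args = children.map(_.sql).mkString(", ")
      s"$prettyName($args)"
    }
  }

  case class Lit(value: String) extends Expr {
    def name: String = "lit"
    def children: Seq[Expr] = Nil
    override def sql: String = value
  }

  // Follows the default rule, so no override is needed.
  case class Ceil(child: Expr) extends Expr {
    def name: String = "ceil"
    def children: Seq[Expr] = Seq(child)
  }

  // Renders as an infix operator by default, so the naming has to be
  // overridden specifically to honour a function-style alias such as `mod`.
  case class Remainder(left: Expr, right: Expr) extends Expr {
    def name: String = "%"
    def children: Seq[Expr] = Seq(left, right)
    override def sql: String = getTag(FuncAlias) match {
      case Some(alias) => s"$alias(${left.sql}, ${right.sql})"
      case None        => s"(${left.sql} % ${right.sql})"
    }
  }

  // Toy registry: builds the expression and tags it with the name used to find it.
  def lookup(funcName: String, args: Expr*): Expr = {
    val expr: Expr = funcName match {
      case "ceil" | "ceiling" => Ceil(args(0))
      case "mod"              => Remainder(args(0), args(1))
      case other => throw new IllegalArgumentException(s"Undefined function: $other")
    }
    expr.setTag(FuncAlias, funcName)
    expr
  }
}
```

In this sketch, `AliasSketch.lookup("ceiling", AliasSketch.Lit("1.5")).sql` yields `ceiling(1.5)` rather than `ceil(1.5)`, and `AliasSketch.lookup("mod", AliasSketch.Lit("7"), AliasSketch.Lit("3")).sql` yields `mod(7, 3)`, while a `Remainder` built directly (with no tag) still renders as `(7 % 3)`.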