[SPARK-24790][SQL] Allow complex aggregate expressions in Pivot #21753
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,5 @@ | ||
| -- Automatically generated by SQLQueryTestSuite | ||
| - -- Number of queries: 13 | ||
| + -- Number of queries: 15 | ||
| -- !query 0 | ||
| @@ -176,7 +176,7 @@ PIVOT ( | ||
| struct<> | ||
| -- !query 11 output | ||
| org.apache.spark.sql.AnalysisException | ||
| - Aggregate expression required for pivot, found 'abs(earnings#x)'; | ||
| + Aggregate expression required for pivot, but 'coursesales.`earnings`' did not appear in any aggregate function.; | ||
| -- !query 12 | ||
| @@ -192,3 +192,33 @@ struct<> | ||
| -- !query 12 output | ||
| org.apache.spark.sql.AnalysisException | ||
| cannot resolve '`year`' given input columns: [__auto_generated_subquery_name.course, __auto_generated_subquery_name.earnings]; line 4 pos 0 | ||
| -- !query 13 | ||
| SELECT * FROM ( | ||
| SELECT year, course, earnings FROM courseSales | ||
| ) | ||
| PIVOT ( | ||
| ceil(sum(earnings)), avg(earnings) + 1 as a1 | ||
| FOR course IN ('dotNET', 'Java') | ||
| ) | ||
| -- !query 13 schema | ||
| struct<year:int,dotNET_CEIL(sum(CAST(earnings AS BIGINT))):bigint,dotNET_a1:double,Java_CEIL(sum(CAST(earnings AS BIGINT))):bigint,Java_a1:double> | ||
| -- !query 13 output | ||
| 2012 15000 7501.0 20000 20001.0 | ||
| 2013 48000 48001.0 30000 30001.0 | ||
| -- !query 14 | ||
| SELECT * FROM ( | ||
| SELECT year, course, earnings FROM courseSales | ||
| ) | ||
| PIVOT ( | ||
| sum(avg(earnings)) | ||
| FOR course IN ('dotNET', 'Java') | ||
| ) | ||
| -- !query 14 schema | ||
| struct<> | ||
| -- !query 14 output | ||
| org.apache.spark.sql.AnalysisException | ||
| It is not allowed to use an aggregate function in the argument of another aggregate function. Please use the inner aggregate function in a sub-query.; | ||
Member
Is this test related to this PR? I think the output does not change with or without this PR.
Contributor (Author)
You are right. I think it's still worth adding such a test for pivot.
Member
Adding this test is just to improve the test coverage. It looks reasonable.
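The two error messages exercised by queries 11 and 14 can be read as two structural checks on each pivot aggregate expression: every column must sit inside an aggregate function, and an aggregate function must not appear inside another one. The following is a minimal Python sketch of those checks over a toy expression tree. It is not Spark's Analyzer code; `Expr`, `AGG_FUNCS`, `col`, `lit`, `call`, and `check_pivot_aggregate` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical set of aggregate function names for this toy model.
AGG_FUNCS = {"sum", "avg", "min", "max", "count"}


@dataclass(frozen=True)
class Expr:
    kind: str                      # "col", "lit", or "call"
    name: str
    args: Tuple["Expr", ...] = ()


def col(name: str) -> Expr:
    return Expr("col", name)


def lit(value) -> Expr:
    return Expr("lit", str(value))


def call(name: str, *args: Expr) -> Expr:
    return Expr("call", name, tuple(args))


def check_pivot_aggregate(expr: Expr, inside_agg: bool = False) -> None:
    """Raise ValueError if expr violates either rule.

    This sketch does not reject an expression made only of literals.
    """
    if expr.kind == "col" and not inside_agg:
        raise ValueError(
            f"Aggregate expression required for pivot, but '{expr.name}' "
            "did not appear in any aggregate function."
        )
    if expr.kind == "call":
        is_agg = expr.name in AGG_FUNCS
        if is_agg and inside_agg:
            raise ValueError(
                "It is not allowed to use an aggregate function in the "
                "argument of another aggregate function."
            )
        # Arguments of an aggregate (or of anything nested in one) count
        # as being inside an aggregate.
        for arg in expr.args:
            check_pivot_aggregate(arg, inside_agg or is_agg)
```

Under these assumptions, `call("ceil", call("sum", col("earnings")))` and `call("+", call("avg", col("earnings")), lit(1))` pass, while `call("abs", col("earnings"))` and `call("sum", call("avg", col("earnings")))` raise the first and second error respectively.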
Member
The general principle in our Analyzer is to do the error handling in …
I created a JIRA for this support: https://issues.apache.org/jira/browse/SPARK-24796
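To see why query 13 in the diff produces the values it does, here is a plain-Python sketch (no Spark involved) that emulates the pivot by grouping on `year` and evaluating `ceil(sum(earnings))` and `avg(earnings) + 1` for each course. The rows in `COURSE_SALES` are an assumption: the diff does not show the contents of `courseSales`, so they were chosen only to be consistent with the printed output. The helper name `pivot_query_13` is hypothetical.

```python
import math
from collections import defaultdict

# Assumed sample rows as (course, year, earnings); not taken from the diff.
COURSE_SALES = [
    ("dotNET", 2012, 10000),
    ("Java", 2012, 20000),
    ("dotNET", 2012, 5000),
    ("dotNET", 2013, 48000),
    ("Java", 2013, 30000),
]


def pivot_query_13(rows, courses=("dotNET", "Java")):
    """Emulate: PIVOT (ceil(sum(earnings)), avg(earnings) + 1 FOR course IN (...))."""
    # Collect earnings per (year, course) for the pivoted course values only.
    groups = defaultdict(list)
    for course, year, earnings in rows:
        if course in courses:
            groups[(year, course)].append(earnings)

    result = []
    for year in sorted({year for _, year, _ in rows}):
        out = [year]
        for course in courses:
            values = groups.get((year, course))
            if values:
                total = sum(values)
                out += [math.ceil(total), total / len(values) + 1]
            else:
                # No matching rows: both pivoted columns are null.
                out += [None, None]
        result.append(tuple(out))
    return result


for row in pivot_query_13(COURSE_SALES):
    print(*row)
```

Each output tuple follows the column order of the query 13 schema: `year`, then the two aggregate expressions for `dotNET`, then the same two for `Java`.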