[SPARK-19509][SQL] Grouping Sets do not respect nullable grouping columns #16873
also cc @stanzhai
Test build #72645 has finished for PR 16873 at commit
DROP VIEW IF EXISTS grouping;
DROP VIEW IF EXISTS grouping_null;
Nit: Should we leave an empty line at the end of this file?
Yeah lemme fix that.
Thank you for ccing me, @hvanhovell! This PR looks good to me.
Test build #72653 has finished for PR 16873 at commit
I am merging this.
…umns

## What changes were proposed in this pull request?
The analyzer currently does not check whether a column used in grouping sets is itself nullable. As a result, the column's nullability can be reported incorrectly, which can cause null pointer exceptions down the line. This PR fixes that by also considering the nullability of the column.

This is only a problem for Spark 2.1 and below. The latest master uses a different approach.

Closes #16874

## How was this patch tested?
Added a regression test to `SQLQueryTestSuite.grouping_set`.

Author: Herman van Hovell <[email protected]>

Closes #16873 from hvanhovell/SPARK-19509.

(cherry picked from commit a3d5300)
Signed-off-by: Herman van Hovell <[email protected]>
What changes were proposed in this pull request?
The analyzer currently does not check whether a column used in grouping sets is itself nullable. As a result, the column's nullability can be reported incorrectly, which can cause null pointer exceptions down the line. This PR fixes that by also considering the nullability of the column.
This is only a problem for Spark 2.1 and below. The latest master uses a different approach.
Closes #16874
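The nullability rule described above can be modeled with a small sketch. This is a simplified illustration in Python, not the analyzer code from this patch: the `Column` class and the `output_nullable` function are hypothetical names introduced here, and the model assumes a grouping column is filled with NULL in every grouping set that omits it.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a grouping column and its input schema."""
    name: str
    nullable: bool  # nullability declared by the input relation


def output_nullable(col: Column, grouping_sets: List[FrozenSet[str]]) -> bool:
    """Return whether `col` can be NULL in the grouping sets output.

    Assumption: a column omitted from at least one grouping set is
    NULL-filled for the rows of that set. Looking only at that condition
    misses the case where the input column is nullable to begin with,
    so the column's own nullability is OR-ed in as well.
    """
    omitted_somewhere = any(col.name not in gs for gs in grouping_sets)
    return omitted_somewhere or col.nullable


# `a` appears in every grouping set but is nullable in the input.
a = Column("a", nullable=True)
b = Column("b", nullable=False)
sets = [frozenset({"a", "b"}), frozenset({"a"})]

print(output_nullable(a, sets))  # nullable because the input column is
print(output_nullable(b, sets))  # nullable because one set omits it
```

In this model, a check based only on `omitted_somewhere` would mark `a` as non-nullable even though its input values can be NULL, which is the kind of mismatch the description attributes to the analyzer.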
How was this patch tested?
Added a regression test to `SQLQueryTestSuite.grouping_set`.