Skip to content

Conversation

@gengliangwang
Copy link
Member

@gengliangwang gengliangwang commented Aug 17, 2021

What changes were proposed in this pull request?

Revert [SPARK-35028][SQL] ANSI mode: disallow group by aliases

Why are the changes needed?

It turns out that many users are using the group by alias feature. Spark has its precedence rule when alias names conflict with column names in Group by clause: always use the table column. This should be reasonable and acceptable.
Also, external DBMS such as PostgreSQL and MySQL allow grouping by alias, too.

As we are going to announce ANSI mode GA in Spark 3.2, I suggest allowing the group by alias in ANSI mode.

Does this PR introduce any user-facing change?

No, the feature is not released yet.

How was this patch tested?

Unit tests

@@ -1 +0,0 @@
--IMPORT group-analytics.sql No newline at end of file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to remove the result file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, removed

@SparkQA
Copy link

SparkQA commented Aug 17, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47034/

@SparkQA
Copy link

SparkQA commented Aug 17, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47034/

@SparkQA
Copy link

SparkQA commented Aug 17, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47049/

@SparkQA
Copy link

SparkQA commented Aug 17, 2021

Test build #142533 has finished for PR 33758 at commit 76c697e.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 17, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47049/

@gengliangwang
Copy link
Member Author

Merging to master/3.2

gengliangwang added a commit that referenced this pull request Aug 17, 2021
### What changes were proposed in this pull request?

Revert [[SPARK-35028][SQL] ANSI mode: disallow group by aliases ](#32129)

### Why are the changes needed?

It turns out that many users are using the group by alias feature.  Spark has its precedence rule when alias names conflict with column names in Group by clause: always use the table column. This should be reasonable and acceptable.
Also, external DBMS such as PostgreSQL and MySQL allow grouping by alias, too.

As we are going to announce ANSI mode GA in Spark 3.2, I suggest allowing the group by alias in ANSI mode.

### Does this PR introduce _any_ user-facing change?

No, the feature is not released yet.

### How was this patch tested?

Unit tests

Closes #33758 from gengliangwang/revertGroupByAlias.

Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
(cherry picked from commit 8bfb4f1)
Signed-off-by: Gengliang Wang <[email protected]>
@SparkQA
Copy link

SparkQA commented Aug 17, 2021

Test build #142547 has finished for PR 33758 at commit 8e0aec3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

+1, LGTM.

xuanyuanking pushed a commit to xuanyuanking/spark that referenced this pull request Sep 29, 2021
### What changes were proposed in this pull request?

Revert [[SPARK-35028][SQL] ANSI mode: disallow group by aliases ](apache#32129)

### Why are the changes needed?

It turns out that many users are using the group by alias feature.  Spark has its precedence rule when alias names conflict with column names in Group by clause: always use the table column. This should be reasonable and acceptable.
Also, external DBMS such as PostgreSQL and MySQL allow grouping by alias, too.

As we are going to announce ANSI mode GA in Spark 3.2, I suggest allowing the group by alias in ANSI mode.

### Does this PR introduce _any_ user-facing change?

No, the feature is not released yet.

### How was this patch tested?

Unit tests

Closes apache#33758 from gengliangwang/revertGroupByAlias.

Authored-by: Gengliang Wang <[email protected]>
Signed-off-by: Gengliang Wang <[email protected]>
(cherry picked from commit 8bfb4f1)
Signed-off-by: Gengliang Wang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants