[SPARK-13957] [SQL] Support Group By Ordinal in SQL #11846
|
Test build #53723 has finished for PR 11846 at commit
|
# Conflicts: # sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
val newGroups = groups.map {
  case IntegerIndex(index) if index > 0 && index <= aggs.size =>
    aggs(index - 1) match {
      case Alias(c, _) if c.isInstanceOf[AggregateExpression] =>
How about sum(a) + 1? I think we need to use TreeNode.find to check whether there are any aggregate functions inside it.
We already have a method called containsAggregate somewhere; we should call it here.
uh, yeah! let me fix it and add a test case. Thanks!
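The point raised in this thread can be illustrated with a minimal sketch. It does not use Catalyst itself: `Expr`, `Attr`, `Lit`, `Add`, `Sum`, `Alias`, and `topLevelOnly` are hypothetical toy stand-ins, and `find` and `containsAggregate` are simplified imitations of the helpers named above, not their real implementations. The sketch shows why checking only the expression directly under the alias misses an aggregate nested inside `sum(a) + 1`.

```scala
// Toy stand-ins for Catalyst expressions; every name here is hypothetical.
object ContainsAggregateSketch {
  sealed trait Expr {
    def children: Seq[Expr]
    // Pre-order search over the tree, in the spirit of TreeNode.find.
    def find(p: Expr => Boolean): Option[Expr] =
      if (p(this)) Some(this)
      else children.foldLeft(Option.empty[Expr])((acc, c) => acc.orElse(c.find(p)))
  }
  case class Attr(name: String) extends Expr { def children: Seq[Expr] = Nil }
  case class Lit(value: Int) extends Expr { def children: Seq[Expr] = Nil }
  case class Add(left: Expr, right: Expr) extends Expr { def children: Seq[Expr] = Seq(left, right) }
  case class Sum(child: Expr) extends Expr { def children: Seq[Expr] = Seq(child) }
  case class Alias(child: Expr, name: String) extends Expr { def children: Seq[Expr] = Seq(child) }

  // Shape of the check in the quoted diff: true only if the aliased
  // expression itself is an aggregate.
  def topLevelOnly(e: Expr): Boolean = e match {
    case Alias(c, _) if c.isInstanceOf[Sum] => true
    case _ => false
  }

  // Shape of the suggested check: true if an aggregate appears anywhere in the tree.
  def containsAggregate(e: Expr): Boolean = e.find(_.isInstanceOf[Sum]).isDefined
}
```

With `Alias(Add(Sum(Attr("a")), Lit(1)), "s")`, the top-level check returns false while the tree search returns true, which is the case the review comment asks to cover.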
|
Test build #53737 has finished for PR 11846 at commit
|
|
Test build #53746 has finished for PR 11846 at commit
|
} else {
  val expanded = a.aggregateExpressions.flatMap {
    case s: Star => s.expand(a.child, resolver)
    case u @ UnresolvedAlias(_: Star, _) => expandStarExpression(u.child, a.child) :: Nil
when will we hit this branch?
select * from tab group by col1, col2
But why doesn't Project have this case?
I think this is intentionally added by CatalystQl. I can double check if this is the root cause.
After reading the code, Project still has a problem in star expansion:
val structDf = testData2.select("a", "b").as("record")
structDf.select(hash($"record.*"))
Sorry, the previous PR does not cover all the cases. Let me submit a separate PR to handle all the star expansion.
Of course, if we want to limit the support of star expansion in group by, we can do it for sure.
|
Test build #53876 has finished for PR 11846 at commit
|
# Conflicts: # sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
|
Test build #54059 has finished for PR 11846 at commit
|
|
retest this please. |
|
Test build #54098 has finished for PR 11846 at commit
|
// which is a 1-base position of the projection list.
case s @ Sort(orders, global, child)
    if conf.orderByOrdinal && orders.exists(o => IntegerIndex.unapply(o.child).nonEmpty) =>
    if conf.orderByOrdinal && child.resolved &&
We can add a case plan if !plan.childrenResolved => plan at the beginning.
Sure, let me do it. Thanks!
I will use p instead of plan, since plan triggers an IntelliJ warning about possible shadowing.
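The guard suggested in this thread can be sketched roughly as follows. This is a toy model, not the Analyzer rule: `Plan`, `Relation`, `Sort`, `rule`, and `applyRule` are hypothetical names, and the ordinal handling is reduced to looking up column names. It only illustrates how a leading `case p if !p.childrenResolved => p` lets later cases drop their own child-resolution checks.

```scala
// Toy model of a rule with a leading "children not resolved" guard.
// All types and names are hypothetical, not Spark's.
object ChildrenResolvedGuardSketch {
  sealed trait Plan {
    def children: Seq[Plan]
    def resolved: Boolean
    def childrenResolved: Boolean = children.forall(_.resolved)
  }
  case class Relation(output: Seq[String], resolved: Boolean) extends Plan {
    def children: Seq[Plan] = Nil
  }
  // `ordinals` are 1-based positions not yet replaced; `sortColumns` are resolved names.
  case class Sort(ordinals: Seq[Int], sortColumns: Seq[String], child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
    def resolved: Boolean = child.resolved && ordinals.isEmpty
  }

  val rule: PartialFunction[Plan, Plan] = {
    // Guard first: skip any node whose children are not resolved yet.
    case p if !p.childrenResolved => p
    // Later cases can now assume resolved children.
    case Sort(ordinals, cols, child: Relation) if ordinals.nonEmpty =>
      Sort(Nil, cols ++ ordinals.map(i => child.output(i - 1)), child)
  }

  // Leave the plan unchanged when no case matches.
  def applyRule(plan: Plan): Plan = rule.applyOrElse(plan, (p: Plan) => p)
}
```

In this sketch a `Sort` over an unresolved `Relation` is returned untouched, and the same `Sort` over a resolved `Relation` has its ordinals replaced by column names.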
|
LGTM except one minor comment, thanks for working on it! |
|
Thank you for your detailed review! :-) |
|
retest this please |
|
Test build #54138 has finished for PR 11846 at commit
|
|
Thanks, merging to master! |
What changes were proposed in this pull request?
This PR adds support for GROUP BY position (ordinal) in SQL. When the GROUP BY clause of a query contains integer ordinals, they are recognized as positions in the select list, and Analyzer converts them to the corresponding select-list expressions. This is controlled by the config option spark.sql.groupByOrdinal.
Note: This PR is taken from #10731. When merging this PR, please give the credit to @zhichao-li
Also cc all the people who are involved in the previous discussion: @rxin @cloud-fan @marmbrus @yhuai @hvanhovell @adrian-wang @chenghao-intel @tejasapatil
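As a rough illustration of the ordinal substitution described in this PR, the sketch below works on plain strings rather than Catalyst expressions. The names `GroupItem`, `Ordinal`, `Column`, and `resolveOrdinals` are hypothetical, the query is an invented example, and returning an error message for an out-of-range position is an assumption of the sketch rather than something shown in the diff.

```scala
// Toy illustration: a hypothetical query such as
//   SELECT a, b, sum(c) FROM t GROUP BY 1, 2
// would be treated like
//   SELECT a, b, sum(c) FROM t GROUP BY a, b
object GroupByOrdinalSketch {
  sealed trait GroupItem
  case class Ordinal(position: Int) extends GroupItem // 1-based index into the select list
  case class Column(expr: String) extends GroupItem   // ordinary grouping expression

  def resolveOrdinals(
      selectList: Seq[String],
      groupBy: Seq[GroupItem]): Either[String, Seq[String]] = {
    val resolved: Seq[Either[String, String]] = groupBy.map {
      // Same bounds check as in the quoted diff: index > 0 && index <= size.
      case Ordinal(i) if i > 0 && i <= selectList.size => Right(selectList(i - 1))
      // Assumption of this sketch: out-of-range positions are reported as errors.
      case Ordinal(i) => Left(s"GROUP BY position $i is outside the select list")
      case Column(expr) => Right(expr)
    }
    resolved.collectFirst { case Left(msg) => msg } match {
      case Some(msg) => Left(msg)
      case None => Right(resolved.collect { case Right(e) => e })
    }
  }
}
```

The sketch ignores the additional rule discussed in the review, namely that an ordinal must not point at a select-list entry containing an aggregate function.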
How was this patch tested?
Added both positive and negative test cases.