[SPARK-9830] [SQL] Remove AggregateExpression1 and Aggregate Operator used to evaluate AggregateExpression1s #9556
…l remove this limitation.
Conflicts: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Utils.scala
…in DF's API), making all fields that may use children lazy vals.
YAY.
test this please
Test build #45342 has finished for PR 9556 at commit
retest this please
Test build #45350 has finished for PR 9556 at commit
The comment and the error message mention a new aggregation path. We should update this.
will update this.
@shivaram Seems this default value should be 0.05 instead of 0.95, right?
Yeah looks like it should be 0.05 -- cc @davies
This seems to have been added back when SparkR was a separate code-base in davies/SparkR-pkg@d7b17a4
Good catch, thanks!
test this please
Test build #45474 has finished for PR 9556 at commit
Test build #2027 has finished for PR 9556 at commit
Test build #45476 has finished for PR 9556 at commit
Test build #45492 has finished for PR 9556 at commit
agg2 -> agg
Done
We should keep reviewing and address any comments in a follow-up, but I'm going to merge this now to unblock other work. Thanks!
…used to evaluate AggregateExpression1s

https://issues.apache.org/jira/browse/SPARK-9830

This PR contains the following main changes.
* Removing `AggregateExpression1`.
* Removing `Aggregate` operator, which is used to evaluate `AggregateExpression1`.
* Removing planner rule used to plan `Aggregate`.
* Linking `MultipleDistinctRewriter` to analyzer.
* Renaming `AggregateExpression2` to `AggregateExpression` and `AggregateFunction2` to `AggregateFunction`.
* Updating places where we create aggregate expression. The way to create aggregate expressions is `AggregateExpression(aggregateFunction, mode, isDistinct)`.
* Changing `val`s in `DeclarativeAggregate`s that touch children of this function to `lazy val`s (when we create aggregate expression in DataFrame API, children of an aggregate function can be unresolved).

Author: Yin Huai <[email protected]>

Closes #9556 from yhuai/removeAgg1.

(cherry picked from commit e0701c7)
Signed-off-by: Michael Armbrust <[email protected]>
Should we remove the NullType here?
yeah, we should remove it. I somehow missed it.
Done
… and update toString of Exchange https://issues.apache.org/jira/browse/SPARK-9830 This is the follow-up pr for #9556 to address davies' comments. Author: Yin Huai <[email protected]> Closes #9607 from yhuai/removeAgg1-followup. (cherry picked from commit 3121e78) Signed-off-by: Reynold Xin <[email protected]>
https://issues.apache.org/jira/browse/SPARK-9830
This PR contains the following main changes.
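* Removing `AggregateExpression1`.
* Removing `Aggregate` operator, which is used to evaluate `AggregateExpression1`.
* Removing planner rule used to plan `Aggregate`.
* Linking `MultipleDistinctRewriter` to analyzer.
* Renaming `AggregateExpression2` to `AggregateExpression` and `AggregateFunction2` to `AggregateFunction`.
* Updating places where we create aggregate expression. The way to create aggregate expressions is `AggregateExpression(aggregateFunction, mode, isDistinct)`.
* Changing `val`s in `DeclarativeAggregate`s that touch children of this function to `lazy val`s (when we create aggregate expression in DataFrame API, children of an aggregate function can be unresolved).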
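
To illustrate the last item, here is a minimal, self-contained sketch of why a `val` that reads a child is a problem when the child can still be unresolved, and why a `lazy val` avoids it. Everything in it is a hypothetical stand-in and not Spark's actual code: `Expression`, `UnresolvedAttribute`, `AttributeReference`, and `LongType` are simplified toy versions of the Catalyst concepts, and `EagerSum` / `LazySum` are invented names for an aggregate function written both ways.

```scala
object LazyValSketch {
  // Toy stand-ins only; these are NOT the Catalyst classes.
  sealed trait DataType
  case object LongType extends DataType

  trait Expression {
    def resolved: Boolean
    def dataType: DataType
  }

  // A column reference whose type is unknown until analysis has run.
  final case class UnresolvedAttribute(name: String) extends Expression {
    val resolved = false
    def dataType: DataType =
      throw new UnsupportedOperationException(
        s"dataType called on unresolved attribute '$name'")
  }

  // The same column after analysis, with a known type.
  final case class AttributeReference(name: String, dataType: DataType)
      extends Expression {
    val resolved = true
  }

  // Eager variant: the `val` reads child.dataType while the object is being
  // constructed, so building it over an unresolved child throws immediately.
  final case class EagerSum(child: Expression) extends Expression {
    val resultType: DataType = child.dataType
    def resolved: Boolean = child.resolved
    def dataType: DataType = resultType
  }

  // Lazy variant: child.dataType is read only on first access of resultType,
  // which in this sketch happens after the child has been resolved.
  final case class LazySum(child: Expression) extends Expression {
    lazy val resultType: DataType = child.dataType
    def resolved: Boolean = child.resolved
    def dataType: DataType = resultType
  }

  def main(args: Array[String]): Unit = {
    import scala.util.Try

    // Construction over an unresolved child: eager fails, lazy succeeds.
    println(Try(EagerSum(UnresolvedAttribute("a"))).isFailure)
    println(Try(LazySum(UnresolvedAttribute("a"))).isSuccess)

    // Once the child is resolved, the lazy field evaluates normally.
    println(LazySum(AttributeReference("a", LongType)).dataType)
  }
}
```

The trade-off in this sketch is that the failure moves from construction time to first access, so code that reads `resultType` still has to run after resolution.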