[SPARK-33122][SQL][FOLLOWUP] Extend RemoveRedundantAggregates optimizer rule to apply to more cases #31914
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #136308 has finished for PR 31914 at commit
Hi, @tanelk.
Since the commit title is a precious resource, please don't repeat the original JIRA title. The JIRA ID is enough for that purpose. It would be great if you could give the PR a more meaningful and specific title.
@maropu, this is a follow-up to a PR you reviewed a while back, but it has gone unnoticed.
    val upperHasNoDuplicateSensitiveAgg = upper
      .aggregateExpressions
      .forall(expr => expr.find {
        case ae: AggregateExpression => !EliminateDistinct.isDuplicateAgnostic(ae.aggregateFunction)
        case e => AggregateExpression.isAggregate(e)
      }.isEmpty)
This is the only behaviour change.
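The check in the snippet above can be read as: the upper aggregate is safe only if none of its aggregate expressions contains an aggregate function whose result depends on duplicate rows. Below is a minimal, self-contained sketch of that idea. It uses a hypothetical toy expression model (`Expr`, `Col`, `Add`, `Agg`, `DuplicateSensitivity`), not Spark's Catalyst classes, and the set of duplicate-agnostic functions is an assumption made for illustration.

```scala
// Toy model, NOT Spark's Catalyst API: every name below is hypothetical.
sealed trait Expr { def children: Seq[Expr] }
case class Col(name: String) extends Expr { def children: Seq[Expr] = Nil }
case class Add(left: Expr, right: Expr) extends Expr { def children: Seq[Expr] = Seq(left, right) }
case class Agg(func: String, child: Expr) extends Expr { def children: Seq[Expr] = Seq(child) }

object DuplicateSensitivity {
  // Assumed set of functions whose result ignores repeated input rows.
  private val duplicateAgnostic = Set("max", "min")

  // Pre-order search for the first node matching `p`.
  def find(e: Expr)(p: Expr => Boolean): Option[Expr] =
    if (p(e)) Some(e)
    else e.children.iterator.map(c => find(c)(p)).collectFirst { case Some(x) => x }

  // True when no expression contains a duplicate-sensitive aggregate.
  // The toy model has no other kind of aggregate, so the fallback case is `false`.
  def hasNoDuplicateSensitiveAgg(exprs: Seq[Expr]): Boolean =
    exprs.forall(expr => find(expr) {
      case Agg(func, _) => !duplicateAgnostic(func)
      case _ => false
    }.isEmpty)
}
```

In this sketch, `Seq(Col("a"), Agg("max", Col("b")))` passes the check, while an expression list that nests `Agg("sum", ...)` anywhere inside it does not.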
Kubernetes integration test starting
Kubernetes integration test status success
Test build #138873 has finished for PR 31914 at commit
I've checked that the GA tests passed, so I will merge this. Thank you, @tanelk.
Merged to master.
What changes were proposed in this pull request?
Addressed @dongjoon-hyun's comments on the previous PR #30018.
Extended the `RemoveRedundantAggregates` rule to remove redundant aggregations in even more queries. For example, in a query that applies `dropDuplicates` before aggregating with `max`, the `dropDuplicates` is not needed, because the result of `max` does not depend on duplicate values.
Why are the changes needed?
Improve performance.
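The gain comes from skipping a deduplication step that cannot affect the result. The following sketch illustrates this with plain Scala collections rather than Spark; the rows, the object name `RedundantDedupDemo`, and the helper `groupAgg` are all hypothetical and exist only for this example.

```scala
object RedundantDedupDemo {
  // Hypothetical rows of (a, b); the duplicate (1, 10) is deliberate.
  val rows: Seq[(Int, Int)] = Seq((1, 10), (1, 10), (1, 7), (2, 3))

  // Group by the first column and aggregate the second column with `agg`.
  def groupAgg(data: Seq[(Int, Int)])(agg: Seq[Int] => Int): Map[Int, Int] =
    data.groupBy(_._1).map { case (k, vs) => k -> agg(vs.map(_._2)) }

  // max ignores duplicates, so removing them first changes nothing.
  val maxWithDedup: Map[Int, Int] = groupAgg(rows.distinct)(_.max)
  val maxWithoutDedup: Map[Int, Int] = groupAgg(rows)(_.max)

  // sum is duplicate-sensitive, so here the deduplication does matter.
  val sumWithDedup: Map[Int, Int] = groupAgg(rows.distinct)(_.sum)
  val sumWithoutDedup: Map[Int, Int] = groupAgg(rows)(_.sum)
}
```

With `max`, both variants produce the same map, so the deduplication is pure overhead. With `sum`, the two maps differ, which is why the rule has to check which aggregate functions the upper aggregate uses.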
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit tests.