-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-27604][SQL] Enhance constant propagation #24553
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-27604][SQL] Enhance constant propagation #24553
Conversation
|
This PR is extracted from #24495 as @dongjoon-hyun suggested to make review easier. |
mgaido91
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@peter-toth may you please also add some UTs? eg. the example you added in the scaladoc may be also formalized with a UT... thanks.
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
Show resolved
Hide resolved
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
Show resolved
Hide resolved
|
@mgaido91, I've moved the relevant UTs from the other PR here. |
|
ok to test |
1 similar comment
|
ok to test |
|
Test build #105274 has finished for PR 24553 at commit
|
|
retest this please |
...core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala
Show resolved
Hide resolved
|
Test build #105275 has finished for PR 24553 at commit
|
|
Test build #105278 has finished for PR 24553 at commit
|
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala
Outdated
Show resolved
Hide resolved
|
Test build #105283 has finished for PR 24553 at commit
|
|
Any comments or suggestions @dongjoon-hyun or @gatorsmile ? |
|
@dongjoon-hyun @gatorsmile @mgaido91 @viirya @HyukjinKwon any feedback or suggestion is very welcome. |
mgaido91
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the change makes sense in general, just some questions on the tests. Thanks @peter-toth
...core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala
Show resolved
Hide resolved
...talyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ConstantPropagationSuite.scala
Outdated
Show resolved
Hide resolved
...talyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ConstantPropagationSuite.scala
Outdated
Show resolved
Hide resolved
|
Thanks so much @mgaido91 for the review. Please let me know if you have any more questions. |
|
Test build #105769 has finished for PR 24553 at commit
|
...talyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ConstantPropagationSuite.scala
Outdated
Show resolved
Hide resolved
...core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala
Show resolved
Hide resolved
|
Test build #105780 has finished for PR 24553 at commit
|
|
LGTM, thanks! |
|
@dongjoon-hyun @gatorsmile @cloud-fan @viirya @HyukjinKwon could you please help me to review this PR? |
HyukjinKwon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a cursory look, seems making sense but I guess @hvanhovell should take a look as well.
|
retest this please |
|
Test build #106493 has finished for PR 24553 at commit
|
827cfe5 to
054624e
Compare
|
Test build #106794 has finished for PR 24553 at commit
|
|
retest this please |
|
retest this please |
|
Test build #116717 has finished for PR 24553 at commit
|
|
Test build #116721 has finished for PR 24553 at commit
|
39c563d to
85d5430
Compare
|
Test build #116792 has finished for PR 24553 at commit
|
|
Test build #116861 has finished for PR 24553 at commit
|
|
Test build #116866 has finished for PR 24553 at commit
|
|
Test build #116980 has finished for PR 24553 at commit
|
4f0a89a to
5d67ffd
Compare
|
Test build #117354 has finished for PR 24553 at commit
|
5d67ffd to
767233e
Compare
|
Test build #117514 has finished for PR 24553 at commit
|
767233e to
6c33c71
Compare
|
Test build #117661 has finished for PR 24553 at commit
|
6c33c71 to
63ea687
Compare
|
Test build #117663 has finished for PR 24553 at commit
|
|
@cloud-fan, @dongjoon-hyun, @maropu I improved |
|
Test build #118355 has finished for PR 24553 at commit
|
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
What changes were proposed in this pull request?
This PR improves
ConstantPropagationrule, the new implementation:e.g.
a = 1 AND a = 2=>a = 1 AND 1 = 2Before this PR this unsatisfiable filter was not reduced:
e.g.
abs(a) = 5 AND b = abs(a) + 3=>abs(a) = 5 AND b = 8EqualToandEqualNullSafeexpressions:e.g.
a = 5 AND b < a + 3=>a = 5 AND b < 8Projectand other nodes (except forJoin):e.g.
SELECT a = 5 AND b = a + 3=>SELECT a = 5 AND b = 8Before this PR only
Filterwas supported.e.g.
... WHERE IF(..., a = 5 AND b = a + 3, ...)=>... WHERE IF(..., a = 5 AND b = 8, ...)Before this PR only top level
And/Or/Notnodes were supported.... WHERE a = c AND f(a)orIF(a = c AND f(a), ..., ...)whereais a nullable expression andcis a constant thenullresult ofa = c AND f(a)means the same as if it resultedfalsetherefore constant propagation can be safely applied (a = c AND f(a)=>a = c AND f(c)).SELECT a = c AND f(a)thenullresult really meansnull. In this context constant propagation can't be applied safely.Notin which the context flips. E.g. constant propagation can't be applied onWHERE NOT(a = c AND f(a))but can be again onWHERE NOT(IF(..., NOT(a = c AND f(a)), ...).Why are the changes needed?
Improve performance.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
New UTs.