[SPARK-10623] [SQL] Fixes ORC predicate push-down #8799
The in(attribute, _) part is wrong, because in accepts varargs instead of a list.
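To illustrate the varargs pitfall mentioned above, here is a minimal, self-contained sketch. The `in` method below is a hypothetical stand-in with a Scala varargs signature, not the real ORC `SearchArgument` builder method; it only shows how passing a sequence directly differs from expanding it with `: _*`.

```scala
object VarargsSketch {
  // Hypothetical stand-in for a varargs builder method (NOT the real ORC API).
  // It reports how many values it actually received.
  def in(column: String, values: Any*): String =
    s"$column IN (${values.mkString(", ")}) [${values.length} value(s)]"

  val literals: Seq[Any] = Seq(1, 2, 3)

  // Compiles, but the whole sequence is treated as ONE argument.
  val wrong: String = in("a", literals)

  // The `: _*` ascription expands the sequence into individual arguments.
  val right: String = in("a", literals: _*)
}
```

In this sketch `wrong` receives a single value (the sequence itself), while `right` receives three values, which is presumably what an `IN` predicate needs.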
Test build #42613 has finished for PR 8799 at commit
Test build #42625 has finished for PR 8799 at commit
LGTM. Thanks for fixing this.
Will this `In` get changed to `InSet` by the optimizer when all values are literals?
This is a `sources.In` rather than `expressions.In`, so it should be fine.
ah ok.
Two comments. Otherwise, looks good.
Test build #42688 has finished for PR 8799 at commit
Does this test check whether the predicate is really pushed down, or does it just check the answers?
It just checks answers. It's annoying that I couldn't find a programmatic way to verify whether it's pushed down or not. I checked through the logs manually, though.
OK. Merging to master and branch 1.5.
When pushing down a leaf predicate, the ORC `SearchArgument` builder requires an extra "parent" predicate (any one among `AND`/`OR`/`NOT`) to wrap the leaf predicate. E.g., to push down `a < 1`, we must build `AND(a < 1)` instead. Fortunately, when actually constructing the `SearchArgument`, the builder will eliminate all those unnecessary wrappers.

This PR is based on #8783 authored by zhzhan. I also took the chance to simplify `OrcFilters` a little bit to improve readability.

Author: Cheng Lian <[email protected]>

Closes #8799 from liancheng/spark-10623/fix-orc-ppd.

(cherry picked from commit 22be2ae)
Signed-off-by: Yin Huai <[email protected]>

Conflicts: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFilters.scala
When pushing down a leaf predicate, the ORC `SearchArgument` builder requires an extra "parent" predicate (any one among `AND`/`OR`/`NOT`) to wrap the leaf predicate. E.g., to push down `a < 1`, we must build `AND(a < 1)` instead. Fortunately, when actually constructing the `SearchArgument`, the builder will eliminate all those unnecessary wrappers.

This PR is based on #8783 authored by @zhzhan. I also took the chance to simplify `OrcFilters` a little bit to improve readability.
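The wrap-then-eliminate behaviour described above can be sketched with a toy model. Everything below (`Expr`, `Leaf`, `And`, `Builder`, `simplify`) is hypothetical and written only for illustration; it is not the ORC `SearchArgument` API, and it assumes only what the description states: a leaf needs an open parent, and redundant single-child wrappers disappear when the final expression is built.

```scala
object WrapperSketch {
  sealed trait Expr
  final case class Leaf(text: String) extends Expr
  final case class And(children: List[Expr]) extends Expr

  // Toy builder: a leaf may only be added while a parent (here, AND) is open.
  final class Builder {
    // Stack of child lists, one entry per currently open parent.
    private var open: List[List[Expr]] = Nil
    private var result: Option[Expr] = None

    def startAnd(): Builder = { open = Nil :: open; this }

    def lessThan(column: String, literal: Int): Builder =
      add(Leaf(s"$column < $literal"))

    // Closes the innermost parent and attaches it to its own parent, if any.
    def end(): Builder = open match {
      case children :: rest =>
        val node = And(children.reverse)
        open = rest
        if (rest.isEmpty) { result = Some(node); this } else add(node)
      case Nil =>
        throw new IllegalStateException("end() without startAnd()")
    }

    def build(): Expr =
      simplify(result.getOrElse(throw new IllegalStateException("nothing built")))

    private def add(e: Expr): Builder = open match {
      case children :: rest => open = (e :: children) :: rest; this
      case Nil =>
        throw new IllegalStateException("a leaf predicate needs a parent such as AND")
    }
  }

  // Removes single-child AND wrappers, mirroring the elimination step.
  def simplify(e: Expr): Expr = e match {
    case And(List(only)) => simplify(only)
    case And(children)   => And(children.map(simplify))
    case leaf            => leaf
  }
}
```

In this toy model, `new WrapperSketch.Builder().startAnd().lessThan("a", 1).end().build()` yields the bare leaf `a < 1`, whereas calling `lessThan` without a preceding `startAnd()` fails.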