[SPARK-32646][SQL][test-hadoop2.7][test-hive1.2] ORC predicate pushdown should work with case-insensitive analysis #29530
Conversation
retest this please
Test build #127848 has finished for PR 29530 at commit
retest this please
Test build #127847 has finished for PR 29530 at commit
Test build #127855 has finished for PR 29530 at commit
OK. Passed tests of both hive-2.3 and hive-1.2 profiles. This is the same diff as the previous #29457, but with a hive-1.2 compilation error fixed. cc @cloud-fan @HyukjinKwon @dongjoon-hyun
sql/core/v1.2/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
sql/core/v2.3/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
HyukjinKwon
left a comment
Looks fine as it was already reviewed, and the compilation issue looks minor.
I am going to merge this. The last commit only touches an accessor and an explicit return type, and any problem with those should be caught during the build.
Merged to master.
Thanks @HyukjinKwon
Test build #127867 has finished for PR 29530 at commit
What changes were proposed in this pull request?
This PR proposes to fix ORC predicate pushdown under case-insensitive analysis. When case-insensitive analysis is enabled, the field names in pushed-down predicates do not need to match the letter case of the physical field names in ORC files.
This is a re-submission of #29457. #29457 had a hive-1.2 compilation error, and some tests were failing under the hive-1.2 profile at the same time, so it was reverted to unblock others.
Why are the changes needed?
Currently, ORC predicate pushdown doesn't work with case-insensitive analysis. For example, a predicate "a < 0" cannot be pushed down to an ORC file whose field is named "A", even though case-insensitive analysis resolves "a" to that field.
Parquet predicate pushdown already handles this case. We should make ORC predicate pushdown work with case-insensitive analysis too.
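The name-matching problem described above can be illustrated with a small, self-contained sketch. This is not the code changed in this PR and does not use any Spark or ORC API; `build_field_lookup` and `resolve_field` are hypothetical helpers, and skipping fields whose names collide when case is ignored is an assumption made here for safety, not something stated in this PR.

```python
from typing import Dict, Iterable, Optional


def build_field_lookup(physical_names: Iterable[str], case_sensitive: bool) -> Dict[str, str]:
    """Map a lookup key to the physical field name stored in the file.

    Hypothetical helper: in case-insensitive mode the key is the lower-cased
    name, and names that collide after lower-casing are dropped (assumption).
    """
    if case_sensitive:
        return {name: name for name in physical_names}

    lookup: Dict[str, str] = {}
    ambiguous = set()
    for name in physical_names:
        key = name.lower()
        if key in lookup or key in ambiguous:
            # Two physical fields differ only by letter case, so a predicate
            # on this name cannot be mapped to a single field.
            lookup.pop(key, None)
            ambiguous.add(key)
        else:
            lookup[key] = name
    return lookup


def resolve_field(predicate_name: str, lookup: Dict[str, str], case_sensitive: bool) -> Optional[str]:
    """Return the physical field name to push the predicate to, or None."""
    key = predicate_name if case_sensitive else predicate_name.lower()
    return lookup.get(key)


# Predicate "a < 0" against a file whose physical field is named "A".
insensitive = build_field_lookup(["A", "b"], case_sensitive=False)
sensitive = build_field_lookup(["A", "b"], case_sensitive=True)
print(resolve_field("a", insensitive, case_sensitive=False))  # physical name to use
print(resolve_field("a", sensitive, case_sensitive=True))     # no match, nothing pushed down
```

Returning the physical name, rather than the name as written in the query, matters because the filter handed to the file reader has to refer to the field as it is actually stored.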
Does this PR introduce any user-facing change?
Yes. After this PR, ORC predicate pushdown works under case-insensitive analysis.
How was this patch tested?
Unit tests.