[SPARK-11676][SQL] Parquet filter tests all pass if filters are not really pushed down #9659
This assertion is redundant since we've already specified ParquetRelation in the pattern match.
Test build #45727 has finished for PR 9659 at commit
retest this please
Test build #45730 has finished for PR 9659 at commit
Test build #45796 has finished for PR 9659 at commit
In this commit, I resolved conflicts, renamed the function extractSourceRDDToDataFrame to stripSparkFilter and set false to |
retest this please
Test build #47391 has finished for PR 9659 at commit
Test build #47407 has finished for PR 9659 at commit
Nit: Actually you can specify multiple confs within a single withSQLConf call.
But I think it's OK to leave it here.
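To illustrate the nit above, here is a minimal, self-contained sketch of a helper that accepts several key/value pairs in one call. It is only a model: `ConfSketch`, `conf`, and `withConf` are hypothetical names, and the mutable map is a stand-in for the session configuration, not Spark's `SQLConf` or the real `withSQLConf`.

```scala
import scala.collection.mutable

object ConfSketch {
  // Stand-in for a session configuration (assumption: not Spark's SQLConf).
  val conf: mutable.Map[String, String] = mutable.Map.empty

  // Hypothetical helper with a varargs shape: set every pair, run the body,
  // then restore whatever was there before, even if the body throws.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (key, _) => key -> conf.get(key) }
    pairs.foreach { case (key, value) => conf(key) = value }
    try body
    finally previous.foreach {
      case (key, Some(old)) => conf(key) = old   // key existed: put the old value back
      case (key, None)      => conf.remove(key)  // key was unset: remove it again
    }
  }
}
```

With a helper of this shape, two nested calls such as `withConf("a" -> "1") { withConf("b" -> "2") { ... } }` can be collapsed into `withConf("a" -> "1", "b" -> "2") { ... }`.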
I see. Thanks!
Thanks for working on this! Merging to master. (Actually, I thought I'd already merged this one.)
Currently, the Parquet predicate tests all pass even if filters are not pushed down or pushdown is disabled.
In this PR, to check filter evaluation, the test simply takes the expression from expression.Filter and then tries to create filters just as Spark does.
For checking the results, it manually accesses the child RDD (of expression.Filter) and produces the results, which should already be filtered properly, and then compares them to the expected values.
Now, if filters are not pushed down or pushdown is disabled, the tests throw exceptions.
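The reasoning can be sketched with a toy model in plain Scala. This is not Spark code: `PushdownSketch`, `Scan`, `Filter`, `execute`, `stripFilter`, and `plan` are hypothetical names chosen for illustration, and the row-by-row filtering inside `Scan` is an assumption that simplifies how a real data source applies pushed filters. The point it shows is that a filter re-applied above the scan hides whether pushdown happened, while reading the scan's output directly exposes it.

```scala
object PushdownSketch {
  // Toy plan: a scan that may receive a pushed predicate, optionally
  // wrapped in an engine-side filter that re-applies the same predicate.
  sealed trait Plan
  final case class Scan(rows: Seq[Int], pushed: Option[Int => Boolean]) extends Plan
  final case class Filter(pred: Int => Boolean, child: Plan) extends Plan

  def execute(plan: Plan): Seq[Int] = plan match {
    case Scan(rows, pushed)  => pushed.fold(rows)(p => rows.filter(p))
    case Filter(pred, child) => execute(child).filter(pred)
  }

  // Drop the outer filter so that only the source's own output is observed.
  def stripFilter(plan: Plan): Plan = plan match {
    case Filter(_, child) => child
    case other            => other
  }

  // Build "value > 2" over four rows, with or without pushdown.
  def plan(pushdownEnabled: Boolean): Plan = {
    val pred: Int => Boolean = _ > 2
    Filter(pred, Scan(Seq(1, 2, 3, 4), if (pushdownEnabled) Some(pred) else None))
  }
}
```

In this model, executing the full plan yields the same rows whether or not pushdown is enabled, so a test on the full plan cannot tell the two cases apart. Executing `stripFilter(plan(false))` returns every row, which is what lets a test on the stripped plan fail when the filter was not pushed.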