@HyukjinKwon
Member

Currently, the Parquet predicate tests all pass even if filters are not pushed down or if pushdown is disabled.

To check that filters are actually evaluated, this PR takes the expression from expression.Filter and then tries to create Parquet filters from it, just as Spark does.

To check the results, it manually accesses the child RDD of expression.Filter, which produces the rows that the source should already have filtered properly, and compares them to the expected values.

With this change, the tests throw exceptions if filters are not pushed down or if pushdown is disabled.
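
To illustrate why checking only the final query result cannot detect missing pushdown, here is a minimal toy model in plain Scala. It is not the code in this patch and does not use Spark; `ToySource`, `fullPlan`, and `strippedPlan` are hypothetical names made up for this sketch. The point it shows: when the engine re-applies the predicate on top of the scan, the answer is correct either way, so only the stripped plan reveals whether the source filtered anything.

```scala
// Toy model only: none of these names exist in Spark.
object PushdownCheckSketch {
  type Row = Int
  type Predicate = Row => Boolean

  // A source that receives a pushed filter but applies it only when pushdown is enabled.
  final class ToySource(data: Seq[Row], pushdownEnabled: Boolean) {
    def scan(pushed: Option[Predicate]): Seq[Row] =
      if (pushdownEnabled) pushed.fold(data)(p => data.filter(p)) else data
  }

  // Full plan: the engine filters again on top of the scan, so the final
  // result is correct whether or not the source filtered anything.
  def fullPlan(source: ToySource, pred: Predicate): Seq[Row] =
    source.scan(Some(pred)).filter(pred)

  // Stripped plan: only what the source itself returned.
  def strippedPlan(source: ToySource, pred: Predicate): Seq[Row] =
    source.scan(Some(pred))

  def main(args: Array[String]): Unit = {
    val data = 1 to 10
    val pred: Predicate = _ > 7
    val expected = Seq(8, 9, 10)

    val on = new ToySource(data, pushdownEnabled = true)
    val off = new ToySource(data, pushdownEnabled = false)

    // Checking the full plan cannot tell the two sources apart.
    assert(fullPlan(on, pred) == expected)
    assert(fullPlan(off, pred) == expected)

    // Checking the stripped plan can.
    assert(strippedPlan(on, pred) == expected)
    assert(strippedPlan(off, pred) != expected)
  }
}
```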

Contributor

This assertion is redundant since we've already specified ParquetRelation in the pattern match.

@SparkQA

SparkQA commented Nov 12, 2015

Test build #45727 has finished for PR 9659 at commit 8cc842f.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng
Contributor

retest this please

@SparkQA

SparkQA commented Nov 12, 2015

Test build #45730 has finished for PR 9659 at commit 8cc842f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Nov 13, 2015

Test build #45796 has finished for PR 9659 at commit 02fa055.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member Author

In this commit, I resolved conflicts, renamed the function extractSourceRDDToDataFrame to stripSparkFilter, and set spark.sql.parquet.enableUnsafeRowRecordReader to false.

@HyukjinKwon
Member Author

retest this please

@SparkQA

SparkQA commented Dec 9, 2015

Test build #47391 has finished for PR 9659 at commit 9861d96.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 9, 2015

Test build #47407 has finished for PR 9659 at commit 372c99e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

Nit: Actually you can specify multiple confs within a single withSQLConf call.

But I think it's OK to leave it here.
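
As a rough illustration of the varargs pattern behind that nit, here is a self-contained sketch in plain Scala. It is not Spark's `withSQLConf` implementation; `WithConfSketch`, `withConf`, and the mutable `conf` map are hypothetical stand-ins. It sets every key/value pair, runs the body, and then restores the previous values, so two nested calls can collapse into one.

```scala
import scala.collection.mutable

// Toy stand-in for a session configuration; not Spark code.
object WithConfSketch {
  val conf: mutable.Map[String, String] = mutable.Map.empty

  // Set each pair, run the body, then restore whatever was there before.
  def withConf(pairs: (String, String)*)(body: => Unit): Unit = {
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(old)) => conf(k) = old
      case (k, None)      => conf.remove(k)
    }
  }

  def main(args: Array[String]): Unit = {
    // One call carrying two settings instead of two nested calls.
    withConf(
      "spark.sql.parquet.filterPushdown" -> "true",
      "spark.sql.parquet.enableUnsafeRowRecordReader" -> "false") {
      assert(conf("spark.sql.parquet.filterPushdown") == "true")
      assert(conf("spark.sql.parquet.enableUnsafeRowRecordReader") == "false")
    }
    // Both keys were unset before the call, so both are removed afterwards.
    assert(conf.isEmpty)
  }
}
```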

Member Author

I see. Thanks!

@liancheng
Contributor

Thanks for working on this! Merging to master. (Actually, I thought I had already merged this one.)
