Skip to content

Conversation

@squito
Copy link
Contributor

@squito squito commented Dec 29, 2015

See https://issues.apache.org/jira/browse/SPARK-12560

This isn't causing any problems currently because the tests for string predicate pushdown are currently disabled. I ran into this while trying to turn them back on with a different version of parquet. Figure it was good to fix now in any case.

@SparkQA
Copy link

SparkQA commented Dec 29, 2015

Test build #48422 has finished for PR 10510 at commit cb4d534.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 29, 2015

Test build #48423 has finished for PR 10510 at commit 308294a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Copy link
Contributor Author

squito commented Dec 30, 2015

maybe @HyukjinKwon @davies @liancheng would be good reviewers here? thanks

@davies
Copy link
Contributor

davies commented Dec 30, 2015

@squito I still can't understand the problem, If it need copy the row somewhere, why not just do the copy? We are moving to use UnsafeRow everywhere.

@HyukjinKwon
Copy link
Member

@davies @squito The purpose of stripSparkFilter is to strip the wrapped Spark-side filter. I am not too sure if it is right to modify the results of stripped plans although I understand and agree that we might still need to execute such logics regardless of the inside or the outside of stripSparkFilter.

FYI, this function would be only used in OrcQuerySuite at the end. The reason why I added the function is we need to test the pushed filters correctly until unhandledFilter interface is implemented for internal datasources (JIRA).
For Parquet and JDBC, this function would not be used anymore. Here are the PR for Parquet, #10502, and for JDBC, #10427. They were discussed in this PR, #10221.

@squito
Copy link
Contributor Author

squito commented Jan 4, 2016

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented Jan 4, 2016

Test build #48687 has finished for PR 10510 at commit bfbb4fe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Copy link
Contributor Author

squito commented Jan 5, 2016

@davies that makes sense, I've updated the change to just add a .copy(), thanks for the suggestion.

@HyukjinKwon I think I understand the point of stripSparkFilter that its just for testing: you want to make sure that the filters get pushed down, even if spark woudl normally repeat the filter on its own anyway. Even if you are changing things so that spark often won't do the filter on its side, seems like we should still fix this util even if its just used by ORC. That extra copy is needed before the row.toSeq.

@HyukjinKwon
Copy link
Member

@squito Ah, sorry I misunderstood.

@liancheng
Copy link
Contributor

Thanks for fixing this, LGTM. Merging to master.

@asfgit asfgit closed this in 4dbd316 Jan 19, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants