Skip to content

Conversation

@dongjoon-hyun
Copy link
Member

@dongjoon-hyun dongjoon-hyun commented Nov 29, 2021

What changes were proposed in this pull request?

This PR aims to rename withAllOrcReaders to withAllNativeOrcReaders.

Why are the changes needed?

Apache Spark have non-native ORC reader too. In SQL module, we only test native readers.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass the CIs.

@github-actions github-actions bot added the SQL label Nov 29, 2021
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-36346][SQL][FOLLOWUP] Rename withAllOrcReaders to withAllNativeOrcReaders [SPARK-36346][SQL][FOLLOWUP] Rename withAllOrcReaders to withAllNativeOrcReaders Nov 29, 2021
@SparkQA
Copy link

SparkQA commented Nov 29, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50168/

Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we test Hive's implementation under withAllOrcReaders as well? If yes, can we just extend withAllOrcReaders in that case? Also I see that spark.sql.orc.enableVectorizedReader is set in Hive's tests, so, withAllNativeOrcReaders can be used in such tests which is confusing slightly.

@SparkQA
Copy link

SparkQA commented Nov 29, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50168/

@cloud-fan
Copy link
Contributor

cloud-fan commented Nov 29, 2021

Can we test Hive's implementation under withAllOrcReaders as well?

That will be a big change and we need to move many ORC tests to sql/hive. Another idea is to remove (or at least deprecate) the hive ORC reader as the native orc reader has been used by default since Spark 2.4.

@SparkQA
Copy link

SparkQA commented Nov 29, 2021

Test build #145698 has finished for PR 34733 at commit fc448fc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

To @MaxGekk , as @cloud-fan mentioned, it's a big change and not a good approach because SQL module had batter be complete by itself.

To @cloud-fan , yes, the deprecation is possible, but Hive ORC reader has still been used by some users. Do we officially resolve all old issues with native readers? I'm still not sure about that.

@dongjoon-hyun
Copy link
Member Author

Thank you all! Merged to master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants