[SPARK-44678][BUILD][3.5] Downgrade Hadoop to 3.3.4 #42345

dongjoon-hyun · 2023-08-04T16:09:48Z

What changes were proposed in this pull request?

This PR downgrades the Apache Hadoop dependency to 3.3.4 in Apache Spark 3.5 to prevent any regression from Apache Spark 3.4.x. In other words, although Apache Spark 3.5.x will lose the many bug fixes in Apache Hadoop 3.3.5 and 3.3.6, it will be in the same situation as Apache Spark 3.4.x.

- SPARK-44197 Upgrade Hadoop to 3.3.6 (#41744)
- SPARK-42913 Upgrade Hadoop to 3.3.5 (#39124)
- SPARK-43448 Remove dummy dependency hadoop-openstack (#41133)

On top of reverting SPARK-44197 and SPARK-42913, this PR includes an additional dependency exclusion change, made necessary by the following:

- SPARK-43880 Organize hadoop-cloud in standard maven project structure (#41380)

Why are the changes needed?

There is a community report of an S3A committer performance regression. Although the fix is a one-liner, no Hadoop release containing it is available at this time.

- HADOOP-18757: Bump corePoolSize of HadoopThreadPoolExecutor in s3a committer (apache/hadoop#5706)
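
For readers unfamiliar with why a `corePoolSize` setting can matter for throughput, here is a minimal sketch using the plain `java.util.concurrent.ThreadPoolExecutor`. It does not use Hadoop's `HadoopThreadPoolExecutor`, and the class name `CorePoolSizeDemo`, the method `largestPoolSize`, and the pool sizes are hypothetical, chosen only for illustration. Whether this is exactly the mechanism behind HADOOP-18757 is an assumption based on the issue title. The sketch shows that, with an unbounded work queue, a pool never grows beyond `corePoolSize`, whatever `maximumPoolSize` says.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Spark or Hadoop.
class CorePoolSizeDemo {

    // Submits `tasks` short tasks and returns the largest number of
    // worker threads the pool ever had at the same time.
    static int largestPoolSize(int corePoolSize, int maxPoolSize, int tasks) {
        // An unbounded queue accepts every task, so the executor never
        // needs to create threads beyond corePoolSize.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            corePoolSize, maxPoolSize, 60L, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>());
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(20); // stand-in for real work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.countDown();
            });
        }
        try {
            done.await(); // wait until every task has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int largest = pool.getLargestPoolSize();
        pool.shutdown();
        return largest;
    }

    public static void main(String[] args) {
        // maximumPoolSize is 8 in both cases; only corePoolSize differs.
        System.out.println("core=1, max=8: " + largestPoolSize(1, 8, 16)); // prints 1
        System.out.println("core=8, max=8: " + largestPoolSize(8, 8, 16)); // prints 8
    }
}
```

In the first call the 16 tasks run one at a time on a single thread. In the second they run eight at a time, which is why raising `corePoolSize` can be a one-line change with a large effect.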

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass the CIs.

dongjoon-hyun · 2023-08-04T16:27:05Z

Thank you, @pan3793 .

Also, cc @LuciferYang , @sunchao , @viirya

sunchao

LGTM

viirya · 2023-08-04T18:29:23Z

Looks good to me.

dongjoon-hyun · 2023-08-04T21:20:32Z

Thank you, @sunchao , @viirya , @pan3793 .
Merged to branch-3.5 for Apache Spark 3.5.0 RC2.

### What changes were proposed in this pull request?

This PR aims to downgrade the Apache Hadoop dependency to 3.3.4 in `Apache Spark 3.5` in order to prevent any regression from `Apache Spark 3.4.x`. In other words, although `Apache Spark 3.5.x` will lose many bug fixes of Apache Hadoop 3.3.5 and 3.3.6, it will be in the same situation with `Apache Spark 3.4.x`.

- SPARK-44197 Upgrade Hadoop to 3.3.6 (#41744)
- SPARK-42913 Upgrade Hadoop to 3.3.5 (#39124)
- SPARK-43448 Remove dummy dependency `hadoop-openstack` (#41133)

On top of reverting SPARK-44197 and SPARK-42913, this PR has additional dependency exclusion change due to the following.

- SPARK-43880 Organize `hadoop-cloud` in standard maven project structure (#41380)

### Why are the changes needed?

There is a community report on S3A committer performance regression. Although it's one liner fix, there is no available Hadoop release with that fix at this time.

- HADOOP-18757: Bump corePoolSize of HadoopThreadPoolExecutor in s3a committer (apache/hadoop#5706)

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #42345 from dongjoon-hyun/SPARK-44678.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>

LuciferYang · 2023-08-05T00:11:31Z

late LGTM

[SPARK-44678][BUILD][3.5] Downgrade Hadoop to 3.3.4

6e35239

github-actions bot added SQL BUILD labels Aug 4, 2023

pan3793 approved these changes Aug 4, 2023

sunchao approved these changes Aug 4, 2023

viirya approved these changes Aug 4, 2023

dongjoon-hyun closed this Aug 4, 2023

dongjoon-hyun deleted the SPARK-44678 branch August 4, 2023 21:21

dongjoon-hyun mentioned this pull request Mar 21, 2024

[SPARK-47457][SQL] Backport a6bffcc3e5f0a190b5b7f5c30808b604acc30607 #45645

Closed
