@dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Jun 26, 2023

What changes were proposed in this pull request?

This PR aims to upgrade the Apache Hadoop dependency to 3.3.6.

Why are the changes needed?

To bring the latest bug fixes to Apache Spark 3.5.0 and to use the artifacts with SBOM.

Apache Hadoop 3.3.6 contains 117 bug fixes, improvements and enhancements since 3.3.5.

Does this PR introduce any user-facing change?

This is a dependency change.

How was this patch tested?

Pass the CIs.

@github-actions github-actions bot added the BUILD label Jun 26, 2023
@dongjoon-hyun
Member Author

All tests passed.

[Screenshot: 2023-06-26, 3:42:39 PM]

@dongjoon-hyun
Member Author

Could you review this PR, @sunchao ?

Member

@sunchao sunchao left a comment

LGTM (pending CI)

@dongjoon-hyun
Member Author

Thank you, @sunchao . The CI passed already~
Let me merge this~ :)

@dongjoon-hyun dongjoon-hyun deleted the SPARK-44197 branch June 26, 2023 23:00
@LuciferYang
Contributor

late LGTM

dongjoon-hyun added a commit that referenced this pull request Aug 4, 2023
### What changes were proposed in this pull request?

This PR aims to downgrade the Apache Hadoop dependency to 3.3.4 in `Apache Spark 3.5` in order to prevent any regression from `Apache Spark 3.4.x`. In other words, although `Apache Spark 3.5.x` will lose many bug fixes of Apache Hadoop 3.3.5 and 3.3.6, it will be in the same situation as `Apache Spark 3.4.x`.
- SPARK-44197 Upgrade Hadoop to 3.3.6 (#41744)
- SPARK-42913 Upgrade Hadoop to 3.3.5 (#39124)
- SPARK-43448 Remove dummy dependency `hadoop-openstack` (#41133)

On top of reverting SPARK-44197 and SPARK-42913, this PR has an additional dependency exclusion change due to the following.
- SPARK-43880 Organize `hadoop-cloud` in standard maven project structure (#41380)

### Why are the changes needed?

There is a community report on an S3A committer performance regression. Although it is a one-line fix, no Hadoop release containing that fix is available at this time.
- HADOOP-18757: Bump corePoolSize of HadoopThreadPoolExecutor in s3a committer (apache/hadoop#5706)
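
For context on why a `corePoolSize` bump can matter, here is a minimal sketch using only the JDK's `java.util.concurrent.ThreadPoolExecutor`. It is not the Hadoop code: the class `CorePoolSizeDemo`, the method `largestPoolSize`, and the pool sizes and sleep time are hypothetical, and treating this behavior as the mechanism behind HADOOP-18757 is an assumption based only on the issue title. What it does show is documented JDK behavior: once `corePoolSize` threads are running, the executor prefers queuing over starting new threads, so with an unbounded queue a pool never grows past its core size (or past one thread when the core size is 0).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CorePoolSizeDemo {

    // Runs `tasks` short jobs on a pool backed by an unbounded queue and
    // returns the largest number of threads the pool ever held at once.
    public static int largestPoolSize(int corePoolSize, int maxPoolSize, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maxPoolSize,
                60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(20); // hypothetical stand-in for one unit of work
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await(); // wait until every submitted job has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdown();
        }
        return pool.getLargestPoolSize();
    }

    public static void main(String[] args) {
        // Same maximum size, different core size.
        System.out.println("core=0, max=4: " + largestPoolSize(0, 4, 8));
        System.out.println("core=4, max=4: " + largestPoolSize(4, 4, 8));
    }
}
```

With a core size of 0 the eight jobs run one after another on a single thread, while a core size of 4 starts four threads. In this sketch the maximum pool size alone does not buy any parallelism, because the unbounded queue never fills up.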

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #42345 from dongjoon-hyun/SPARK-44678.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>