-
Notifications
You must be signed in to change notification settings - Fork 28.9k
[SPARK-33618][CORE] Use hadoop-client instead of hadoop-client-api to make hadoop-aws work #30508
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
cc @sunchao |
This comment has been minimized.
This comment has been minimized.
|
This is for testing the feasibility as one of the option. |
|
I am good with reverting this first. I will take a look separately for SPARK-33104. Presumably the tests will fail with Hadoop 2. |
|
@dongjoon-hyun do you mind fixing the PR title and description to contain SPARK-33104 and 10bd42c? |
|
Yes I'm fine for reverting this first while we searching for other solutions. Let's hope we can still ship this in Spark 3.1 release. |
|
Thank you, @HyukjinKwon and @sunchao . BTW, I'll update the PR title and description. |
hadoop-aws for Hadoop 3.x
This comment has been minimized.
This comment has been minimized.
hadoop-aws for Hadoop 3.x
This comment has been minimized.
This comment has been minimized.
…ofile" This reverts commit cb3fa6c. (cherry picked from commit a7dc7f92a392328bcbc95800f09d467a89d18dfe) Signed-off-by: Dongjoon Hyun <[email protected]>
|
Hi, All. |
|
Test build #132008 has finished for PR 30508 at commit
|
|
Test build #132012 has finished for PR 30508 at commit
|
|
Hi, @HyukjinKwon . |
|
Also, cc @viirya , @dbtsai , @sunchao , @srowen , @AngersZhuuuu , @mridulm , @tgravescs . |
viirya
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks okay as I compared with SPARK-33212 (cb3fa6c).
HyukjinKwon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I already reviewed this actually. Was wondering which one you guys prefer. LGTM
|
Merged to master. |
| hadoop-annotations/3.2.0//hadoop-annotations-3.2.0.jar | ||
| hadoop-auth/3.2.0//hadoop-auth-3.2.0.jar | ||
| hadoop-client/3.2.0//hadoop-client-3.2.0.jar | ||
| hadoop-common/3.2.0//hadoop-common-3.2.0.jar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pulls a ton more code into Spark now, like the whole client... hm, is this going to affect the hadoop-provided distro? it also downgrades some versions above which may be harmless. We really need this just for hadoop-aws?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, @srowen this is basically a revert. There was an issue found of shading hadoop client so it was reverted here as a safe choice. A proper fix is in progress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah OK, nevermind. I am not following closely.
|
Thank you, @viirya , @HyukjinKwon and @srowen ! |
| kerb-crypto/1.0.1//kerb-crypto-1.0.1.jar | ||
| kerb-identity/1.0.1//kerb-identity-1.0.1.jar | ||
| kerb-server/1.0.1//kerb-server-1.0.1.jar | ||
| kerb-simplekdc/1.0.1//kerb-simplekdc-1.0.1.jar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just for curiosity, does spark has a chance to play the role of KDC at runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually a revert. It was added in ce7ba2e#diff-e45e1eee8dcfd7eaf8a013cec02b67806da3edeabe0f195ac6b4402f67d4b6dcR146
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like the original PR does not handle any transitive artifact exclusion at all 😸
What changes were proposed in this pull request?
This reverts commit SPARK-33212 (cb3fa6c) mostly with three exceptions:
SparkSubmitUtilswas updated recently by SPARK-33580resource-managers/yarn/pom.xmlwas updated recently by SPARK-33104 to addhadoop-yarn-server-resourcemanagertest dependency.com.fasterxml.jackson.module:jackson-module-jaxb-annotationsdependency in K8s module which is updated recently by SPARK-33471.Why are the changes needed?
According to HADOOP-16080 since Apache Hadoop 3.1.1,
hadoop-awsdoesn't work withhadoop-client-api. It fails at write operation like the following.1. Spark distribution with
-Phadoop-cloud2. Spark distribution without
-Phadoop-cloudDoes this PR introduce any user-facing change?
No.
How was this patch tested?
Pass the CI.