HADOOP-18818. Merge aws v2 upgrade feature branch into trunk #5995
Conversation
well, this is taking a while!
💔 -1 overall
This message was automatically generated.
tested s3 london with: -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch; tests take 28 min, starting to get slow again with the prefetch parameterization. Long term: get prefetching into the state where we can switch to it everywhere.
I really don't get the fact that the enforcer is rejecting the code, even though it's OK locally. Will have to comment out those lines for now.
LGTM. Just a nit about adding a TODO, and a couple of questions: I don't understand why we need the unwrapping of CredentialInitializationException, as in what's wrapping that exception in the first place.
The wrapping of AuditExceptions is also not ideal; I'll look into whether we can get that fixed in the SDK.
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ARetryPolicy.java
// translated AWS exceptions are retried if retryable() is true.
policyMap.put(AWSClientIOException.class, retryAwsClientExceptions);
policyMap.put(AWSServiceIOException.class, retryAwsClientExceptions);
when we do switch to the new retry policy, doesn't this mean AWSServiceIOException will not be retried? I think previously these were retried, but not sure
That's why I left this out. The thing is, retryable() is meant to indicate when things can't be retried, and we've been getting complaints about useless retrying. But I've left it out of this change.
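The policyMap lines quoted above bind exception classes to retry policies. A minimal sketch of that kind of class-to-policy lookup, walking up the class hierarchy to find the most specific registered entry, is below; the PolicyMap and RetryDecision names are illustrative stand-ins, not Hadoop's actual S3ARetryPolicy types.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch: map exception classes to a retry decision. */
public class PolicyMap {
  public enum RetryDecision { RETRY, FAIL }

  private final Map<Class<?>, RetryDecision> policies = new HashMap<>();

  public void put(Class<?> exClass, RetryDecision decision) {
    policies.put(exClass, decision);
  }

  /** Find the policy for the most specific registered class or superclass. */
  public RetryDecision lookup(Class<?> exClass) {
    for (Class<?> c = exClass; c != null; c = c.getSuperclass()) {
      RetryDecision d = policies.get(c);
      if (d != null) {
        return d;
      }
    }
    // nothing registered anywhere in the hierarchy: fail fast
    return RetryDecision.FAIL;
  }
}
```

A subclass such as AWSServiceIOException would thus inherit the policy of a registered superclass unless it has its own entry, which is why the question above about its retry behaviour matters.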
 * @param throwable exception
 * @return a translated exception or null.
 */
public static IOException maybeTranslateCredentialException(String path,
Why do we need to do this? I get that the AuditFailures are getting wrapped in an SdkClientException, but what's wrapping CredentialInitializationException?
The audit and CredentialInitializationExceptions are both SdkClientExceptions. They may both be wrapped by the async handler, and while I didn't try to replicate it, if it happened in the same request I'd expect the same failure. (Not expected in the sync S3 client.)
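The wrapping described above means the interesting exception may sit one or two levels down the cause chain. A minimal sketch of that unwrapping follows; findCause is a hypothetical helper name, not the actual S3A translation method.

```java
/** Illustrative sketch of unwrapping a nested exception cause chain. */
public final class Unwrap {
  private Unwrap() {
  }

  /**
   * Walk the cause chain of t looking for an instance of wanted.
   * Standard Throwable chains terminate, so no cycle guard is needed here.
   * @return the matching cause, or null if none is found.
   */
  public static <T extends Throwable> T findCause(Throwable t, Class<T> wanted) {
    for (Throwable cur = t; cur != null; cur = cur.getCause()) {
      if (wanted.isInstance(cur)) {
        return wanted.cast(cur);
      }
    }
    return null;
  }
}
```

With a helper like this, a double-wrapped failure (e.g. an async handler wrapping an SDK exception wrapping the real one) is found the same way as a singly wrapped one, which is the point of the "double unwrapping" mentioned later in the PR description.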
Right, I'm just going to comment out the enforcement, as a package wildcard didn't take either.
Force-pushed ac9bfef to 7b65140.
ok, this is the new plan: merge everything into one patch
Force-pushed 7b65140 to 77a75cc.
squashed everything to a single unified patch; doing a local build/test
This is an aggregate patch of the changes from feature-HADOOP-18073-s3a-sdk-upgrade and moves the S3A connector to using the V2 AWS SDK.

This is a major change: see aws_sdk_v2_changelog.md for details. A new shaded v2 SDK JAR "bundle.jar" needs to be distributed with the connector to interact with S3 stores. All code which was using the V1 SDK classes with the S3AFileSystem will need upgrading.

Contributed by Ahmar Suhail

HADOOP-18820. Cut AWS v1 support (apache#5872)

This removes the AWS V1 SDK as a hadoop-aws runtime dependency. It is still used at compile time so as to build a wrapper class, V1ToV2AwsCredentialProviderAdapter, which allows v1 credential providers to be used for authentication. All well-known credential providers have their classname remapped from v1 to v2 classes prior to instantiation; this wrapper is not needed for them. There is no support for migrating other SDK plugin points (signing, handlers).

Access to the v2 S3Client class used by an S3A FileSystem instance is now via a new interface org.apache.hadoop.fs.s3a.S3AInternals; other low-level operations (getObjectMetadata(Path)) have moved.

Contributed by Steve Loughran

HADOOP-18853. Upgrade AWS SDK version to 2.20.28 (apache#5960)

Upgrades the AWS SDK v2 version to 2.20.28. This
* adds multipart COPY/rename in the java async client
* removes the aws-crt JAR dependency

Contributed by Ahmar Suhail

HADOOP-18818. Merge aws v2 upgrade feature branch into trunk

Contains HADOOP-18863. AWS SDK V2 - AuditFailureExceptions aren't being translated properly

Change-Id: I96b26cc1ee535c519248ca6541fb157017dcc7e4
Force-pushed 77a75cc to 9cfef29.
fixing two of the checkstyles; merged back into a single patch
...to RetryFromAWSClientExceptionPolicy, because spotbugs rejects any class ending in Exception which isn't one.
Change-Id: Ica3deb6edfcf3b0536e741b4bd6f4a5a8bc28dc8
ffs. renaming.
ok, this merge is working. I will do the merge on Friday; I just want to make sure I've got the commit message right.
…ames oops
Change-Id: I993ccd2418e3f8ec77d967032f10a5a746a6f81b
This patch migrates the S3A connector to use the V2 AWS SDK.

This is a significant change at the source code level. Any applications using the internal extension/override points in the filesystem connector are likely to break. This includes but is not limited to:
- Code invoking methods on the S3AFileSystem class which used classes from the V1 SDK.
- The ability to define the factory for the `AmazonS3` client, and to retrieve it from the S3AFileSystem. There is a new factory API and a special interface S3AInternals to access a limited set of internal classes and operations.
- Delegation token and auditing extensions.
- Classes trying to integrate with the AWS SDK.

All standard V1 credential providers listed in the option fs.s3a.aws.credentials.provider will be automatically remapped to their V2 equivalent. Other V1 credential providers are supported, but only if the V1 SDK is added back to the classpath.

The SDK signing plugin has changed; all v1 signers are incompatible. There is no support for the S3 "v2" signing algorithm.

Finally, the aws-sdk-bundle JAR has been replaced by the shaded V2 equivalent, "bundle.jar", which is now exported by the hadoop-aws module.

Consult the document aws_sdk_upgrade for the full details.

Contributed by Ahmar Suhail + some bits by Steve Loughran
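The automatic v1-to-v2 credential provider remapping described above can be sketched as a classname lookup table applied before instantiation. The single map entry below uses real v1/v2 SDK provider class names, but the actual hadoop-aws remapping table covers more providers and may be implemented differently.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch of v1-to-v2 credential provider classname remapping. */
public class ProviderRemap {
  private static final Map<String, String> V1_TO_V2 = new HashMap<>();
  static {
    // one well-known provider pair; the real table has more entries
    V1_TO_V2.put(
        "com.amazonaws.auth.EnvironmentVariableCredentialsProvider",
        "software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider");
  }

  /** Return the v2 name for a known v1 provider, else the input unchanged. */
  public static String remap(String className) {
    return V1_TO_V2.getOrDefault(className, className);
  }
}
```

Unknown names pass through unchanged, which matches the described behaviour: other V1 credential providers still work, but only via the adapter and only with the V1 SDK back on the classpath.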
Description of PR
HADOOP-18818. Merge aws v2 upgrade feature branch into trunk
Fixed to work with TTL changes of HADOOP-18845 by removing the tests.
Contains
HADOOP-18863. AWS SDK V2 - AuditFailureExceptions aren't being translated
properly
More specifically, the superclass of CredentialInitializationException is converted to SdkClientException; the subclass AuditFailureException will get this too.
That's not enough, as currently the SDK wraps all exceptions; there's double unwrapping, with tests to validate it.
Also, a new retry policy, RetryFromAWSClientIOException, was added to use the .retryable() attribute of AWS SDK exceptions to determine whether or not to attempt retries. This has not been switched to; it's a big enough change that it should be validated/tested on its own.
not for merging
The PR chain will be pulled in.
How was this patch tested?
s3 london; retesting. With HADOOP-18863 the failing test is happy.
For code changes:
LICENSE, LICENSE-binary, NOTICE-binary files?