HADOOP-18853. Upgrades SDK version to 2.20.28 and restores multipart copy #5960
Conversation
@steveloughran this updates the SDK version, so the Java async client has MPU. Since multipart operations are in the Java async client now, and not the TM, we could consider removing the TM in a follow-up PR. We'd lose the transfer listener, but we don't really use any of the other additional functionality the TM provides us.
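For context, here is a minimal sketch of a multipart copy going through the SDK v2 Java async client rather than the Transfer Manager. The `multipartEnabled` builder option, the region, and the bucket/key names are illustrative assumptions, not the actual S3A wiring:

```java
// Illustrative only: multipart copy via the Java-based S3AsyncClient (SDK v2).
// Assumes the multipartEnabled(...) builder option; bucket/key names and the
// region are placeholders, not S3A configuration.
import java.util.concurrent.CompletableFuture;

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.model.CopyObjectRequest;
import software.amazon.awssdk.services.s3.model.CopyObjectResponse;

public class MultipartCopySketch {
  public static void main(String[] args) {
    // multipartEnabled(true) lets the Java async client split large copies
    // into parts itself, so no Transfer Manager is needed for rename/copy.
    try (S3AsyncClient s3 = S3AsyncClient.builder()
        .region(Region.EU_WEST_1)
        .multipartEnabled(true)
        .build()) {

      CopyObjectRequest request = CopyObjectRequest.builder()
          .sourceBucket("example-source-bucket")
          .sourceKey("data/large-object")
          .destinationBucket("example-dest-bucket")
          .destinationKey("data/large-object-copy")
          .build();

      // The client chooses single-part or multipart COPY based on object size
      // and its multipart configuration.
      CompletableFuture<CopyObjectResponse> copy = s3.copyObject(request);
      copy.join();
    }
  }
}
```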
Looks good, though I want my #5872 patch in first, which depends on approval from someone with commit rights.
Please make sure the JIRA and PR title state which library is being updated and what it is changing to, e.g.
Upgrade AWS SDK to 2.19.12 and restore multipart copy
Once this PR is merged, the v2 line is feature complete w.r.t. the v1 SDK code, isn't it? Which means it'll be time to rebase the branch then merge to trunk, plus the same for the 3.3 backport!
Configuration conf = getConf();
String bucket = uri.getHost();

NettyNioAsyncHttpClient.Builder httpClientBuilder = AWSClientConfig
is this going to build properly with unshaded artifacts? we've caused problems in the past (#2599) because of refs to .shaded classes... the netty and client stuff are public/stable unshaded classes, correct?
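For reference, a hedged sketch of the pattern being reviewed, using only the public, unshaded software.amazon.awssdk classes: the NettyNioAsyncHttpClient.Builder is handed to the async client builder so the SDK owns the HTTP client's lifecycle. The tuning values and region are placeholders, not what AWSClientConfig actually produces:

```java
// Sketch of passing a Netty HTTP client *builder* (not a built client) to the
// S3 async client. Only public unshaded SDK classes are referenced; the
// concurrency/timeout/region values are illustrative assumptions.
import java.time.Duration;

import software.amazon.awssdk.http.nio.netty.NettyNioAsyncHttpClient;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3AsyncClient;

public class AsyncHttpClientSketch {
  public static void main(String[] args) {
    NettyNioAsyncHttpClient.Builder httpClientBuilder =
        NettyNioAsyncHttpClient.builder()
            .maxConcurrency(96)                          // placeholder
            .connectionTimeout(Duration.ofSeconds(10));  // placeholder

    // Handing over the builder lets the SDK build and close the HTTP client
    // together with the S3 client.
    try (S3AsyncClient s3 = S3AsyncClient.builder()
        .region(Region.EU_WEST_1)
        .httpClientBuilder(httpClientBuilder)
        .build()) {
      // use the client...
    }
  }
}
```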
💔 -1 overall. This message was automatically generated.
@ahmarsuhail can you rebase and retest this, then I'll merge. Once in, we can rebase the whole feature branch, retest, and then merge to trunk!
6a328d3 to 54dcf48
@steveloughran - Rebased and tested, mostly all good. Also seeing failures in  Fails on my EC2, but works ok on my Mac. It's probably a config issue, but I haven't seen this before so need to investigate. Again, I don't think it has anything to do with this change, so I feel like we should be good to merge. Once we're in trunk, I'll work on adding CSE; that's the only thing missing now.
💔 -1 overall. This message was automatically generated.
your EC2 VM is using IAM session credentials, isn't it? So it can't issue full credential DTs. Probably worth having the test skip if that's the case.
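A rough sketch of that kind of skip, assuming plain SDK v2 credential resolution and JUnit 4's Assume; it is not the actual hadoop-aws test helper, just an illustration of the check:

```java
// Illustration only: skip a test when the resolved AWS credentials are session
// credentials, since those cannot be used to issue full-credential delegation
// tokens. Uses standard SDK v2 and JUnit 4 APIs; not hadoop-aws code.
import org.junit.Assume;

import software.amazon.awssdk.auth.credentials.AwsCredentials;
import software.amazon.awssdk.auth.credentials.AwsSessionCredentials;
import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider;

public class SessionCredentialSkipSketch {

  /** Call at the start of a full-credential DT test to skip under session creds. */
  static void assumeFullCredentials() {
    AwsCredentials credentials =
        DefaultCredentialsProvider.create().resolveCredentials();
    // Session credentials carry a token and cannot mint full-credential DTs.
    Assume.assumeFalse("Test requires full (non-session) AWS credentials",
        credentials instanceof AwsSessionCredentials);
  }
}
```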
looks good.
needs matching changes in LICENSE-binary; update the bundle version and cut the crt
        
          
hadoop-project/pom.xml (Outdated)
<hsqldb.version>2.7.1</hsqldb.version>
- <aws-java-sdk-v2.version>2.19.12</aws-java-sdk-v2.version>
+ <aws-java-sdk-v2.version>2.20.128</aws-java-sdk-v2.version>
<aws.evenstream.version>1.0.1</aws.evenstream.version>
while you are here, how about renaming evenstream to eventstream; only just noticed this
done
LGTM
+1
💔 -1 overall. This message was automatically generated.
Upgrades the AWS sdk v2 version to 2.20.28. This
* adds multipart COPY/rename in the java async client
* removes the aws-crt JAR dependency
Contributed by Ahmar Suhail
This is an aggregate patch of the changes from feature-HADOOP-18073-s3a-sdk-upgrade and moves the S3A connector to using the V2 AWS SDK.
This is a major change: see aws_sdk_v2_changelog.md for details.
A new shaded v2 SDK JAR "bundle.jar" needs to be distributed with the connector to interact with S3 stores.
All code which was using the V1 SDK classes with the S3AFileSystem will need upgrading.
Contributed by Ahmar Suhail

HADOOP-18820. Cut AWS v1 support (apache#5872)
This removes the AWS V1 SDK as a hadoop-aws runtime dependency. It is still used at compile time so as to build a wrapper class V1ToV2AwsCredentialProviderAdapter which allows v1 credential providers to be used for authentication.
All well known credential providers have their classname remapped from v1 to v2 classes prior to instantiation; this wrapper is not needed for them.
There is no support for migrating other SDK plugin points (signing, handlers).
Access to the v2 S3Client class used by an S3A FileSystem instance is now via a new interface, org.apache.hadoop.fs.s3a.S3AInternals; other low-level operations (getObjectMetadata(Path)) have moved.
Contributed by Steve Loughran

HADOOP-18853. Upgrade AWS SDK version to 2.20.28 (apache#5960)
Upgrades the AWS sdk v2 version to 2.20.28. This
* adds multipart COPY/rename in the java async client
* removes the aws-crt JAR dependency
Contributed by Ahmar Suhail

HADOOP-18818. Merge aws v2 upgrade feature branch into trunk
Contains HADOOP-18863. AWS SDK V2 - AuditFailureExceptions aren't being translated properly

Change-Id: I96b26cc1ee535c519248ca6541fb157017dcc7e4
Description of PR
How was this patch tested?
Tested in eu-west-1 with
mvn -Dparallel-tests -DtestsThreadCount=16 clean verify. Also ran and checked the output of ITestS3HugeFileArrayBlocks.test_100_renameHugeFile(); the time taken to rename a 256MB file is on par with V1, around 1s from my m4.2xlarge EC2.