Commit a4c23fd

HADOOP-18820. Cut AWS v1 support (apache#5872)
This removes the AWS V1 SDK as a hadoop-aws runtime dependency.

It is still used at compile time so as to build a wrapper class V1ToV2AwsCredentialProviderAdapter which allows v1 credential provider to be used for authentication. All well known credential providers have their classname remapped from v1 to v2 classes prior to instantiation; this wrapper is not needed for them.

There is no support for migrating other SDK plugin points (signing, handlers).

Access to the v2 S3Client class used by an S3A FileSystem instance is now via a new interface org.apache.hadoop.fs.s3a.S3AInternals; other low-level operations (getObjectMetadata(Path)) have moved.

Contributed by Steve Loughran

Change-Id: I231fade61b20a206a018d18f5e867f5d3a1fa879
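Note: the class name remapping described in the commit message can be pictured with the sketch below. It is a minimal illustration using only the JDK, not the hadoop-aws implementation. The class ProviderRemapSketch, the method remap and the table V1_TO_V2 are hypothetical names; the single table entry is the one visible in the core-default.xml change in this commit. Names with no entry pass through unchanged, and the sketch does not try to decide whether such a class needs the adapter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only; every name in this class is hypothetical. */
class ProviderRemapSketch {

  // Hypothetical lookup table. The one entry mirrors the default provider
  // list change in core-default.xml; a real table would hold more entries.
  private static final Map<String, String> V1_TO_V2 = new LinkedHashMap<>();

  static {
    V1_TO_V2.put(
        "com.amazonaws.auth.EnvironmentVariableCredentialsProvider",
        "software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider");
  }

  /** Returns the v2 class name for a known v1 name, else the name unchanged. */
  static String remap(String className) {
    return V1_TO_V2.getOrDefault(className, className);
  }

  public static void main(String[] args) {
    // A known v1 name is rewritten before any class loading would happen.
    System.out.println(
        remap("com.amazonaws.auth.EnvironmentVariableCredentialsProvider"));
    // An S3A provider name has no table entry and is returned as-is.
    System.out.println(
        remap("org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"));
  }
}
```

Doing the lookup on the class name string means the v1 class never has to be loaded for the well known providers, which fits the stated goal of dropping the v1 SDK at runtime.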
1 parent d4deb9b commit a4c23fd

70 files changed

+2308
-1227
lines changed

LICENSE-binary

Lines changed: 2 additions & 1 deletion
@@ -214,7 +214,6 @@ com.aliyun:aliyun-java-sdk-kms:2.11.0
 com.aliyun:aliyun-java-sdk-ram:3.1.0
 com.aliyun:aliyun-java-sdk-sts:3.0.0
 com.aliyun.oss:aliyun-sdk-oss:3.13.0
-com.amazonaws:aws-java-sdk-bundle:1.12.367
 com.cedarsoftware:java-util:1.9.0
 com.cedarsoftware:json-io:2.5.1
 com.fasterxml.jackson.core:jackson-annotations:2.12.7
@@ -368,6 +367,8 @@ org.objenesis:objenesis:2.6
 org.xerial.snappy:snappy-java:1.1.10.1
 org.yaml:snakeyaml:2.0
 org.wildfly.openssl:wildfly-openssl:1.1.3.Final
+software.amazon.awssdk:bundle:jar:2.19.12
+software.amazon.awssdk.crt:aws-crt:0.21.0
 
 
 --------------------------------------------------------------------------------

hadoop-common-project/hadoop-common/src/main/resources/core-default.xml

Lines changed: 11 additions & 41 deletions
@@ -1201,61 +1201,31 @@
   <description>AWS secret key used by S3A file system. Omit for IAM role-based or provider-based authentication.</description>
 </property>
 
+<property>
+  <name>fs.s3a.session.token</name>
+  <description>Session token, when using org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
+    as one of the providers.
+  </description>
+</property>
+
 <property>
   <name>fs.s3a.aws.credentials.provider</name>
   <value>
     org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider,
     org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,
-    com.amazonaws.auth.EnvironmentVariableCredentialsProvider,
+    software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider,
     org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider
   </value>
   <description>
     Comma-separated class names of credential provider classes which implement
-    com.amazonaws.auth.AWSCredentialsProvider.
+    software.amazon.awssdk.auth.credentials.AwsCredentialsProvider.
 
     When S3A delegation tokens are not enabled, this list will be used
     to directly authenticate with S3 and other AWS services.
     When S3A Delegation tokens are enabled, depending upon the delegation
     token binding it may be used
     to communicate wih the STS endpoint to request session/role
     credentials.
-
-    These are loaded and queried in sequence for a valid set of credentials.
-    Each listed class must implement one of the following means of
-    construction, which are attempted in order:
-    * a public constructor accepting java.net.URI and
-        org.apache.hadoop.conf.Configuration,
-    * a public constructor accepting org.apache.hadoop.conf.Configuration,
-    * a public static method named getInstance that accepts no
-       arguments and returns an instance of
-       com.amazonaws.auth.AWSCredentialsProvider, or
-    * a public default constructor.
-
-    Specifying org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider allows
-    anonymous access to a publicly accessible S3 bucket without any credentials.
-    Please note that allowing anonymous access to an S3 bucket compromises
-    security and therefore is unsuitable for most use cases. It can be useful
-    for accessing public data sets without requiring AWS credentials.
-
-    If unspecified, then the default list of credential provider classes,
-    queried in sequence, is:
-    * org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider: looks
-       for session login secrets in the Hadoop configuration.
-    * org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider:
-       Uses the values of fs.s3a.access.key and fs.s3a.secret.key.
-    * com.amazonaws.auth.EnvironmentVariableCredentialsProvider: supports
-        configuration of AWS access key ID and secret access key in
-        environment variables named AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY,
-        and AWS_SESSION_TOKEN as documented in the AWS SDK.
-    * org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider: picks up
-       IAM credentials of any EC2 VM or AWS container in which the process is running.
-  </description>
-</property>
-
-<property>
-  <name>fs.s3a.session.token</name>
-  <description>Session token, when using org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider
-    as one of the providers.
   </description>
 </property>
 
@@ -1353,10 +1323,10 @@
     Note: for job submission to actually collect these tokens,
     Kerberos must be enabled.
 
-    Options are:
+    Bindings available in hadoop-aws are:
     org.apache.hadoop.fs.s3a.auth.delegation.SessionTokenBinding
     org.apache.hadoop.fs.s3a.auth.delegation.FullCredentialsTokenBinding
-    and org.apache.hadoop.fs.s3a.auth.delegation.RoleTokenBinding
+    org.apache.hadoop.fs.s3a.auth.delegation.RoleTokenBinding
   </description>
 </property>
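Note: the fs.s3a.aws.credentials.provider value in this file is a comma-separated list spread over several indented lines, so whatever reads it has to trim each entry. The sketch below shows one way to do that with plain JDK string handling. It is an assumption for illustration, not the code path hadoop-aws uses; ProviderListParseSketch and parseClassNames are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; every name in this class is hypothetical. */
class ProviderListParseSketch {

  /** Splits a comma-separated value into trimmed, non-empty class names. */
  static List<String> parseClassNames(String value) {
    List<String> names = new ArrayList<>();
    if (value == null) {
      return names;
    }
    for (String part : value.split(",")) {
      // trim() drops the newlines and indentation around each entry.
      String trimmed = part.trim();
      if (!trimmed.isEmpty()) {
        names.add(trimmed);
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // Same shape as the <value> element: one class name per indented line.
    String value = "\n"
        + "    org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider,\n"
        + "    org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,\n"
        + "    software.amazon.awssdk.auth.credentials.EnvironmentVariableCredentialsProvider,\n"
        + "    org.apache.hadoop.fs.s3a.auth.IAMInstanceCredentialsProvider\n"
        + "  ";
    parseClassNames(value).forEach(System.out::println);
  }
}
```

The order of the returned list matches the order in the configuration value.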

hadoop-project/pom.xml

Lines changed: 13 additions & 1 deletion
@@ -189,6 +189,7 @@
     <aws-java-sdk.version>1.12.367</aws-java-sdk.version>
     <hsqldb.version>2.7.1</hsqldb.version>
     <aws-java-sdk-v2.version>2.19.12</aws-java-sdk-v2.version>
+    <aws.evenstream.version>1.0.1</aws.evenstream.version>
     <awscrt.version>0.21.0</awscrt.version>
     <frontend-maven-plugin.version>1.11.2</frontend-maven-plugin.version>
     <jasmine-maven-plugin.version>2.1</jasmine-maven-plugin.version>
@@ -1111,18 +1112,29 @@
         <groupId>com.amazonaws</groupId>
         <artifactId>aws-java-sdk-core</artifactId>
         <version>${aws-java-sdk.version}</version>
+        <exclusions>
+          <exclusion>
+            <groupId>*</groupId>
+            <artifactId>*</artifactId>
+          </exclusion>
+        </exclusions>
       </dependency>
       <dependency>
         <groupId>software.amazon.awssdk</groupId>
         <artifactId>bundle</artifactId>
         <version>${aws-java-sdk-v2.version}</version>
         <exclusions>
           <exclusion>
-            <groupId>io.netty</groupId>
+            <groupId>*</groupId>
             <artifactId>*</artifactId>
           </exclusion>
         </exclusions>
       </dependency>
+      <dependency>
+        <groupId>software.amazon.eventstream</groupId>
+        <artifactId>eventstream</artifactId>
+        <version>${aws.evenstream.version}</version>
+      </dependency>
       <dependency>
         <groupId>software.amazon.awssdk.crt</groupId>
         <artifactId>aws-crt</artifactId>

hadoop-tools/hadoop-aws/dev-support/findbugs-exclude.xml

Lines changed: 5 additions & 0 deletions
@@ -64,6 +64,11 @@
     <Field name="futurePool"/>
     <Bug pattern="IS2_INCONSISTENT_SYNC"/>
   </Match>
+  <Match>
+    <Class name="org.apache.hadoop.fs.s3a.S3AFileSystem"/>
+    <Field name="s3AsyncClient"/>
+    <Bug pattern="IS2_INCONSISTENT_SYNC"/>
+  </Match>
   <Match>
     <Class name="org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$BucketInfo"/>
     <Method name="run"/>

hadoop-tools/hadoop-aws/pom.xml

Lines changed: 20 additions & 2 deletions
@@ -463,6 +463,16 @@
                     <bannedImport>org.apache.hadoop.mapred.**</bannedImport>
                   </bannedImports>
                 </restrictImports>
+                <restrictImports>
+                  <includeTestCode>false</includeTestCode>
+                  <reason>Restrict AWS v1 imports to adapter code</reason>
+                  <exclusions>
+                    <exclusion>org.apache.hadoop.fs.s3a.adapter.V1ToV2AwsCredentialProviderAdapter</exclusion>
+                  </exclusions>
+                  <bannedImports>
+                    <bannedImport>com.amazonaws.**</bannedImport>
+                  </bannedImports>
+                </restrictImports>
               </rules>
             </configuration>
           </execution>
@@ -483,10 +493,14 @@
       <scope>test</scope>
       <type>test-jar</type>
     </dependency>
+
+    <!-- The v1 SDK is used at compilation time for adapter classes in
+         org.apache.hadoop.fs.s3a.adapter. It is not needed at runtime
+         unless a non-standard v1 credential provider is declared. -->
     <dependency>
       <groupId>com.amazonaws</groupId>
       <artifactId>aws-java-sdk-core</artifactId>
-      <scope>compile</scope>
+      <scope>provided</scope>
     </dependency>
     <dependency>
       <groupId>software.amazon.awssdk</groupId>
@@ -496,7 +510,11 @@
     <dependency>
       <groupId>software.amazon.awssdk.crt</groupId>
       <artifactId>aws-crt</artifactId>
-      <scope>compile</scope>
+    </dependency>
+    <dependency>
+      <groupId>software.amazon.eventstream</groupId>
+      <artifactId>eventstream</artifactId>
+      <scope>test</scope>
     </dependency>
     <dependency>
       <groupId>org.assertj</groupId>

hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/AWSCredentialProviderList.java

Lines changed: 14 additions & 51 deletions
@@ -27,27 +27,21 @@
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.stream.Collectors;
 
-import com.amazonaws.auth.AWSCredentials;
-import com.amazonaws.auth.AWSCredentialsProvider;
-import com.amazonaws.auth.BasicAWSCredentials;
-import com.amazonaws.auth.BasicSessionCredentials;
-import org.apache.hadoop.fs.s3a.adapter.V1V2AwsCredentialProviderAdapter;
-import org.apache.hadoop.thirdparty.com.google.common.annotations.VisibleForTesting;
-import org.apache.hadoop.thirdparty.com.google.common.base.Preconditions;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
 import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
 import org.apache.hadoop.classification.InterfaceAudience;
 import org.apache.hadoop.classification.InterfaceStability;
 import org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException;
 import org.apache.hadoop.fs.s3a.auth.NoAwsCredentialsException;
 import org.apache.hadoop.io.IOUtils;
+import org.apache.hadoop.util.Preconditions;
 
 import software.amazon.awssdk.auth.credentials.AnonymousCredentialsProvider;
 import software.amazon.awssdk.auth.credentials.AwsCredentials;
 import software.amazon.awssdk.auth.credentials.AwsCredentialsProvider;
-import software.amazon.awssdk.auth.credentials.AwsSessionCredentials;
 import software.amazon.awssdk.core.exception.SdkException;
 
 /**
@@ -105,23 +99,8 @@ public AWSCredentialProviderList() {
    * @param providers provider list.
    */
   public AWSCredentialProviderList(
-      Collection<AWSCredentialsProvider> providers) {
-    for (AWSCredentialsProvider provider: providers) {
-      this.providers.add(V1V2AwsCredentialProviderAdapter.adapt(provider));
-    }
-  }
-
-  /**
-   * Create with an initial list of providers.
-   * @param name name for error messages, may be ""
-   * @param providerArgs provider list.
-   */
-  public AWSCredentialProviderList(final String name,
-      final AWSCredentialsProvider... providerArgs) {
-    setName(name);
-    for (AWSCredentialsProvider provider: providerArgs) {
-      this.providers.add(V1V2AwsCredentialProviderAdapter.adapt(provider));
-    }
+      Collection<AwsCredentialsProvider> providers) {
+    this.providers.addAll(providers);
   }
 
   /**
@@ -147,14 +126,6 @@ public void setName(final String name) {
     }
   }
 
-  /**
-   * Add a new provider.
-   * @param provider provider
-   */
-  public void add(AWSCredentialsProvider provider) {
-    providers.add(V1V2AwsCredentialProviderAdapter.adapt(provider));
-  }
-
   /**
    * Add a new SDK V2 provider.
    * @param provider provider
@@ -163,7 +134,6 @@ public void add(AwsCredentialsProvider provider) {
     providers.add(provider);
   }
 
-
   /**
    * Add all providers from another list to this one.
    * @param other the other list.
@@ -173,19 +143,11 @@ public void addAll(AWSCredentialProviderList other) {
   }
 
   /**
-   * This method will get credentials using SDK V2's resolveCredentials and then convert it into
-   * V1 credentials. This required by delegation token binding classes.
-   * @return SDK V1 credentials
+   * Was an implementation of the v1 refresh; now just
+   * a no-op.
    */
-  public AWSCredentials getCredentials() {
-    AwsCredentials credentials = resolveCredentials();
-    if (credentials instanceof AwsSessionCredentials) {
-      return new BasicSessionCredentials(credentials.accessKeyId(),
-          credentials.secretAccessKey(),
-          ((AwsSessionCredentials) credentials).sessionToken());
-    } else {
-      return new BasicAWSCredentials(credentials.accessKeyId(), credentials.secretAccessKey());
-    }
+  @Deprecated
+  public void refresh() {
   }
 
   /**
@@ -256,8 +218,7 @@ public AwsCredentials resolveCredentials() {
    *
    * @return providers
    */
-  @VisibleForTesting
-  List<AwsCredentialsProvider> getProviders() {
+  public List<AwsCredentialsProvider> getProviders() {
     return providers;
   }
 
@@ -289,9 +250,11 @@ public String listProviderNames() {
    */
   @Override
   public String toString() {
-    return "AWSCredentialProviderList[" +
-        name +
-        "refcount= " + refCount.get() + ": [" +
+    return "AWSCredentialProviderList"
+        + " name=" + name
+        + "; refcount= " + refCount.get()
+        + "; size="+ providers.size()
+        + ": [" +
         StringUtils.join(providers, ", ") + ']'
         + (lastProvider != null ? (" last provider: " + lastProvider) : "");
   }
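Note: AWSCredentialProviderList holds an ordered list of providers, and its toString() reports the list size and the last provider used. The general idea of such a chain is sketched below with JDK types only. This is an assumption about the pattern, not a copy of the class: ProviderChainSketch, Source, resolve and lastSourceName are hypothetical names, credentials are reduced to a plain String, and an empty Optional stands in for a provider that has nothing to offer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

/** Illustrative only; all names here are hypothetical, not hadoop-aws classes. */
class ProviderChainSketch {

  /** A named credential source; an empty Optional means "nothing to offer". */
  private static final class Source {
    final String name;
    final Supplier<Optional<String>> lookup;

    Source(String name, Supplier<Optional<String>> lookup) {
      this.name = name;
      this.lookup = lookup;
    }
  }

  private final List<Source> sources = new ArrayList<>();
  private Source lastSource;

  void add(String name, Supplier<Optional<String>> lookup) {
    sources.add(new Source(name, lookup));
  }

  /** Queries the sources in insertion order and returns the first value found. */
  String resolve() {
    for (Source source : sources) {
      Optional<String> value = source.lookup.get();
      if (value.isPresent()) {
        lastSource = source;
        return value.get();
      }
    }
    throw new IllegalStateException(
        "No credentials from " + sources.size() + " source(s)");
  }

  /** Name of the source that answered the latest successful resolve(), or null. */
  String lastSourceName() {
    return lastSource == null ? null : lastSource.name;
  }

  @Override
  public String toString() {
    return "ProviderChainSketch size=" + sources.size()
        + (lastSource != null ? " last source: " + lastSource.name : "");
  }

  public static void main(String[] args) {
    ProviderChainSketch chain = new ProviderChainSketch();
    chain.add("empty", Optional::empty);
    chain.add("static", () -> Optional.of("example-key"));
    System.out.println(chain.resolve());
    System.out.println(chain);
  }
}
```

Recording which source answered makes a later log line or toString() output more useful when several providers are configured.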

hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java

Lines changed: 8 additions & 0 deletions
@@ -727,11 +727,19 @@ private Constants() {
   public static final String STREAM_READ_GAUGE_INPUT_POLICY =
       "stream_read_gauge_input_policy";
 
+  /**
+   * S3 Client Factory implementation class: {@value}.
+   * Unstable and incompatible between v1 and v2 SDK versions.
+   */
   @InterfaceAudience.Private
   @InterfaceStability.Unstable
   public static final String S3_CLIENT_FACTORY_IMPL =
       "fs.s3a.s3.client.factory.impl";
 
+  /**
+   * Default factory:
+   * {@code org.apache.hadoop.fs.s3a.DefaultS3ClientFactory}.
+   */
   @InterfaceAudience.Private
   @InterfaceStability.Unstable
   public static final Class<? extends S3ClientFactory>
