-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-19286: S3A: Support cross region access when S3 region/endpoint is set #7067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🎊 +1 overall
This message was automatically generated. |
@ahmarsuhail @steveloughran Could you please review the changes ? |
final Configuration newConf = new Configuration(getConfiguration()); | ||
// skip the test if the region is eu-west-2 | ||
String region = getFileSystem().getS3AInternals().getBucketMetadata().bucketRegion(); | ||
if (EU_WEST_2.equals(region)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I'd like a different region here a that is my region and I would like the test coverage.
- what happens with third party stores?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Ack.
- As per the doc (https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/s3-cross-region.html)
When you reference an existing bucket in a request, such as when you use the putObject method, the SDK initiates a request to the Region configured for the client.
If the bucket does not exist in that specific Region, the error response includes the actual Region where the bucket resides. The SDK then uses the correct Region in a second request.
To optimize future requests to the same bucket, the SDK caches this Region mapping in the client.
So as per the implementation, It looks like it won't be supported by thirdparty store and each store have to implement it separately.
// enable cross region access | ||
newConf.setBoolean(AWS_S3_CROSS_REGION_ACCESS_ENABLED, true); | ||
newConf.set(AWS_REGION, EU_WEST_2); | ||
S3AFileSystem fs = new S3AFileSystem(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs to be in try-with-resources to close()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
final Configuration newConf = new Configuration(getConfiguration()); | ||
// enable cross region access | ||
newConf.setBoolean(AWS_S3_CROSS_REGION_ACCESS_ENABLED, true); | ||
newConf.set(AWS_REGION, EU_WEST_2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comments as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
🎊 +1 overall
This message was automatically generated. |
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AEndpointRegion.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AEndpointRegion.java
Show resolved
Hide resolved
🎊 +1 overall
This message was automatically generated. |
@steveloughran - I have addressed your comments regarding the integration tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are dynamically doing a probe, actually the reading comes back as "null" hadoop s3guard bucket-info prints this (tested with two different stores)
Bucket metadata from S3
Region null
Location Name null
Location Type null
I had in mind just having a constant string which we could check in the configuration; what do you have done discovers the information later, but it looks for that region==null actually it is more reliable.
Please change the crow to look for null value. Having the new constant is good as then we can declare that this is what you said to be excluded from other tests (conditional-write?) (actually, thinking of it, maybe we should have a feature flag enum. Not in this PR though.)
*/ | ||
private void skipCrossRegionTest() throws IOException { | ||
String region = getFileSystem().getS3AInternals().getBucketMetadata().bucketRegion(); | ||
if (SA_EAST_1.equals(region) || NON_AWS_REGION.equals(region)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please change the non-aws probe for just region == null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
💔 -1 overall
This message was automatically generated. |
The error seems unrelated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code all good, now needs documentation updates. We can't require people trying to set up their app to read the source code.
connecting.md is the place for this. thanks.
@steveloughran Sure, it was in my todo list, but missed somehow. Thanks for pointing this out. I have done the doc changes in |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
thanks. can you do a PR for the backport and rerun the tests, then I will merge |
…t is set (apache#7067) Adds new option s3a.cross.region.access.enabled Which is true by default This enables cross region access as a separate config and enable/disables it irrespective of region/endpoint is set. Contributed by Syed Shameerur Rahman
Thanks a lot @steveloughran for the detailed review and merge. |
Currently when S3 region nor endpoint is set, the default region is set to us-east-2 with cross region access enabled. But when region or endpoint is set, cross region access is not enabled.
The proposal here is to carves out cross region access as a separate config and enable/disable it irrespective of region/endpoint is set. This gives more flexibility to the user.
S3 cross region access can be enabled/disabled via config
fs.s3a.cross.region.access.enabled
which is set to true by default.Tested with us-east-1