Conversation

@virajjasani
Contributor

Jira: HADOOP-19044

@virajjasani
Contributor Author

Testing against us-west-2 in progress.

@virajjasani
Contributor Author

virajjasani commented Jan 22, 2024

mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch

Several tests are failing; the cause is being discussed on HADOOP-18975.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 50s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 33s trunk passed
+1 💚 compile 0m 42s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 31s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 javadoc 0m 25s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 11s trunk passed
+1 💚 shadedclient 39m 16s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 30s the patch passed
+1 💚 compile 0m 36s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 36s the patch passed
+1 💚 compile 0m 28s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 28s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 21s the patch passed
+1 💚 mvnsite 0m 34s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 9s the patch passed
+1 💚 shadedclient 38m 44s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 53s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 35s The patch does not generate ASF License warnings.
141m 28s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/1/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux f9737227e6bc 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 872dcac
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/1/testReport/
Max. process+thread count 526 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 11m 31s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 42m 28s trunk passed
+1 💚 compile 0m 41s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 29s trunk passed
+1 💚 mvnsite 0m 39s trunk passed
+1 💚 javadoc 0m 25s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 30s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s trunk passed
+1 💚 shadedclient 33m 24s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 34s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 34s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 18s the patch passed
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 8s the patch passed
+1 💚 shadedclient 33m 57s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 14s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
137m 15s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/2/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 82432cd62ca6 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / d8c9793
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/2/testReport/
Max. process+thread count 653 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 36s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 42m 43s trunk passed
+1 💚 compile 0m 39s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 31s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 38s trunk passed
+1 💚 javadoc 0m 25s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 4s trunk passed
+1 💚 shadedclient 33m 40s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 28s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 33s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 18s the patch passed
+1 💚 mvnsite 0m 30s the patch passed
+1 💚 javadoc 0m 14s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s the patch passed
+1 💚 shadedclient 33m 43s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 13s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
126m 24s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/3/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux e36bac675ddd 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / cefb30b
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/3/testReport/
Max. process+thread count 640 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 34s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 40m 58s trunk passed
+1 💚 compile 0m 41s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 32s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 5s trunk passed
+1 💚 shadedclient 32m 10s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 28s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 33s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 21s the patch passed
+1 💚 mvnsite 0m 29s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s the patch passed
+1 💚 shadedclient 32m 4s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 1s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
121m 25s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/4/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 3870a1365bc6 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 81d345b
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/4/testReport/
Max. process+thread count 551 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor

@HarshitGupta11 has a PR for this too. Harshit, can you share yours?

also I'd like @ahmarsuhail to review this as when it comes to region stuff and v2 sdk I'm still permanently confused about the corner cases. Could make it a job interview question for anyone claiming deep knowledge of AWS

Contributor

@ahmarsuhail ahmarsuhail left a comment

thanks @virajjasani, left some comments.

Prefer this to @HarshitGupta11's PR as I think that sets the region to US_EAST_2 whenever endpoint=s3.amazonaws.com, regardless of what is in fs.s3a.endpoint.region.

origin = "SDK region chain";
}

if (endpointStr != null && endpointStr.endsWith(CENTRAL_ENDPOINT)) {
Contributor

instead of this, I think what we should do is on line 307, where it says:

    if (region != null) {
      builder.region(region);
    } 

do

    if (region != null) {
      builder.region(region);
      if (region == US_EAST_1 && endpointStr != null && endpointStr.endsWith(CENTRAL_ENDPOINT)) {
        builder.crossRegionAccessEnabled(true);
        LOG.debug("Enabling cross region access for endpoint {}", endpointStr);
      }
    }

You will only get into that if block if you haven't set fs.s3a.endpoint.region and a region could be determined from your fs.s3a.endpoint, which will happen in this case, as getS3RegionFromEndpoint will return US_EAST_1 for s3.amazonaws.com.

Currently we are setting cross region enabled whenever fs.s3a.endpoint = s3.amazonaws.com, even if you know your region and you've set it correctly in fs.s3a.endpoint.region.
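
For readers following along, here is a minimal, self-contained sketch of the endpoint-to-region step this comment relies on. It is not the hadoop-aws `getS3RegionFromEndpoint` implementation: the class name `EndpointRegionSketch`, the method `regionFromEndpoint`, and the regular expression are all assumptions made for illustration. The only behaviour taken from the thread is that the central endpoint `s3.amazonaws.com` maps to `us-east-1`.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative stand-in only; names and parsing rules are hypothetical. */
final class EndpointRegionSketch {

  /** The central (global) S3 endpoint discussed in this thread. */
  static final String CENTRAL_ENDPOINT = "s3.amazonaws.com";

  // Assumption: regional endpoints look like "s3.<region>.amazonaws.com"
  // or "s3-<region>.amazonaws.com". Other shapes are not handled here.
  private static final Pattern REGIONAL =
      Pattern.compile("s3[.-]([a-z0-9-]+)\\.amazonaws\\.com$");

  /** Returns the region implied by the endpoint, or null if none can be derived. */
  static String regionFromEndpoint(String endpoint) {
    if (endpoint == null || endpoint.isEmpty()) {
      return null;
    }
    String host = endpoint.toLowerCase(Locale.ROOT);
    if (host.endsWith(CENTRAL_ENDPOINT)) {
      // As stated in the comment above: the central endpoint resolves to us-east-1.
      return "us-east-1";
    }
    Matcher m = REGIONAL.matcher(host);
    return m.find() ? m.group(1) : null;
  }

  public static void main(String[] args) {
    System.out.println(regionFromEndpoint("s3.amazonaws.com"));
    System.out.println(regionFromEndpoint("s3.us-west-2.amazonaws.com"));
  }
}
```

Because the central endpoint always yields a non-null region in a scheme like this, a `region != null` check alone cannot tell "region was configured" apart from "region was guessed from the endpoint", which is the distinction the comment is asking for.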

Contributor

Let's also set the region to US_EAST_2 when enabling cross region. When I was doing this work a few months ago, I found that cross region with US_EAST_1 showed some weird behaviours. I didn't dive into them at the time. Everything worked as expected with US_EAST_2.

Contributor Author

Interesting!
You mean that with US_EAST_1 cross region, we have intermittent issues while accessing a bucket from another region? Or do we always have issues?

Contributor

Yeah, I saw issues: 403s with the landsat buckets, IIRC. Could be related to this. I don't know if those issues still exist; it could be worth testing with US_EAST_1 to see. It shouldn't make a difference what we set though: if you haven't got the correct region, the SDK will make a call to figure out the correct region and then cache it.

Contributor Author

Does this look good for the source changes in 05225bb?

Contributor

The landsat bucket seems to have been closed off; we will need to move off it, but the replacement must be something other than us-east so it stresses more of the system.

public class ITestS3ACrossRegionAccess extends AbstractS3ATestBase {

@Test
public void testCentralEndpointCrossRegionAccess() throws Throwable {
Contributor

Might be better to move this into ITestS3AEndpointRegion instead of creating a new test class?

Contributor Author

In that test, we are not trying to create/write/read a file using the fs, so I thought of keeping this separate. Without the source change, this test fails during mkdir with a 400.


newConf.set(ENDPOINT, CENTRAL_ENDPOINT);

try (S3AFileSystem newFs = new S3AFileSystem()) {
Contributor

If you look at the tests in ITestS3AEndpointRegion you can see how we do it there: instead of creating anything, we just intercept the request and check that things are being set correctly.

For this, what we want to see is that if the central endpoint is configured and no region is configured, the region gets set to US_EAST_2 for cross region.

But if the central endpoint is configured and the region is configured to US_EAST_1, then the region is US_EAST_1; that is, the region config takes precedence. See testCentralEndpoint in that class, which does something similar.
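
The two expectations in this comment can be written down as a small decision function. The sketch below is one reading of the review discussion, not the code in the patch: the class `RegionResolutionSketch`, the `Resolved` holder, and `resolve` are hypothetical names, and regions are modelled as plain strings rather than SDK `Region` objects. Any case not covered by the comment returns a null region to signal "not modelled here".

```java
/** Illustrative model of the precedence described above; all names are hypothetical. */
final class RegionResolutionSketch {

  static final String CENTRAL_ENDPOINT = "s3.amazonaws.com";
  static final String US_EAST_2 = "us-east-2";

  /** Outcome of the resolution: the region to sign with and the cross-region flag. */
  static final class Resolved {
    final String region;
    final boolean crossRegionAccess;

    Resolved(String region, boolean crossRegionAccess) {
      this.region = region;
      this.crossRegionAccess = crossRegionAccess;
    }
  }

  static Resolved resolve(String endpoint, String configuredRegion) {
    // An explicitly configured region takes precedence over anything
    // derived from the endpoint; cross-region access is left off.
    if (configuredRegion != null && !configuredRegion.isEmpty()) {
      return new Resolved(configuredRegion, false);
    }
    // Central endpoint with no configured region: use us-east-2 and
    // enable cross-region access, as proposed in this thread.
    if (endpoint != null && endpoint.endsWith(CENTRAL_ENDPOINT)) {
      return new Resolved(US_EAST_2, true);
    }
    // Other combinations are outside what this comment covers.
    return new Resolved(null, false);
  }

  public static void main(String[] args) {
    Resolved r = resolve(CENTRAL_ENDPOINT, null);
    System.out.println(r.region + " crossRegion=" + r.crossRegionAccess);
  }
}
```

An interceptor-based test of the kind mentioned above would assert on exactly these two outputs, without issuing a real request.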

Contributor Author

I checked that test, but since there was no real fs operation involved, I thought of keeping this as a separate test. Does that work?
I can keep this test class as is and maybe write one more test in ITestS3AEndpointRegion as well, without much file operation?

Contributor

Do we need to do a mkdir to verify the behaviour though? Won't just doing a headBucket also work? I think that would also fail without your change.

Contributor Author

Since we have our custom RegionInterceptor, we will not be able to make a real request anyway, right? So regardless of whether we enable cross region access, ITestS3AEndpointRegion is not going to help make a real head request. That's why I thought of creating a new class where, regardless of the endpoint/region used to create the bucket, a new fs with the central endpoint is able to perform file operations on the bucket.

Contributor Author

Keeping the stacktrace here for reference:

org.apache.hadoop.fs.s3a.AWSBadRequestException: getFileStatus on s3a://${bucket}/user/vjasani/basePath-testCentralEndpointCrossRegionAccess/srcdir: software.amazon.awssdk.services.s3.model.S3Exception: The authorization header is malformed; the region 'us-east-2' is wrong; expecting 'us-west-2' (Service: S3, Status Code: 400, Request ID: G85CNFC579T4MJ76, Extended Request ID: xrYGGqXdYtr72cYyFN3v4yemDxBCYkdt8mYd8cGItNhdx1EmZMLxMhwJTwzmWZT6ershid/WT4w=):AuthorizationHeaderMalformed: The authorization header is malformed; the region 'us-east-2' is wrong; expecting 'us-west-2' (Service: S3, Status Code: 400, Request ID: G85CNFC579T4MJ76, Extended Request ID: xrYGGqXdYtr72cYyFN3v4yemDxBCYkdt8mYd8cGItNhdx1EmZMLxMhwJTwzmWZT6ershid/WT4w=)

	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:259)
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:154)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4075)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3934)
	at org.apache.hadoop.fs.s3a.S3AFileSystem$MkdirOperationCallbacksImpl.probePathStatus(S3AFileSystem.java:3806)
	at org.apache.hadoop.fs.s3a.impl.MkdirOperation.probePathStatusOrNull(MkdirOperation.java:173)
	at org.apache.hadoop.fs.s3a.impl.MkdirOperation.getPathStatusExpectingDir(MkdirOperation.java:194)
	at org.apache.hadoop.fs.s3a.impl.MkdirOperation.execute(MkdirOperation.java:108)
	at org.apache.hadoop.fs.s3a.impl.MkdirOperation.execute(MkdirOperation.java:57)
	at org.apache.hadoop.fs.s3a.impl.ExecutingStoreOperation.apply(ExecutingStoreOperation.java:76)
	at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
	at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
	at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2719)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2738)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:3778)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2494)
	at org.apache.hadoop.fs.s3a.ITestS3ACrossRegionAccess.testCentralEndpointCrossRegionAccess(ITestS3ACrossRegionAccess.java:54)
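
When triaging a failure like this one, the two regions can be pulled out of the message text. The sketch below assumes the wording seen in the trace above ("the region '...' is wrong; expecting '...'"); that wording is an observation from this one error, not a documented contract, and the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper; the message format is assumed from the trace above. */
final class RegionMismatchSketch {

  private static final Pattern MISMATCH =
      Pattern.compile("the region '([^']+)' is wrong; expecting '([^']+)'");

  /** Region the request was signed for, if the message has the assumed shape. */
  static Optional<String> signedRegion(String message) {
    return group(message, 1);
  }

  /** Region the service says it expected, if the message has the assumed shape. */
  static Optional<String> expectedRegion(String message) {
    return group(message, 2);
  }

  private static Optional<String> group(String message, int index) {
    if (message == null) {
      return Optional.empty();
    }
    Matcher m = MISMATCH.matcher(message);
    return m.find() ? Optional.of(m.group(index)) : Optional.empty();
  }

  public static void main(String[] args) {
    String msg = "The authorization header is malformed; "
        + "the region 'us-east-2' is wrong; expecting 'us-west-2'";
    System.out.println(signedRegion(msg).orElse("?")
        + " -> " + expectedRegion(msg).orElse("?"));
  }
}
```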

Contributor

Sorry to nitpick here, but ITestS3AEndpointRegion also currently has a test that uses the FS; see testWithoutRegionConfig.

You can add another test there that does something like:

    Configuration conf = getConfiguration();
    removeBaseAndBucketOverrides(conf, ENDPOINT, AWS_REGION);
    conf.set(ENDPOINT, CENTRAL_ENDPOINT);
    
    newFS = new S3AFileSystem();
    newFS.initialize(getFileSystem().getUri(), conf);

    newFS.create(methodPath()).close();

This will fail without your source changes, but passes with them.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 49s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 56s trunk passed
+1 💚 compile 0m 42s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 31s trunk passed
+1 💚 mvnsite 0m 41s trunk passed
+1 💚 javadoc 0m 27s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 7s trunk passed
+1 💚 shadedclient 37m 41s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 30s the patch passed
+1 💚 compile 0m 36s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 36s the patch passed
+1 💚 compile 0m 27s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 27s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 19s the patch passed
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 9s the patch passed
+1 💚 shadedclient 37m 49s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 56s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 33s The patch does not generate ASF License warnings.
139m 24s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/5/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux ff4eb72e6a5b 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 05225bb
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/5/testReport/
Max. process+thread count 527 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

-      // endpoint is for US_EAST_1;
-      return Region.US_EAST_1;
+      // endpoint is for US_EAST_2;
+      return Region.US_EAST_2;
Contributor

Changing this causes confusion. Maybe it's better to use the variable present in Constants.

@mukund-thakur
Contributor

also I'd like @ahmarsuhail to review this as when it comes to region stuff and v2 sdk I'm still permanently confused about the corner cases. Could make it a job interview question for anyone claiming deep knowledge of AWS

lol, I was going to comment something similar. This is the most confusing code in the whole hadoop-aws module.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 49s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 43s trunk passed
+1 💚 compile 0m 43s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 41s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 8s trunk passed
+1 💚 shadedclient 38m 14s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 30s the patch passed
+1 💚 compile 0m 36s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 36s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 20s the patch passed
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s the patch passed
+1 💚 shadedclient 37m 46s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 53s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
139m 30s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/6/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 55afed483dee 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2c653da
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/6/testReport/
Max. process+thread count 527 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 34s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 45m 12s trunk passed
+1 💚 compile 0m 41s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 39s trunk passed
+1 💚 javadoc 0m 25s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 11s trunk passed
+1 💚 shadedclient 33m 46s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 28s the patch passed
+1 💚 compile 0m 35s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 35s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 20s the patch passed
+1 💚 mvnsite 0m 32s the patch passed
+1 💚 javadoc 0m 14s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 7s the patch passed
+1 💚 shadedclient 36m 5s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 10s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 36s The patch does not generate ASF License warnings.
131m 27s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/7/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 0c4c880a3f74 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 99cf2d4
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/7/testReport/
Max. process+thread count 710 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

* The special S3 region which can be used to talk to any bucket.
* Value {@value}.
*/
public static final String AWS_S3_CENTRAL_REGION = "us-east-1";
Contributor

there is already a default region below.

Contributor Author

kept both constants, with AWS_S3_CENTRAL_REGION = AWS_S3_DEFAULT_REGION

Contributor

let's not change these constants. AWS_S3_CENTRAL_REGION should be us-east-1 as that is the central/global region.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 36s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 41m 29s trunk passed
+1 💚 compile 0m 40s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 32s trunk passed
+1 💚 mvnsite 0m 39s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 4s trunk passed
+1 💚 shadedclient 32m 8s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 33s the patch passed
+1 💚 compile 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 19s the patch passed
+1 💚 mvnsite 0m 29s the patch passed
+1 💚 javadoc 0m 14s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s the patch passed
+1 💚 shadedclient 32m 18s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 59s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 35s The patch does not generate ASF License warnings.
122m 7s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/8/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 7857be08074e 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 12ff2f1
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/8/testReport/
Max. process+thread count 552 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/8/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 49s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 34s trunk passed
+1 💚 compile 0m 44s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 31s trunk passed
+1 💚 mvnsite 0m 41s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 8s trunk passed
+1 💚 shadedclient 37m 40s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 35s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 35s the patch passed
+1 💚 compile 0m 27s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 27s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 20s the patch passed
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 14s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s the patch passed
+1 💚 shadedclient 37m 56s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 59s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
138m 56s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/9/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux e43bfd045fdb 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / fcf56cf
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/9/testReport/
Max. process+thread count 554 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/9/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@ahmarsuhail ahmarsuhail left a comment

Nearly there, suggested some more minor changes.

* The special S3 region which can be used to talk to any bucket.
* Value {@value}.
*/
public static final String AWS_S3_CENTRAL_REGION = "us-east-1";
Contributor

let's not change these constants. AWS_S3_CENTRAL_REGION should be us-east-1 as that is the central/global region.

// endpoint is for US_EAST_1;
return Region.US_EAST_1;
// endpoint for central region
return Region.of(AWS_S3_CENTRAL_REGION);
Contributor

rather than changing constants, return the AWS_S3_DEFAULT_REGION here.


// endpoint is for US_EAST_1;
return Region.US_EAST_1;
// endpoint for central region
Contributor

let's add a detailed comment here now why we have to do this with a link to the SPARK issue. Something along the lines of "If no region or endpoint is set, Spark will set the endpoint to s3.amazonaws.com. Since we do not know the region at this point, use the default region and enable cross region access"


newConf.set(ENDPOINT, CENTRAL_ENDPOINT);

try (S3AFileSystem newFs = new S3AFileSystem()) {
Contributor

Sorry to nitpick here! but in ITestS3AEndpointRegion, there is also a test currently that uses the FS, see testWithoutRegionConfig.

You can add another test there that does something like:

    Configuration conf = getConfiguration();
    removeBaseAndBucketOverrides(conf, ENDPOINT, AWS_REGION);
    conf.set(ENDPOINT, CENTRAL_ENDPOINT);
    
    newFS = new S3AFileSystem();
    newFS.initialize(getFileSystem().getUri(), conf);

    newFS.create(methodPath()).close();

This will fail without your source changes, but passes with them.

Contributor

@ahmarsuhail ahmarsuhail left a comment

Some more comments based on the response on the SDK issue.

// No region was configured, try to determine it from the endpoint.
if (region == null) {
region = getS3RegionFromEndpoint(parameters.getEndpoint());
boolean endpointEndsWithCentral =
Contributor

so from the response on SDK issue, it looks like we don't want to override if endpoint is s3.amazonaws.com and region is not US_EAST_1.

Currently, if fs.s3a.endpoint is s3.amazonaws.com and fs.s3a.endpoint.region is eu-west-1 for example, we will end up overriding and end up with the same 400 errors.

What if we never override if endpoint is s3.amazonaws.com?

Contributor Author

What if we never override if endpoint is s3.amazonaws.com?

That sounds right, let me test with various combinations of endpoint and region before making the changes:

e.g.

  1. endpoint central and region null
  2. endpoint central and region anything other than us-east-2
  3. endpoint central and region us-east-2
  4. endpoint null and region null
  5. endpoint s3-us-east-2.amazonaws.com and region us-east-2 (and null)
  6. endpoint s3-us-east-1.amazonaws.com and region us-east-1 (and null)

@virajjasani
Contributor Author

virajjasani commented Jan 30, 2024

Tests with bucket created on us-west-2, and endpoint/region configured:

  1. endpoint central and region null
    able to perform all operations, as expected

  2. endpoint central and region set to us-west-1
    able to perform all operations, as expected (because now we ignore region configured for central endpoint)

  3. endpoint central and region us-east-2
    able to perform all operations, as expected

  4. endpoint null and region null
    able to perform all operations, as expected (default region is also us-east-2 with cross-region access)

  5. endpoint s3-us-east-2.amazonaws.com and region us-east-2 (and null)
    unable to perform any operation, as expected (no central endpoint, no cross-region access)

  6. endpoint s3-us-west-2.amazonaws.com and region us-west-2 (and null)
    able to perform all operations, as expected (no central endpoint, no cross-region access)

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 12m 47s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 42m 50s trunk passed
+1 💚 compile 0m 42s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 41s trunk passed
+1 💚 javadoc 0m 28s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 34s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 8s trunk passed
+1 💚 shadedclient 32m 37s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 34s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 34s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 1s The patch has no blanks issues.
+1 💚 checkstyle 0m 19s the patch passed
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 16s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 7s the patch passed
+1 💚 shadedclient 32m 34s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 13s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 35s The patch does not generate ASF License warnings.
136m 56s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/15/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets
uname Linux 9dae1381ea3d 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 5fdb23d
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/15/testReport/
Max. process+thread count 699 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/15/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@ahmarsuhail ahmarsuhail left a comment

thanks a lot for the effort on this so far @virajjasani, really appreciated. sorry this is having to go through so many revisions and testing phases. just suggested a change to the logic, let me know what you think

describe("Create a client with the central endpoint but also specify region");
Configuration conf = getConfiguration();

S3Client client = createS3Client(conf, CENTRAL_ENDPOINT, US_WEST_2, US_EAST_2, false);
Contributor

@ahmarsuhail ahmarsuhail Jan 30, 2024

for example here, if configured region is US_WEST_2, expected region should also be US_WEST_2, not US_EAST_2

Contributor

as in #6466 I'm going to propose we make the static methods accessible and unit tests to validate them, because

  • this stuff is so important and complicated we need it running on every pr
  • everyone's ITest setup is different, so may miss things.
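
As a rough illustration of what such a unit test could look like, here is a minimal table-driven sketch. It assumes a hypothetical static helper `regionFromEndpoint` and an assumed endpoint shape (`s3-<region>.amazonaws.com` or `s3.<region>.amazonaws.com`); neither the class name nor the regex is taken from the S3A source, so treat it as a pattern rather than the actual method under test.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical sketch: a pure static helper plus a table of expected
 * results. No S3 client is created, so it can run on every PR.
 */
final class EndpointRegionTableSketch {

  static final String CENTRAL_ENDPOINT = "s3.amazonaws.com";

  // Assumed endpoint shapes: s3-<region>.amazonaws.com
  // and s3.<region>.amazonaws.com
  private static final Pattern REGIONAL =
      Pattern.compile("^s3[.-]([a-z0-9-]+)\\.amazonaws\\.com$");

  /** Returns the region in the endpoint, or null if none can be parsed. */
  static String regionFromEndpoint(String endpoint) {
    if (endpoint == null || endpoint.equals(CENTRAL_ENDPOINT)) {
      return null;
    }
    Matcher m = REGIONAL.matcher(endpoint);
    return m.matches() ? m.group(1) : null;
  }

  /** Endpoint to expected-region table; null means "cannot be parsed". */
  static Map<String, String> cases() {
    Map<String, String> table = new LinkedHashMap<>();
    table.put("s3.amazonaws.com", null);
    table.put("s3-us-east-2.amazonaws.com", "us-east-2");
    table.put("s3-us-west-2.amazonaws.com", "us-west-2");
    table.put("s3.eu-west-1.amazonaws.com", "eu-west-1");
    table.put("storage.example.org", null);
    return table;
  }

  /** Returns the number of rows whose actual result differs. */
  static int failures() {
    int failed = 0;
    for (Map.Entry<String, String> e : cases().entrySet()) {
      String actual = regionFromEndpoint(e.getKey());
      if (!Objects.equals(e.getValue(), actual)) {
        failed++;
      }
    }
    return failed;
  }
}
```

New endpoint shapes (VPCE, third party stores) would then be extra rows in `cases()` rather than new ITests.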

endpointStr.endsWith(CENTRAL_ENDPOINT);
// No region was configured or the endpoint is central,
// determine the region from the endpoint.
if (region == null || endpointEndsWithCentral) {
Contributor

hmm, not sure about this. now we're parsing region if region is null or endpoint = s3.amazonaws.com.

So if you set s3.amazonaws.com and region to eu-west-2, you still end up with us setting the region to us-east-2 and cross region enabled. My thinking here is that a lot of people may have endpoint set to s3.amazonaws.com (as at least with SDK V1 it was harmless to do that, I think).

we only want to get into this parsing if region == null, so let's revert to the previous condition here. And then we never want to override the endpoint if it is s3.amazonaws.com. Suggested:

    if (endpoint != null) {
      checkArgument(!fipsEnabled,
          "%s : %s", ERROR_ENDPOINT_WITH_FIPS, endpoint);
      boolean endpointEndsWithCentral =
          endpointStr.endsWith(CENTRAL_ENDPOINT);
      // No region was configured,
      // determine the region from the endpoint.
      if (region == null) {
        region = getS3RegionFromEndpoint(endpointStr,
            endpointEndsWithCentral);
        if (region != null) {
          origin = "endpoint";
          if (endpointEndsWithCentral) {
            builder.crossRegionAccessEnabled(true);
            origin = "origin with cross region access";
            LOG.debug("Enabling cross region access for endpoint {}",
                endpointStr);
          }
        }
      }
      
      // No need to override endpoint with "s3.amazonaws.com".
      // Let the client take care of endpoint resolution. Overriding
      // the endpoint with "s3.amazonaws.com" causes 400 Bad Request
      // errors for non-existent buckets and objects.
      // ref: https://github.com/aws/aws-sdk-java-v2/issues/4846
      if (!endpointEndsWithCentral) {
        builder.endpointOverride(endpoint);
        LOG.debug("Setting endpoint to {}", endpoint);
      }
    }

So now:

  1. if endpoint = s3.amazonaws.com and region is null, set to US_EAST_2 and enable cross region, and don't override endpoint.
  2. if endpoint = s3.amazonaws.com and region is set (eg to eu-west-1), set region but do not override endpoint; let the SDK figure it out.
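
To make the two cases easier to reason about, here is a minimal sketch of the decision as a pure function. The class, the `Decision` type and the `decide` method are hypothetical names invented for illustration and are not part of the S3A code; the `us-east-2` fallback is taken from the list above, and parsing a region out of a non-central endpoint is deliberately left out.

```java
/**
 * Hypothetical, simplified model of the two cases listed above.
 * This is NOT the real S3A client factory; all names are illustrative.
 */
final class CentralEndpointSketch {

  static final String CENTRAL_ENDPOINT = "s3.amazonaws.com";
  // Fallback region taken from the list above (US_EAST_2).
  static final String FALLBACK_REGION = "us-east-2";

  /** What the client builder would be asked to do. */
  static final class Decision {
    final String region;             // region to configure; may be null
    final boolean crossRegionAccess; // enable cross-region access?
    final boolean overrideEndpoint;  // call endpointOverride(...)?

    Decision(String region, boolean crossRegionAccess,
        boolean overrideEndpoint) {
      this.region = region;
      this.crossRegionAccess = crossRegionAccess;
      this.overrideEndpoint = overrideEndpoint;
    }
  }

  static Decision decide(String endpoint, String configuredRegion) {
    boolean central =
        endpoint != null && endpoint.endsWith(CENTRAL_ENDPOINT);
    String region = configuredRegion;
    boolean crossRegion = false;
    if (region == null && central) {
      // Case 1: region unknown, so use the fallback and let
      // cross-region access locate the bucket.
      region = FALLBACK_REGION;
      crossRegion = true;
    }
    // Cases 1 and 2: the central endpoint is never used as an override.
    // Parsing a region out of a non-central endpoint is omitted here.
    boolean override = endpoint != null && !central;
    return new Decision(region, crossRegion, override);
  }
}
```

Under these assumptions, `decide("s3.amazonaws.com", "eu-west-1")` keeps the configured region, leaves cross-region access off and skips the endpoint override.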

Contributor Author

@virajjasani virajjasani Jan 30, 2024

Even for user configured region, for sdk to figure out, we still need to enable cross region access right?

Contributor Author

Otherwise we will have same problem i suppose e.g. bucket on us-west-2 won't be accessible by central endpoint and us-west-1 combination. It will only be accessible by central endpoint and null region combination.

Contributor

no, the SDK will figure out the endpoint even if cross region is not enabled. cross region is only for when you don't know the region, so we set a random region and enable it. it doesn't affect endpoint resolution behaviour afaik

Contributor Author

Got it, will test this out today. Thanks a lot for the reviews!!

Contributor

I don't think anyone should set region=us-west-2 and endpoint = us-west-1 unless they like debugging things.

all we want is to handle situations where things are not set.

Contributor Author

I meant "central endpoint" with "us-west-1" region, to access bucket created on us-west-2. I will test out the combination. Thanks

Contributor Author

@virajjasani virajjasani Jan 30, 2024

So now:

  1. if endpoint = s3.amazonaws.com and region is null, set to US_EAST_2 and enable cross region, and don't override endpoint.
  2. if endpoint = s3.amazonaws.com and region is set (eg to eu-west-1), set region but do not override endpoint; let the SDK figure it out.

For case#2, if user sets region to eu-west-1, is it possible that bucket might be on different region? If the answer is yes and if we want to cover that case, we will have to enable cross-region access for endpoint = s3.amazonaws.com, regardless of the region set (null or any particular one).
We should cover this, otherwise we get redirect errors. I will update the PR.

@ahmarsuhail
Contributor

for no. 5,

> endpoint s3-us-east-2.amazonaws.com and region us-east-2 (and null)
> unable to perform any operation, as expected (no central endpoint, no cross-region access)

you should be able to perform all operations right?

@virajjasani
Contributor Author

for no. 5, > endpoint s3-us-east-2.amazonaws.com and region us-east-2 (and null)

unable to perform any operation, as expected (no central endpoint, no cross-region access)

you should be able to perform all operations right?

It's not central endpoint, and cross region access is also not enabled. Bucket is on us-west-2.

Contributor

@steveloughran steveloughran left a comment

this is getting complicated enough that I'm getting a headache. And I am already worrying about how I would field a support call related to this.

See #6466 for some pending VPCE changes too.

  1. the resolution algorithm needs to be written down in the markdown files for people who are not looking at the code to work out.
  2. we need unit tests of all the possible combinations. they don't need to create the client: given an endpoint/region combo, just create the client configuration builder, which we can make assertions on.

endpointStr.endsWith(CENTRAL_ENDPOINT);
// No region was configured or the endpoint is central,
// determine the region from the endpoint.
if (region == null || endpointEndsWithCentral) {
Contributor

I don't think anyone should set region=us-west-2 and endpoint = us-west-1 unless they like debugging things.

all we want is to handle situations where things are not set.

describe("Create a client with the central endpoint but also specify region");
Configuration conf = getConfiguration();

S3Client client = createS3Client(conf, CENTRAL_ENDPOINT, US_WEST_2, US_EAST_2, false);
Contributor

as in #6466 I'm going to propose we make the static methods accessible and unit tests to validate them, because

  • this stuff is so important and complicated we need it running on every pr
  • everyone's ITest setup is different, so may miss things.

public void testCentralEndpointCrossRegionAccess() throws Throwable {
describe("Create bucket on different region and access it using central endpoint");
final Configuration conf = getConfiguration();
removeBaseAndBucketOverrides(conf, ENDPOINT);
Contributor

what should region be set to here? either unset it or explicitly set it.

Contributor Author

Sure will set region here because null region is anyways covered below.

@virajjasani
Contributor Author

Tests with bucket created on us-west-2, and endpoint/region configured:

  1. endpoint central and region null
    able to perform all operations, as expected
  2. endpoint central and region set to us-west-1
    able to perform all operations, as expected (because now we ignore region configured for central endpoint)
  3. endpoint central and region us-east-2
    able to perform all operations, as expected
  4. endpoint null and region null
    able to perform all operations, as expected (default region is also us-east-2 with cross-region access)
  5. endpoint s3-us-east-2.amazonaws.com and region us-east-2 (and null)
    unable to perform any operation, as expected (no central endpoint, no cross-region access)
  6. endpoint s3-us-west-2.amazonaws.com and region us-west-2 (and null)
    able to perform all operations, as expected (no central endpoint, no cross-region access)

Repeated for several tests. FS ops outputs as expected.

@virajjasani
Contributor Author

Latest revision involves doc updates + tests that cover wider combination of central endpoint cases, that are expected to run for each PR without worrying about endpoint/region settings.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 33s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 42m 8s trunk passed
+1 💚 compile 0m 41s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 30s trunk passed
+1 💚 mvnsite 0m 42s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s trunk passed
+1 💚 shadedclient 33m 12s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 28s the patch passed
+1 💚 compile 0m 35s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 35s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 21s the patch passed
+1 💚 mvnsite 0m 29s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 8s the patch passed
+1 💚 shadedclient 32m 38s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 6s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 36s The patch does not generate ASF License warnings.
124m 28s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/16/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 7fea35d4eef3 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / b1f4df9
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/16/testReport/
Max. process+thread count 552 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/16/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@virajjasani
Contributor Author

Ran full test suite against endpoints: us-west-2 and s3.amazonaws.com:

mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch

Contributor

@ahmarsuhail ahmarsuhail left a comment

thanks @virajjasani , source looks good to me. just some minor comments about documentation and tests.

describe("Create a client with the central endpoint but also specify region");
Configuration conf = getConfiguration();

S3Client client = createS3Client(conf, CENTRAL_ENDPOINT, US_WEST_2, US_WEST_2, false);
Contributor

it's not clear to me what these different test cases are doing. looks like they all check that if you set the endpoint to central and also configure a region, it's always the configured region that gets set. do we really need all of them?

using this client. If the bucket is also in eu-west-2, then this will return a successful
response. Otherwise it will throw an error with status code 301 permanently moved. This error
contains the region of the bucket in its header, which we can then use to configure the client.

Contributor

A lot of what is in this doc is outdated, it was written when we raised the initial PR for the upgrade to help with reviewing. For example, we don't do the HEAD call to get the region of the bucket. We had to do this when crossRegion was not supported in SDK V2.

Let's create a new section in index.md which explains region handling, something like:

  • fs.s3a.endpoint and fs.s3a.endpoint.region can be used to set values for endpoint and region.
  • If a value is set in fs.s3a.endpoint.region, S3A will configure the S3 client to use this value. If this is set to a region that does not match your bucket, you will receive a 301 redirect response.
  • If no region is set, but an endpoint is set, S3A will attempt to parse your region from the endpoint.
  • If your endpoint is set to the central s3.amazonaws.com, S3A will enable cross region access. This means that even if the region set in fs.s3a.endpoint.region is incorrect, the SDK will determine the region. It does this by making the request; if it receives a 301 redirect, it determines the bucket's region at the cost of a HEAD request, and caches it.
  • If no region is set, and none could be determined from the endpoint, S3A will use US_EAST_2 as the default region and enable cross region access. Again, this means that requests will not fail, but an initial HEAD request will be made to get the region of your bucket.
  • If the configured region is an empty string, we fall back to using the SDK region resolution chain.

I don't think we need to explain specifically what the code is doing in too much detail, rather just the end behaviour the user will see.
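
To make the end behaviour in the list above easier to check, here is a minimal sketch of the same decision table in plain Java. It is not the S3A code: the class `RegionResolutionSketch`, the `Resolved` holder, the `SDK_CHAIN` marker and the endpoint regex are hypothetical names and simplifications introduced only for illustration, and the treatment of an empty region combined with the central endpoint is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: mirrors the bullet list, not the real S3A client factory. */
class RegionResolutionSketch {
  static final String CENTRAL_ENDPOINT = "s3.amazonaws.com";
  static final String DEFAULT_REGION = "us-east-2";
  /** Hypothetical marker meaning "let the SDK region resolution chain decide". */
  static final String SDK_CHAIN = "<sdk-region-chain>";
  /** Simplified pattern for endpoints such as s3.eu-west-2.amazonaws.com (assumption). */
  private static final Pattern REGIONAL =
      Pattern.compile("s3[.-]([a-z0-9-]+)\\.amazonaws\\.com$");

  /** Outcome of the resolution: the region to use and whether cross region access is on. */
  static final class Resolved {
    final String region;
    final boolean crossRegion;

    Resolved(String region, boolean crossRegion) {
      this.region = region;
      this.crossRegion = crossRegion;
    }
  }

  /**
   * @param endpoint value of fs.s3a.endpoint, or null if unset
   * @param configuredRegion value of fs.s3a.endpoint.region, or null if unset
   */
  static Resolved resolve(String endpoint, String configuredRegion) {
    boolean central = endpoint != null && endpoint.endsWith(CENTRAL_ENDPOINT);
    if (configuredRegion != null && configuredRegion.isEmpty()) {
      // Empty string: defer to the SDK region resolution chain.
      return new Resolved(SDK_CHAIN, false);
    }
    if (configuredRegion != null) {
      // Configured region wins; the central endpoint additionally enables cross region access.
      return new Resolved(configuredRegion, central);
    }
    if (endpoint != null && !central) {
      // No region configured: try to parse one from a regional endpoint.
      Matcher m = REGIONAL.matcher(endpoint);
      if (m.find()) {
        return new Resolved(m.group(1), false);
      }
    }
    // Nothing configured or parseable: default region plus cross region access.
    return new Resolved(DEFAULT_REGION, true);
  }
}
```

For example, `resolve("s3.amazonaws.com", "eu-west-2")` keeps `eu-west-2` as the region but turns cross region access on, which is the case the central endpoint tests in this PR are meant to exercise.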

Contributor

  • put it in connecting.md, link in index.md as "how to connect"
  • Highlight that it is complex and improving, so may be considered unstable.
  • Link to third_party_stores.html to make clear that anyone working with third-party stores should look there.

}

@Test
public void testCentralEndpointWithEUWest2Region() throws Throwable {
Contributor

the value in having both testCentralEndpointWithUSWest2Region and testCentralEndpointWithEUWest2Region only comes in when your test bucket is in us-west-2, and then with the eu-west-2 test you check if cross region access behaves as expected. since we can't guarantee where users test against, I don't think we need both tests. Unless we end up using one of the public buckets, as for that we know the region.

Contributor

let's go with a public bucket

Contributor Author

public bucket wouldn't work for full CRUD operation? maybe only fs#exists and fs#open followed by some reads would work.

Contributor

yeah that's enough to know if the request is ending up in the right place. if this logic doesn't work properly, even a HEAD will fail. all we really need is a HEAD object I think

Contributor Author

I still wonder, is this not good coverage? regardless of where the bucket really is, since we have two tests overriding the region with us-west-2 and eu-west-2, we ensure that the cross region access case with central endpoint is covered at least once. That's the main consideration for central endpoint tests, wdyt? maybe i can still use a public bucket as an additional test?

Contributor Author

Updated tests: accessing public bucket with central endpoint + same region, and different region, for guaranteed outcome.

LOG.debug("Setting endpoint to {}", endpoint);
} else {
builder.crossRegionAccessEnabled(true);
origin = "origin with cross region access";
Contributor

"central endpoint with cross region access"

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 31s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 39s trunk passed
+1 💚 compile 0m 45s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 36s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 31s trunk passed
+1 💚 mvnsite 0m 42s trunk passed
+1 💚 javadoc 0m 25s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 35s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 17s trunk passed
+1 💚 shadedclient 39m 48s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
-1 ❌ mvninstall 0m 24s /patch-mvninstall-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ compile 0m 22s /patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.
-1 ❌ javac 0m 22s /patch-compile-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.txt hadoop-aws in the patch failed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04.
-1 ❌ compile 0m 24s /patch-compile-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt hadoop-aws in the patch failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08.
-1 ❌ javac 0m 24s /patch-compile-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt hadoop-aws in the patch failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08.
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 21s /buildtool-patch-checkstyle-hadoop-tools_hadoop-aws.txt The patch fails to run checkstyle in hadoop-aws
+1 💚 mvnsite 0m 52s the patch passed
+1 💚 javadoc 0m 17s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
-1 ❌ javadoc 0m 22s /patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_392-8u392-ga-1~20.04-b08.txt hadoop-aws in the patch failed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08.
-1 ❌ spotbugs 0m 23s /patch-spotbugs-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ shadedclient 10m 31s patch has errors when building and testing our client artifacts.
_ Other Tests _
-1 ❌ unit 1m 44s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch passed.
-1 ❌ asflicense 0m 35s /results-asflicense.txt The patch generated 16 ASF License warnings.
109m 26s
Reason Tests
Failed junit tests hadoop.fs.s3a.commit.staging.TestStagingCommitter
hadoop.fs.s3a.audit.TestHttpReferrerAuditHeader
hadoop.fs.s3a.commit.staging.TestDirectoryCommitterScale
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/18/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 936837698a5d 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 12e503e
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/18/testReport/
Max. process+thread count 628 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/18/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 50s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 46m 26s trunk passed
+1 💚 compile 0m 41s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 33s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 31s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 javadoc 0m 26s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 7s trunk passed
+1 💚 shadedclient 37m 47s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 34s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 34s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 21s the patch passed
+1 💚 mvnsite 0m 32s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 23s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 7s the patch passed
+1 💚 shadedclient 38m 21s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 55s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 33s The patch does not generate ASF License warnings.
139m 14s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/17/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 75afb51b88c9 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 44760ca
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/17/testReport/
Max. process+thread count 525 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/17/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@steveloughran left a comment

some minor changes.

the test suite is going to blow up when testing with third party store credentials; I've not done that for a while but I think the policy there is "don't worry"


## Connecting to Amazon S3 or a third-party store

See [Connecting to an Amazon S3 Bucket through the S3A Connector](connecting.md).
Contributor

that should end with .html (I know, not your change. mine)

Contributor Author

done


try (FSDataInputStream in = newFS.open(srcFilePath)) {
Assertions
.assertThat(in.read(buffer, 0, 3))
Contributor

use readFully() and skip the assert. read() can return 0-3 so brittle on slow networks
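
A small sketch of why the assert on `read()` is brittle, using only `java.io`. The `trickle` helper is a hypothetical stand-in for a slow network stream that hands back one byte per call; it is not part of the test under review. `FSDataInputStream` extends `DataInputStream`, so the same `readFully(byte[], int, int)` call is available there.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative only: contrasts a single read() with readFully() on a slow stream. */
class ReadFullySketch {

  /** Hypothetical slow stream: each bulk read returns at most one byte. */
  static InputStream trickle(byte[] data) {
    return new FilterInputStream(new ByteArrayInputStream(data)) {
      @Override
      public int read(byte[] b, int off, int len) throws IOException {
        return super.read(b, off, Math.min(len, 1));
      }
    };
  }

  /** A single read() call: may legitimately return fewer bytes than requested. */
  static int singleRead(byte[] data, byte[] buffer) throws IOException {
    try (InputStream in = trickle(data)) {
      return in.read(buffer, 0, buffer.length);
    }
  }

  /** readFully() loops until the buffer is filled, or throws EOFException if it cannot. */
  static void fullRead(byte[] data, byte[] buffer) throws IOException {
    try (DataInputStream in = new DataInputStream(trickle(data))) {
      in.readFully(buffer, 0, buffer.length);
    }
  }
}
```

With the trickling stream, an assertion that `read(buffer, 0, 3)` returns 3 would fail even though the data is intact, while `readFully` either fills the buffer or raises an exception, so no separate assert on the byte count is needed.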

Contributor Author

done

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 33s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 41m 55s trunk passed
+1 💚 compile 0m 40s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 0m 35s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 0m 31s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 javadoc 0m 27s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 34s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 7s trunk passed
+1 💚 shadedclient 32m 34s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 29s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 0m 33s the patch passed
+1 💚 compile 0m 26s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 0m 26s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 19s the patch passed
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 javadoc 0m 15s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 25s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 1m 6s the patch passed
+1 💚 shadedclient 32m 5s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 3m 8s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 36s The patch does not generate ASF License warnings.
123m 7s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/19/artifact/out/Dockerfile
GITHUB PR #6479
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 310c010b7df3 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 09ff933
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/19/testReport/
Max. process+thread count 562 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6479/19/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@ahmarsuhail left a comment

Thanks @virajjasani . +1, LGTM.

@steveloughran steveloughran merged commit d278b34 into apache:trunk Feb 2, 2024
@steveloughran
Contributor

merged to trunk; if someone can do a cherrypick and retest to branch-3.4 I'll merge that PR too. For now I want both branches up to date

@virajjasani
Contributor Author

branch-3.4 backport PR: #6524
