Skip to content

Conversation

@jimmy-zuber-amzn
Copy link
Contributor

S3AFS does not support symlinks, so attempting to resolve
symlinks in globStatus causes wasted S3 calls and worse
performance. Removing it will speed up some calls to
globStatus.

JIRA link: https://issues.apache.org/jira/browse/HADOOP-17105

@jimmy-zuber-amzn
Copy link
Contributor Author

Looks like the javadoc failures would be fixed by -> #2094

S3AFS does not support symlinks, so attempting to resolve
symlinks in globStatus causes wasted S3 calls and worse
performance. Removing it will speed up some calls to
globStatus.

JIRA link: https://issues.apache.org/jira/browse/HADOOP-17105
@jimmy-zuber-amzn
Copy link
Contributor Author

Revision 2: Fixed checkstyle warnings on line length

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 33s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 1 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 20m 16s trunk passed
+1 💚 compile 0m 40s trunk passed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04
+1 💚 compile 0m 36s trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 checkstyle 0m 27s trunk passed
+1 💚 mvnsite 0m 41s trunk passed
+1 💚 shadedclient 14m 50s branch has no errors when building and testing our client artifacts.
-1 ❌ javadoc 0m 34s hadoop-aws in trunk failed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.
+1 💚 javadoc 0m 28s trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+0 🆗 spotbugs 0m 58s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 0m 57s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 36s the patch passed
+1 💚 compile 0m 34s the patch passed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04
+1 💚 javac 0m 34s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 javac 0m 29s the patch passed
+1 💚 checkstyle 0m 20s the patch passed
+1 💚 mvnsite 0m 33s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 13m 48s patch has no errors when building and testing our client artifacts.
-1 ❌ javadoc 0m 28s hadoop-aws in the patch failed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.
+1 💚 javadoc 0m 27s the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 findbugs 1m 2s the patch passed
_ Other Tests _
+1 💚 unit 1m 21s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 32s The patch does not generate ASF License warnings.
61m 59s
Subsystem Report/Notes
Docker ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-2113/2/artifact/out/Dockerfile
GITHUB PR #2113
JIRA Issue HADOOP-17105
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux d9dfb535b6b2 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 04abd0e
Default Java Private Build-1.8.0_252-8u252-b09-1~18.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_252-8u252-b09-1~18.04-b09
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-2113/2/artifact/out/branch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.txt
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-2113/2/artifact/out/patch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-2113/2/testReport/
Max. process+thread count 457 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-2113/2/console
versions git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran steveloughran self-requested a review July 3, 2020 10:15
@apache apache deleted a comment from hadoop-yetus Jul 3, 2020
@steveloughran
Copy link
Contributor

patch LGTM. Which endpoint (e.g us-west-2) and what build CLI options did you use?

we don't need that much detail, though if tests are failing that's good to call out so you can get some assistance debugging. e.g

#2076 (comment)

@steveloughran
Copy link
Contributor

oh, and +1 pending that s3 endpoint declaration.

}

@Test
public void testCostOfGlobStatusNoSymlinkResolution() throws Throwable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both these tests look almost the same in terms operation counts and also the symlink resolution is always disabled.
Can you please tell me why do we need two tests here? Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So these tests two different things, a directory with multiple objects and a directory with one object. The directory with a single object is the special case that triggers attempted symlink resolution, so I wanted to carve out that special case in its own test.

The multiple-objects-in-a-directory test is a general test that it felt like globStatus should have, whereas the second one was specifically made to catch the regression.

If you don't think that the multiple objects test is justified, I can remove it. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jimmy-zuber-amzn
Copy link
Contributor Author

patch LGTM. Which endpoint (e.g us-west-2) and what build CLI options did you use?

we don't need that much detail, though if tests are failing that's good to call out so you can get some assistance debugging. e.g

#2076 (comment)

I ran with the following test settings, no tests failed by the way:

  • Region: us-west-2
  • Commands
    • w/ S3Guard: mvn -T 1C verify -Dparallel-tests -DtestsThreadCount=6 -Ds3guard -Ddynamo -Dauth
    • without: mvn clean verify -Dparallel-tests

@steveloughran steveloughran merged commit 806d84b into apache:trunk Jul 13, 2020
@steveloughran
Copy link
Contributor

+1 -merged to trunk. nice catch

asfgit pushed a commit that referenced this pull request Jul 13, 2020
#2113)

Contributed by Jimmy Zuber.

Change-Id: I2f247c2d2ab4f38214073e55f5cfbaa15aeaeb11
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
apache#2113)

Contributed by Jimmy Zuber.

Change-Id: Iad87ee5f9dd0e88af96088f592f22fde8768beb5
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants