steveloughran
Contributor

@steveloughran steveloughran commented May 2, 2024

HADOOP-19161

Initial design

  • no tests or docs
  • served up via StoreContext. Not sure about the merits of that: I think it is needed so it gets down to all AbstractStoreOperation instances, but should that be where the decision is made?
  • create performance is wired up.
  • as is path capabilities

For testing we need to make sure this is unset from all cost tests.

relates to #6543; the logic to set up that operation is here...that PR would
just be the implementation.

Same for a delete optimisation where we'd skip parent dir probe.
rename could do the same for its source dir too.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@steveloughran
Contributor Author

note the commented out bit where we considered adding options like "hive" or "spark".

@HarshitGupta11 and I discussed this; for now let's go with a list of options and "*"

Comment on lines 151 to 154
/* case "hive":
case "impala":
case "spark":
case "distcp":
Contributor

Should we not let downstreamers decide what flags they want (after extensive testing)? And across different releases, they might need different flags to be turned on (in case of any regression)?

We can just recommend the flags (as already commented out here) but not set the flags for them. Thoughts?

Contributor Author

Harshit and I were discussing this. I think it's best to have that option list, as app settings could be too brittle when things change.

Comment on lines 76 to 78
public boolean isDelete() {
return delete;
}
Contributor

@virajjasani virajjasani May 2, 2024

We also want to tackle this one as a separate task (after HADOOP-19072), correct?

Contributor Author

Yes. Harshit did an experiment where he turned off all attempts at creating parent dirs after delete. Fairly brittle, I think.

@steveloughran
Contributor Author

I have a better design for this. Changing this to draft.

Proposed: we have a Configuration.getEnumOptions(Enum x, boolean failIfUnknown) which returns an EnumSet of all values of the enum class whose valueOf() matches an entry in the CSV list (with some mapping such as case conversion, and mapping "-" and "." to "_").

this makes it trivial to reuse/process. the implementation would be outside the actual Configuration class to make it easy for AbfsConfiguration to use too
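
A minimal sketch of the kind of CSV-to-EnumSet parsing proposed above, not the code in this PR. The class name `EnumCsvParser`, the `Flag` enum and the error message text are hypothetical. The boolean is modelled as `ignoreUnknown` (an assumption, chosen so that passing `false` raises an exception), and "*" is assumed to mean "all values".

```java
import java.util.EnumSet;
import java.util.Locale;

/** Hypothetical helper for illustration; not the class added by the PR. */
final class EnumCsvParser {

  /** Hypothetical example enum. */
  enum Flag { CREATE, DELETE, MKDIR_FAST }

  private EnumCsvParser() { }

  /**
   * Parse a comma-separated list into an EnumSet.
   * Entries are trimmed, upper-cased, and have '-' and '.' mapped to '_'
   * before being matched against the enum constant names.
   * @param key configuration key, used only in error messages
   * @param csv comma-separated values; null yields the empty set
   * @param enumClass enum type to resolve entries against
   * @param ignoreUnknown if true, unknown entries are skipped; otherwise they fail
   */
  static <E extends Enum<E>> EnumSet<E> parseEnumSet(
      String key, String csv, Class<E> enumClass, boolean ignoreUnknown) {
    EnumSet<E> result = EnumSet.noneOf(enumClass);
    if (csv == null) {
      return result;
    }
    for (String raw : csv.split(",")) {
      String entry = raw.trim();
      if (entry.isEmpty()) {
        continue;
      }
      if ("*".equals(entry)) {
        // assumption: "*" enables every value of the enum
        result.addAll(EnumSet.allOf(enumClass));
        continue;
      }
      String name = entry.toUpperCase(Locale.ROOT)
          .replace('-', '_')
          .replace('.', '_');
      try {
        result.add(Enum.valueOf(enumClass, name));
      } catch (IllegalArgumentException e) {
        if (!ignoreUnknown) {
          throw new IllegalArgumentException(
              "Unknown value \"" + entry + "\" in option " + key, e);
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(
        parseEnumSet("example.flags", "create, mkdir-fast", Flag.class, false));
    System.out.println(parseEnumSet("example.flags", "*", Flag.class, false));
  }
}
```

Keeping the parser as a static helper rather than a method on Configuration is what would let another connector's configuration class call it directly.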

@steveloughran steveloughran marked this pull request as draft May 13, 2024 17:26
@steveloughran steveloughran force-pushed the s3/HADOOP-19161-performance-flags branch from feb384f to 28cd50a Compare May 15, 2024 18:27
@steveloughran
Contributor Author

Reason: Use hadoop-common provided Sets rather than Guava provided Sets
	in file: org/apache/hadoop/util/ConfigurationUtil.java
		org.apache.hadoop.thirdparty.com.google.common.collect.Sets 	(Line: 33, Matched by: org.apache.hadoop.thirdparty.com.google.common.collect.Sets)

@steveloughran steveloughran force-pushed the s3/HADOOP-19161-performance-flags branch from 28cd50a to f3571ac Compare May 23, 2024 14:13
@apache apache deleted a comment from hadoop-yetus May 29, 2024
@steveloughran steveloughran force-pushed the s3/HADOOP-19161-performance-flags branch from a564aa4 to 82974c4 Compare May 29, 2024 12:51
@apache apache deleted a comment from hadoop-yetus Jun 1, 2024
@steveloughran steveloughran marked this pull request as ready for review June 4, 2024 10:36
@steveloughran
Contributor Author

@mukund-thakur @saxenapranav @HarshitGupta11 @ahmarsuhail @virajjasani

FYI, this is ready to review. Once in, viraj's mkdirs option can go in, and then we can document this option in performance.md

ABFS can use it for their options

@steveloughran steveloughran force-pushed the s3/HADOOP-19161-performance-flags branch 2 times, most recently from 206bfab to af2dd8d Compare June 10, 2024 18:50
@steveloughran
Contributor Author

OK, I'm happy with this.
@saxenapranav can you look at this? This flagset is what I hope to use for a similar set of options in ABFS, which is why the parser lives outside of Configuration.

@mukund-thakur @virajjasani thoughts?

@virajjasani
Contributor

Sounds good, I will take a detailed look at ConfigurationHelper and FlagSet soon.

@steveloughran
Contributor Author

realised we need one more thing: a way to enumerate all enabled flags.
This is to make the bucketInfo list of flags easy to self-generate
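
As a rough illustration of enumerating the enabled flags, the sketch below turns an EnumSet into a list of prefixed, lower-cased names. The class `FlagListing`, the `PerfFlag` enum and the prefix string are all hypothetical; the real FlagSet API may differ.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration only. */
final class FlagListing {

  /** Hypothetical example enum. */
  enum PerfFlag { CREATE, DELETE, MKDIR }

  private FlagListing() { }

  /**
   * List every enabled flag as prefix + lower-case enum name.
   * EnumSet iterates in declaration order, so the result is stable.
   */
  static <E extends Enum<E>> List<String> enabledNames(
      String prefix, EnumSet<E> flags) {
    return flags.stream()
        .map(f -> prefix + f.name().toLowerCase(Locale.ROOT))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    EnumSet<PerfFlag> flags = EnumSet.of(PerfFlag.MKDIR, PerfFlag.CREATE);
    // the prefix here is an invented placeholder, not a real option name
    enabledNames("example.performance.flags.", flags)
        .forEach(System.out::println);
  }
}
```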

@steveloughran steveloughran force-pushed the s3/HADOOP-19161-performance-flags branch from d9cb6f9 to 64319c7 Compare June 20, 2024 20:04
Contributor

@virajjasani virajjasani left a comment

Left a few nits; overall the changes look solid.

Comment on lines +91 to +111
intercept(IllegalArgumentException.class, "unrecognized", () ->
parseEnumSet("key", "c, unrecognized", SimpleEnum.class, false));
Contributor

How about intercepting the util method call instead of direct call to parseEnumSet:
intercept(IllegalArgumentException.class, "unrecognized", () -> assertEnumParse("c, unrecognized", SimpleEnum.class, false))?

Contributor Author

let me do that as a separate test

sb.append(", partSize=").append(partSize);
sb.append(", enableMultiObjectsDelete=").append(enableMultiObjectsDelete);
sb.append(", maxKeys=").append(maxKeys);
sb.append(", ").append(performanceFlags);
Contributor

The key "performanceFlags=" is missing here.

Comment on lines +104 to +106
public boolean enabled(final E flag) {
return flags.contains(flag);
}
Contributor

Similar to how we check for mutability of flags while enabling/disabling a flag, I wonder if we should check for immutability while checking for enabled()? Probably not, but then how would a client know when to set immutable? Or maybe it's fine to not set immutability ever?

Contributor

If we need it immutable, it's probably a good case for a builder or constructor, so maybe keeping the object mutable is also fine?

Contributor Author

I started off fully mutable, then added immutability as an option to avoid all sync issues. You can manipulate it as much as you want, and only once it is made immutable do you need to worry about thread safety etc. It lets you build by copying and updating, from a default value with changes, etc.
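
The "mutable until frozen" pattern described here could look roughly like the sketch below. `FreezableFlags` and the `Perf` enum are hypothetical names, and the method set is an assumption for illustration rather than the FlagSet class under review.

```java
import java.util.EnumSet;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical flag holder: freely mutable until makeImmutable() is called. */
final class FreezableFlags<E extends Enum<E>> {

  /** Hypothetical example enum. */
  enum Perf { CREATE, DELETE }

  private final Class<E> enumClass;
  private final EnumSet<E> flags;
  private final AtomicBoolean immutable = new AtomicBoolean(false);

  FreezableFlags(Class<E> enumClass, EnumSet<E> initial) {
    this.enumClass = Objects.requireNonNull(enumClass);
    // defensive copy so later changes to 'initial' do not leak in
    this.flags = EnumSet.copyOf(Objects.requireNonNull(initial));
  }

  /** Reads need no mutability check. */
  boolean enabled(E flag) {
    return flags.contains(flag);
  }

  /** Enable or disable a flag; rejected once the set is frozen. */
  void set(E flag, boolean state) {
    if (immutable.get()) {
      throw new IllegalStateException("flag set is immutable");
    }
    if (state) {
      flags.add(flag);
    } else {
      flags.remove(flag);
    }
  }

  /** Freeze the set; after this it is never written again. */
  void makeImmutable() {
    immutable.set(true);
  }

  boolean isImmutable() {
    return immutable.get();
  }

  /** A new, mutable copy with the same flags. */
  FreezableFlags<E> copy() {
    return new FreezableFlags<>(enumClass, flags);
  }

  public static void main(String[] args) {
    FreezableFlags<Perf> defaults =
        new FreezableFlags<>(Perf.class, EnumSet.of(Perf.CREATE));
    defaults.makeImmutable();
    FreezableFlags<Perf> custom = defaults.copy();  // build by copy-and-update
    custom.set(Perf.DELETE, true);
    System.out.println(custom.enabled(Perf.DELETE));
  }
}
```

In this sketch the set is meant to be built up by one thread and frozen before it is shared; publishing the frozen object safely to other threads is still the caller's job.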

@apache apache deleted a comment from hadoop-yetus Jul 8, 2024
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 30s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 11 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 44s Maven dependency ordering for branch
+1 💚 mvninstall 32m 21s trunk passed
+1 💚 compile 17m 21s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 16m 7s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 checkstyle 4m 30s trunk passed
+1 💚 mvnsite 2m 41s trunk passed
+1 💚 javadoc 1m 57s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 44s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 3m 57s trunk passed
+1 💚 shadedclient 34m 20s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for patch
+1 💚 mvninstall 1m 28s the patch passed
+1 💚 compile 16m 42s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 16m 42s the patch passed
+1 💚 compile 15m 54s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 15m 54s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 4m 25s /results-checkstyle-root.txt root: The patch generated 2 new + 116 unchanged - 0 fixed = 118 total (was 116)
+1 💚 mvnsite 2m 40s the patch passed
+1 💚 javadoc 1m 51s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 45s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 4m 20s the patch passed
+1 💚 shadedclient 34m 59s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 48s hadoop-common in the patch passed.
+1 💚 unit 2m 53s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 6s The patch does not generate ASF License warnings.
242m 55s
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/14/artifact/out/Dockerfile
GITHUB PR #6789
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 937d04fdd13c 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 4b5f985
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/14/testReport/
Max. process+thread count 3152 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/14/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 17m 39s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 11 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 19s Maven dependency ordering for branch
+1 💚 mvninstall 37m 35s trunk passed
+1 💚 compile 18m 42s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 17m 47s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 checkstyle 4m 36s trunk passed
+1 💚 mvnsite 2m 33s trunk passed
+1 💚 javadoc 1m 48s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 36s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 3m 45s trunk passed
+1 💚 shadedclient 39m 47s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 31s Maven dependency ordering for patch
+1 💚 mvninstall 1m 25s the patch passed
+1 💚 compile 18m 50s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 18m 50s the patch passed
+1 💚 compile 18m 10s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 18m 10s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 5m 22s /results-checkstyle-root.txt root: The patch generated 2 new + 116 unchanged - 0 fixed = 118 total (was 116)
+1 💚 mvnsite 2m 43s the patch passed
+1 💚 javadoc 1m 45s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 43s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 4m 43s the patch passed
+1 💚 shadedclient 43m 4s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 20m 4s hadoop-common in the patch passed.
+1 💚 unit 2m 43s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 59s The patch does not generate ASF License warnings.
285m 41s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/15/artifact/out/Dockerfile
GITHUB PR #6789
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 8c687f3b19c4 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 4b5f985
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/15/testReport/
Max. process+thread count 3137 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/15/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor Author

./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestUtils.java:997:        .setPerformanceFlags(FlagSet.createFlagSet(PerformanceFlagEnum.class,FS_S3A_PERFORMANCE_FLAGS)): Line is longer than 100 characters (found 103). [LineLength]
./hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestUtils.java:997:        .setPerformanceFlags(FlagSet.createFlagSet(PerformanceFlagEnum.class,FS_S3A_PERFORMANCE_FLAGS)):77: ',' is not followed by whitespace. [WhitespaceAfter]

Change-Id: Ie28b4ab01881cfe92fb622d4d27b7bdad8a690fd
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 30s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 11 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 57s Maven dependency ordering for branch
+1 💚 mvninstall 32m 24s trunk passed
+1 💚 compile 17m 24s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 16m 10s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 checkstyle 4m 28s trunk passed
+1 💚 mvnsite 2m 42s trunk passed
+1 💚 javadoc 1m 56s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 44s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 3m 53s trunk passed
+1 💚 shadedclient 34m 23s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 34s Maven dependency ordering for patch
+1 💚 mvninstall 1m 28s the patch passed
+1 💚 compile 16m 55s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 16m 55s the patch passed
+1 💚 compile 16m 9s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 16m 9s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 4m 29s /results-checkstyle-root.txt root: The patch generated 1 new + 116 unchanged - 0 fixed = 117 total (was 116)
+1 💚 mvnsite 2m 38s the patch passed
+1 💚 javadoc 1m 51s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 46s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 4m 15s the patch passed
+1 💚 shadedclient 34m 21s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 49s hadoop-common in the patch passed.
+1 💚 unit 2m 55s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 5s The patch does not generate ASF License warnings.
242m 57s
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/16/artifact/out/Dockerfile
GITHUB PR #6789
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 7a1904c6c487 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / c216890
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/16/testReport/
Max. process+thread count 1379 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/16/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Change-Id: Ifa8941acbcc84be30a997b6097947c302158f583
Contributor

@mukund-thakur mukund-thakur left a comment

Overall I like the design and implementation. Just need one clarification about the immutability comment

* trim the list, map to enum values in the message (case insensitive)
* and return the set.
* Special handling of "*" meaning: all values.
* @param key key for error messages.
Contributor

don't see much use of key param.

Contributor

Okay, I get it now. Maybe mention: configuration key which was used to configure the flags.
I got confused initially because of the UT.

}

@Test
public void testStarEnum() throws Throwable {
Contributor

This test is already present above: testEnumParseAll.

}

@Test
public void testUnknownStarEnum() throws Throwable {
Contributor

add a test with repeated values "a, b, a". should pass
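
A small sketch of why repeated values should pass: an EnumSet is a set, so adding the same constant twice is a no-op. `SimpleEnum` borrows the name from the quoted test, but its values (A, B, C) and the `parse` helper are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Locale;

/** Hypothetical demo of duplicate entries collapsing in an EnumSet. */
final class RepeatedValues {

  /** Assumed shape of the test enum; the real one may differ. */
  enum SimpleEnum { A, B, C }

  private RepeatedValues() { }

  /** Trim, upper-case and collect each CSV entry into an EnumSet. */
  static EnumSet<SimpleEnum> parse(String csv) {
    EnumSet<SimpleEnum> set = EnumSet.noneOf(SimpleEnum.class);
    Arrays.stream(csv.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .map(s -> SimpleEnum.valueOf(s.toUpperCase(Locale.ROOT)))
        .forEach(set::add);
    return set;
  }

  public static void main(String[] args) {
    System.out.println(parse("a, b, a"));  // prints [A, B]
  }
}
```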

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 29s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 11 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 56s Maven dependency ordering for branch
+1 💚 mvninstall 32m 9s trunk passed
+1 💚 compile 16m 50s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 15m 32s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 checkstyle 4m 17s trunk passed
+1 💚 mvnsite 2m 32s trunk passed
+1 💚 javadoc 1m 48s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 37s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 3m 46s trunk passed
+1 💚 shadedclient 34m 18s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 32s Maven dependency ordering for patch
+1 💚 mvninstall 1m 22s the patch passed
+1 💚 compile 16m 21s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 16m 21s the patch passed
+1 💚 compile 15m 23s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 15m 23s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 14s the patch passed
+1 💚 mvnsite 2m 30s the patch passed
+1 💚 javadoc 1m 43s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 38s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 4m 9s the patch passed
+1 💚 shadedclient 33m 59s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 22s hadoop-common in the patch passed.
+1 💚 unit 2m 49s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 0s The patch does not generate ASF License warnings.
236m 55s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/17/artifact/out/Dockerfile
GITHUB PR #6789
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 7722d8fce459 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / d55ffeb
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/17/testReport/
Max. process+thread count 1309 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/17/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor Author

@mukund-thakur thanks, altered the tests and tried to clarify the immutability.

One final thought: should we have a default list of enum values to set if, when parsing the config string, the string isn't found?

Currently if nothing is found we return the empty set.

* Mukund's feedback
* Remove unused "throws Throwable" on test cases as appropriate

Save enumClass constructor parameter of FlagSet
* add accessor
* compare in equals
* use in a copy() operation which creates a mutable deep copy
  of an object.
* explicit checks for null enumclass, prefix in constructor

Change-Id: I8080d772bfd301ab6e3ed802d54a060a63feea78
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 29s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 11 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 31s Maven dependency ordering for branch
+1 💚 mvninstall 32m 9s trunk passed
+1 💚 compile 16m 59s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 compile 15m 48s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 checkstyle 4m 17s trunk passed
+1 💚 mvnsite 2m 30s trunk passed
+1 💚 javadoc 1m 48s trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 37s trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 3m 46s trunk passed
+1 💚 shadedclient 33m 54s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for patch
+1 💚 mvninstall 1m 23s the patch passed
+1 💚 compile 16m 19s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javac 16m 19s the patch passed
+1 💚 compile 15m 25s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 javac 15m 25s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 4m 6s the patch passed
+1 💚 mvnsite 2m 30s the patch passed
+1 💚 javadoc 1m 45s the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2
+1 💚 javadoc 1m 34s the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
+1 💚 spotbugs 4m 8s the patch passed
+1 💚 shadedclient 34m 30s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 19m 19s hadoop-common in the patch passed.
+1 💚 unit 2m 48s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 0s The patch does not generate ASF License warnings.
236m 56s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/18/artifact/out/Dockerfile
GITHUB PR #6789
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint
uname Linux 4c294b2bd8b6 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / f330597
Default Java Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/18/testReport/
Max. process+thread count 2014 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/18/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@mukund-thakur mukund-thakur left a comment

LGTM +1

@mukund-thakur
Contributor

> @mukund-thakur thanks, altered the tests and tried to clarify the immutability.
>
> One final thought: should we have a default list of enum values to set if, when parsing the config string, the string isn't found?
>
> Currently if nothing is found we return the empty set.

I am good with an empty set. What default value are you thinking of? create is already getting added based on the create perf flag explicitly.

Contributor

@virajjasani virajjasani left a comment

minor nits, looks good to go

*/
private void assertHasCapability(final String capability) {
Assertions.assertThat(flagSet.hasCapability(capability))
.describedAs("Capabiilty of %s on %s", capability, flagSet)
Contributor

nit: Capability

*/
private void assertLacksCapability(final String capability) {
Assertions.assertThat(flagSet.hasCapability(capability))
.describedAs("Capabiilty of %s on %s", capability, flagSet)
Contributor

same here


| *Option* | *Meaning* | Since |
|----------|--------------------|:------|
| `create` | Create Performance | 3.4.1 |
Contributor

nit: shall we mention 3.4.1 / 3.5.0?

Contributor Author

no

Comment on lines +290 to +291
That is a complicated list of options which deliver speed if the person setting them
understands the risks.
Contributor

nit: just in case it reads better: "if the person or client application setting them"

Contributor Author

no, it's an individual who takes the blame

Change-Id: Id49ceae8badaaa80e35dbe62c75a7d6d107449c1
@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 19s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 🆗 | markdownlint | 0m 1s | | markdownlint was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 11 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +0 🆗 | mvndep | 14m 16s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 20m 10s | | trunk passed |
| +1 💚 | compile | 9m 0s | | trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 💚 | compile | 8m 5s | | trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 💚 | checkstyle | 2m 6s | | trunk passed |
| +1 💚 | mvnsite | 1m 35s | | trunk passed |
| +1 💚 | javadoc | 1m 13s | | trunk passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 💚 | javadoc | 1m 9s | | trunk passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 💚 | spotbugs | 2m 27s | | trunk passed |
| +1 💚 | shadedclient | 21m 8s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 20s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 0m 51s | | the patch passed |
| +1 💚 | compile | 8m 35s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 💚 | javac | 8m 35s | | the patch passed |
| +1 💚 | compile | 8m 7s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 💚 | javac | 8m 7s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 2m 4s | | the patch passed |
| +1 💚 | mvnsite | 1m 36s | | the patch passed |
| +1 💚 | javadoc | 1m 12s | | the patch passed with JDK Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 |
| +1 💚 | javadoc | 1m 9s | | the patch passed with JDK Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| +1 💚 | spotbugs | 2m 35s | | the patch passed |
| +1 💚 | shadedclient | 20m 46s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 16m 37s | | hadoop-common in the patch passed. |
| +1 💚 | unit | 2m 10s | | hadoop-aws in the patch passed. |
| +1 💚 | asflicense | 0m 44s | | The patch does not generate ASF License warnings. |
| | | 150m 45s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/19/artifact/out/Dockerfile |
| GITHUB PR | #6789 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets markdownlint |
| uname | Linux 0378336d9b15 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / ce8dac0 |
| Default Java | Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.23+9-post-Ubuntu-1ubuntu120.04.2 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_412-8u412-ga-1~20.04.1-b08 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/19/testReport/ |
| Max. process+thread count | 1282 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6789/19/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@steveloughran
Contributor Author

ok, everything is addressed, going to merge to the branches

@steveloughran steveloughran merged commit a5806a9 into apache:trunk Jul 29, 2024
steveloughran added a commit to steveloughran/hadoop that referenced this pull request Jul 29, 2024
…performance flags (apache#6789)



1. Configuration adds new method `getEnumSet()` to get a set of enum values from
   a configuration string.
   <E extends Enum<E>> EnumSet<E> getEnumSet(String key, Class<E> enumClass, boolean ignoreUnknown)

   Whitespace is ignored, case is ignored and the value "*" is mapped to "all values of the enum".
   If "ignoreUnknown" is true then when parsing, unknown values are ignored.
   This is recommended for forward compatibility with later versions.

2. This support is implemented in org.apache.hadoop.fs.s3a.impl.ConfigurationHelper; it can be used
    elsewhere in the hadoop codebase.

3. A new private FlagSet class in hadoop common manages a set of enum flags.

     It implements StreamCapabilities and can be probed for a specific option being set
    (with a prefix)


S3A adds an option fs.s3a.performance.flags which builds a FlagSet with enum
type PerformanceFlagEnum

* which initially contains {Create, Delete, Mkdir, Open}
* the existing fs.s3a.create.performance option sets the flag "Create".
* tests which configure fs.s3a.create.performance MUST clear
  fs.s3a.performance.flags in test setup.
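
As a rough illustration of how a flag set might answer capability probes "with a prefix", here is a standalone sketch. `SimpleFlagSet` is a hypothetical stand-in, not the private `FlagSet` class, and the capability string format (prefix plus lower-case flag name, with a trailing dot on the prefix) is an assumption.

```java
import java.util.EnumSet;

public class FlagSetSketch {

  /** Hypothetical enum mirroring the flag names listed above. */
  public enum PerformanceFlag { Create, Delete, Mkdir, Open }

  /** Hypothetical stand-in for a flag set; not the Hadoop FlagSet class. */
  public static final class SimpleFlagSet<E extends Enum<E>> {
    private final String prefix;
    private final EnumSet<E> flags;

    public SimpleFlagSet(String prefix, EnumSet<E> flags) {
      this.prefix = prefix;
      // Defensive copy so later changes by the caller do not leak in.
      this.flags = EnumSet.copyOf(flags);
    }

    /** Direct check of a flag by enum value. */
    public boolean enabled(E flag) {
      return flags.contains(flag);
    }

    /**
     * Capability probe: true only if the string is the prefix followed by
     * the (case-insensitive) name of a flag that is currently set.
     */
    public boolean hasCapability(String capability) {
      if (capability == null || !capability.startsWith(prefix)) {
        return false;
      }
      String name = capability.substring(prefix.length());
      for (E flag : flags) {
        if (flag.name().equalsIgnoreCase(name)) {
          return true;
        }
      }
      return false;
    }
  }

  public static void main(String[] args) {
    // Assumed prefix format: option name plus a trailing dot.
    SimpleFlagSet<PerformanceFlag> flags = new SimpleFlagSet<>(
        "fs.s3a.performance.flags.", EnumSet.of(PerformanceFlag.Create));
    System.out.println(flags.hasCapability("fs.s3a.performance.flags.create"));
    System.out.println(flags.hasCapability("fs.s3a.performance.flags.delete"));
  }
}
```

With only `Create` set, the first probe succeeds and the second fails. A test that depends on one flag can therefore be affected by whatever else the flag set was built from, which is why the setup needs to clear the option.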

Future performance flags are planned, with different levels of safety
and/or backwards compatibility.

Contributed by Steve Loughran
steveloughran added a commit that referenced this pull request Jul 29, 2024
…performance flags (#6789) (#6966)


KeeProMise pushed a commit to KeeProMise/hadoop that referenced this pull request Sep 9, 2024
…performance flags (apache#6789)



Hexiaoqiao pushed a commit to Hexiaoqiao/hadoop that referenced this pull request Sep 12, 2024
…performance flags (apache#6789)


