8315789: Minor HexFormat performance improvements #15591
    if (value < 10) {
        return (char)('0' + value);
    }
    if (digitCase == Case.LOWERCASE) {
        return (char)('a' - 10 + value);
    }
    return (char)('A' - 10 + value);
Not sure if this would adversely impact performance, but what about factoring these lines out into a private method? They are repeated in toHighHexDigit.
Surprisingly this does carry some cost in the microbenchmarks on my M1. Might be noise, but it seems rather consistent so I'll need to investigate.
This:
diff --git a/src/java.base/share/classes/java/util/HexFormat.java b/src/java.base/share/classes/java/util/HexFormat.java
index 107c362cbc2..177548c03f7 100644
--- a/src/java.base/share/classes/java/util/HexFormat.java
+++ b/src/java.base/share/classes/java/util/HexFormat.java
@@ -634,14 +634,7 @@ private static String escapeNL(String string) {
      * @return the hexadecimal character for the low 4 bits {@code 0-3} of the value
      */
     public char toLowHexDigit(int value) {
-        value = value & 0xf;
-        if (value < 10) {
-            return (char)('0' + value);
-        }
-        if (digitCase == Case.LOWERCASE) {
-            return (char)('a' - 10 + value);
-        }
-        return (char)('A' - 10 + value);
+        return toHexDigit(value & 0xf);
     }
 
     /**
@@ -655,7 +648,10 @@ public char toLowHexDigit(int value) {
      * @return the hexadecimal character for the bits {@code 4-7} of the value
      */
     public char toHighHexDigit(int value) {
-        value = (value >> 4) & 0xf;
+        return toHexDigit((value >> 4) & 0xf);
+    }
+
+    private char toHexDigit(int value) {
         if (value < 10) {
             return (char)('0' + value);
         }
.. clearly increases cost across all micros:
Name                                Cnt   Base   Error    Test   Error   Unit   Diff%
HexFormatBench.appenderLower         15  1,046 ± 0,041   1,301 ± 0,017  us/op  -24,4% (p = 0,000*)
HexFormatBench.appenderLowerCached   15  1,056 ± 0,055   1,175 ± 0,115  us/op  -11,3% (p = 0,001*)
HexFormatBench.appenderUpper         15  1,059 ± 0,055   1,303 ± 0,012  us/op  -23,0% (p = 0,000*)
HexFormatBench.appenderUpperCached   15  1,099 ± 0,014   1,451 ± 0,267  us/op  -32,0% (p = 0,000*)
HexFormatBench.toHexLower            15  0,322 ± 0,002   0,338 ± 0,005  us/op   -4,8% (p = 0,000*)
HexFormatBench.toHexLowerCached      15  0,324 ± 0,003   0,411 ± 0,005  us/op  -27,0% (p = 0,000*)
HexFormatBench.toHexUpper            15  0,324 ± 0,003   0,340 ± 0,003  us/op   -4,9% (p = 0,000*)
HexFormatBench.toHexUpperCached      15  0,322 ± 0,001   0,411 ± 0,004  us/op  -27,6% (p = 0,000*)
    private final Case digitCase;

    private enum Case {
        LOWERCASE,
        UPPERCASE
    }
Instead of an enum, cache the "A" or "a" for direct use:
    private final char caseBase;
Initialize it in the constructor.
    if (digitCase == Case.LOWERCASE) {
        return (char)('a' - 10 + value);
    }
    return (char)('A' - 10 + value);
Would caching the upper/lower case base avoid a branch?
-    if (digitCase == Case.LOWERCASE) {
-        return (char)('a' - 10 + value);
-    }
-    return (char)('A' - 10 + value);
+    return (char)(caseBase - 10 + value);
I tried this but it looks like it is marginally slower - plausibly the code I have means the JIT eliminates the untaken branch and constant folds this neatly. I'll do some digging..
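For readers comparing the two variants discussed in this thread, here is a minimal standalone sketch. The class name HexDigitSketch and both static methods are hypothetical; they only mirror the shape of the snippets quoted above and are not the actual HexFormat code. The sketch checks that the branch-based and cached-base versions produce the same digits; it says nothing about which one the JIT compiles better.

```java
public class HexDigitSketch {
    // Hypothetical stand-in for HexFormat's private case flag.
    enum Case { LOWERCASE, UPPERCASE }

    // Branch-based variant, mirroring the lines under review.
    static char toHexDigitBranch(int value, Case digitCase) {
        value &= 0xf;
        if (value < 10) {
            return (char) ('0' + value);
        }
        if (digitCase == Case.LOWERCASE) {
            return (char) ('a' - 10 + value);
        }
        return (char) ('A' - 10 + value);
    }

    // Cached-base variant: caseBase is 'a' or 'A', chosen once up front.
    static char toHexDigitCached(int value, char caseBase) {
        value &= 0xf;
        if (value < 10) {
            return (char) ('0' + value);
        }
        return (char) (caseBase - 10 + value);
    }

    public static void main(String[] args) {
        // Both variants must agree for every nibble value and both cases.
        for (int v = 0; v < 16; v++) {
            if (toHexDigitBranch(v, Case.LOWERCASE) != toHexDigitCached(v, 'a')
                    || toHexDigitBranch(v, Case.UPPERCASE) != toHexDigitCached(v, 'A')) {
                throw new AssertionError("mismatch at " + v);
            }
        }
        System.out.println(toHexDigitCached(11, 'A')); // prints B
    }
}
```

Note that the cached-base variant still needs the `value < 10` branch; it only removes the second, case-selecting branch.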
You can refer to PR #14745, which I submitted. Using a table of length 256 in HexDigits, you can get two digits with one table lookup, and use ByteArrayLittleEndian.setShort to write two digits at a time, like this:
    public String toHexDigits(byte value) {
        byte[] rep = new byte[2];
        ByteArrayLittleEndian.setShort(rep, 0, HexDigits.DIGITS[value & 0xff]); // DIGITS should be changed to little-endian
        try {
            return jla.newStringNoRepl(rep, StandardCharsets.ISO_8859_1);
        } catch (CharacterCodingException cce) {
            throw new AssertionError(cce);
        }
    }
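The snippet above depends on JDK-internal classes, so it cannot be run outside the JDK build. As a rough illustration of the same idea using only public APIs, the sketch below builds a 256-entry table that packs both hex digits of a byte into one short. The class name HexTableSketch and the packing layout are assumptions made for this example; they are not a copy of the JDK's HexDigits table.

```java
import java.nio.charset.StandardCharsets;

public class HexTableSketch {
    // 256-entry table: entry i packs both hex digits of byte value i.
    // The high-nibble digit sits in the low 8 bits so that a little-endian
    // two-byte write would emit it first. This layout is an assumption of
    // this sketch, not the JDK's internal table layout.
    static final short[] DIGITS = new short[256];

    static {
        String hex = "0123456789abcdef";
        for (int i = 0; i < 256; i++) {
            int hi = hex.charAt(i >> 4);
            int lo = hex.charAt(i & 0xf);
            DIGITS[i] = (short) (hi | (lo << 8));
        }
    }

    static String toHexDigits(byte value) {
        short packed = DIGITS[value & 0xff]; // one lookup yields both digits
        byte[] rep = {
            (byte) packed,        // low byte: high-nibble digit
            (byte) (packed >> 8)  // high byte: low-nibble digit
        };
        return new String(rep, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(toHexDigits((byte) 0x7f)); // prints 7f
        System.out.println(toHexDigits((byte) 0xa0)); // prints a0
    }
}
```

The table costs 512 bytes of data that must stay in cache to pay off, which is the trade-off debated in the following comments.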
I took I'm not sure 4% is a large enough win on a micro to motivate the lookup-table based approach. I'd rather see us investigate if we could consolidate these overlapping utility classes (
Should we test the performance comparison of toHexDigits(long)? I've done some work before, and I didn't find a way to perform better than a lookup table without using a lookup table.
I'm not sure that micro-benchmarks are very indicative of whether a lookup table performs better than short and straightforward code.
As @rgiulietti says, lookup-table algorithms may outperform in microbenchmarks but lose out in real-world scenarios, so we need to stay clear unless there's a major benefit. And as it turns out, the relative benefit seems to come mainly from the use of This gets the same speed-up (4%) as calling
I also tried variants of this for and when doing it all in one go (this code is Only a win on This is indicative that any win here comes from tickling the JIT the right way, rather than some intrinsic property of
In the HexFormat scenario, the larger the input byte[], the better the lookup table performs. If the length of the byte[] is greater than 1 in most scenarios, using a lookup table will perform better.
I ran some experiments with a lookup-table approach (based on I prefer the simplicity of this PR as it stands and think we should backtrack on some of the lookup tables we've recently added in
RogerRiggs
left a comment
This non-lookup version looks fine; it won't invalidate caches when used and the code is easy to understand. Thanks for the cleanup and re-checking the performance benefits.
@cl4es This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 219 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the
/integrate
Going to push as commit 92ad4a2.
Your commit was automatically rebased without conflicts.
This PR seeks to improve formatting of hex digits using java.util.HexFormat somewhat.
This is achieved by getting rid of a couple of lookup tables, caching the result of HexFormat.of().withUpperCase(), and removing a tiny allocation that happens in the formatHex(A, byte[]) method. Improvements range from 20-40% on throughput, and some operations allocate less.
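For context, the public API calls mentioned in this description can be exercised as in the sketch below (Java 17 or later). The class name HexFormatUsage and the UPPER constant are hypothetical and show application-level reuse of a formatter; the caching described in the PR happens inside the JDK and is not reproduced here.

```java
import java.util.HexFormat;

public class HexFormatUsage {
    // Hypothetical application-level cache of an uppercase formatter.
    // HexFormat instances are immutable, so sharing one is safe.
    static final HexFormat UPPER = HexFormat.of().withUpperCase();

    public static void main(String[] args) {
        byte[] data = {0x0a, (byte) 0xff, 0x10};

        System.out.println(HexFormat.of().formatHex(data)); // prints 0aff10
        System.out.println(UPPER.formatHex(data));          // prints 0AFF10
        System.out.println(UPPER.toHexDigits((byte) 0x7f)); // prints 7F

        // The Appendable overload referred to as formatHex(A, byte[]).
        StringBuilder sb = new StringBuilder("0x");
        UPPER.formatHex(sb, data);
        System.out.println(sb);                             // prints 0x0AFF10
    }
}
```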
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15591/head:pull/15591
$ git checkout pull/15591
Update a local copy of the PR:
$ git checkout pull/15591
$ git pull https://git.openjdk.org/jdk.git pull/15591/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 15591
View PR using the GUI difftool:
$ git pr show -t 15591
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15591.diff