-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8315585: Optimization for decimal to string #16006
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
👋 Welcome back wenshao! A progress list of the required criteria for merging this PR into |
|
/label remove security |
|
@wenshao |
|
@jddarcy, could you please review my code? Thank you! |
|
@cl4es could you please review my code? Thank you! |
Webrevs
|
|
Based on @liach 's review suggestions, I reduced the changes to JavaLangAccess and further removed the duplicate code in BigDecimal. |
cl4es
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really qualified to review the floating point code. Simplifying away offset and getting rid of the StringBuilderHelper all seem like good improvements, though I think it'd be good if we could either avoid or split out the JLA changes.
| return (Integer.toString(highInt) + '.' + | ||
| StringBuilderHelper.DIGIT_TENS[lowInt] + | ||
| StringBuilderHelper.DIGIT_ONES[lowInt]) ; | ||
| int highIntSize = JLA.stringSize(highInt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which micros cover performance of this branch?
What performance would you get if you simplified this to, say, return highInt + (lowInt < 10 ? ".0" : ".") + lowInt;?
lowInt is currently unused since you use (int) intCompact - highInt * 100 instead below. (While a clever optimization in theory, I believe the JIT should handle a pair of integer modulo and division operations (a = intValue % 100; b = intValue / 100) so that it only has to do one division, so please measure that whatever you do here has a significant benefit).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Performance numbers run on MacBook M1 Max:
-Benchmark Mode Cnt Score Error Units
-BigDecimals.testHugeToEngineeringString avgt 15 217.619 ? 4.617 ns/op
-BigDecimals.testLargeToEngineeringString avgt 15 65.183 ? 4.368 ns/op
-BigDecimals.testSmallToEngineeringString avgt 15 17.084 ? 0.056 ns/op
-BigDecimals.testToEngineeringString avgt 15 1856.849 ? 63.571 ns/op
+Benchmark Mode Cnt Score Error Units (eec631c)
+BigDecimals.testHugeToEngineeringString avgt 15 164.910 ? 1.514 ns/op (+31.97)
+BigDecimals.testLargeToEngineeringString avgt 15 20.826 ? 0.066 ns/op (+212.99)
+BigDecimals.testSmallToEngineeringString avgt 15 11.987 ? 0.080 ns/op (+42.53)
+BigDecimals.testToEngineeringString avgt 15 1854.372 ? 63.810 ns/op (+0.14)The microben covered by fast-path is BigDecimals.testSmallToEngineeringString, highInt + (lowInt < 10 ? ".0" : ".") + lowInt is actually implemented based on StringBuilder and will be much slower than the current baseline.
Benchmark Mode Cnt Score Error Units (use StringConcat)
BigDecimals.testSmallToEngineeringString avgt 15 41.876 ? 0.182 ns/op
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested it again and found that the remainder is not slow. This was my wrong impression. I have changed it according to your suggestion.
Keep the duplicate code of StringConcatHelper, or use JLA, or move the code of getCharsLatin1 & stringSize to DecimalDigits (PR #15699). Of the three options, using JLA is the smallest change. |
Adding methods to JLA also adds a maintenance burden by exposing methods and internal details that we prefer not to expose. Explicit diff --git a/src/java.base/share/classes/java/math/BigDecimal.java b/src/java.base/share/classes/java/math/BigDecimal.java
index 3163b2ffadb..b4edbb131d9 100644
--- a/src/java.base/share/classes/java/math/BigDecimal.java
+++ b/src/java.base/share/classes/java/math/BigDecimal.java
@@ -30,6 +30,12 @@
package java.math;
import static java.math.BigInteger.LONG_MASK;
+
+import java.lang.invoke.CallSite;
+import java.lang.invoke.MethodHandle;
+import java.lang.invoke.MethodHandles;
+import java.lang.invoke.MethodType;
+import java.lang.invoke.StringConcatFactory;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.io.IOException;
@@ -4146,6 +4152,23 @@ public BigDecimal ulp() {
return BigDecimal.valueOf(1, this.scale(), 1);
}
+ private static class ConcatHelper {
+ private static final MethodHandle INT_DOT_CHAR_CHAR;
+
+ static {
+ try {
+ CallSite site = StringConcatFactory.makeConcatWithConstants(
+ MethodHandles.lookup(),
+ "scale2",
+ MethodType.methodType(String.class, int.class, char.class, char.class),
+ "\u0001.\u0001\u0001");
+ INT_DOT_CHAR_CHAR = site.dynamicInvoker();
+ } catch (Exception e) {
+ throw new Error("Bootstrap error", e);
+ }
+ }
+ }
+
/**
* Lay out this {@code BigDecimal} into a {@code char[]} array.
* The Java 1.2 equivalent to this was called {@code getValueString}.
@@ -4164,18 +4187,12 @@ private String layoutChars(boolean sci) {
intCompact >= 0 && intCompact < Integer.MAX_VALUE) {
// currency fast path
int highInt = (int)intCompact / 100;
- int highIntSize = JLA.stringSize(highInt);
- byte[] buf = new byte[highIntSize + 3];
- JLA.getCharsLatin1(highInt, highIntSize, buf);
- buf[highIntSize] = '.';
- DecimalDigits.writeDigitPairLatin1(
- buf,
- highIntSize + 1,
- (int) intCompact % 100);
+ int lowInt = (int)intCompact - highInt * 100;
+ char pair = (char)DecimalDigits.digitPair(lowInt);
try {
- return JLA.newStringNoRepl(buf, StandardCharsets.ISO_8859_1);
- } catch (CharacterCodingException e) {
- throw new AssertionError(e);
+ return (String)ConcatHelper.INT_DOT_CHAR_CHAR.invokeExact(highInt, (char)(pair & 0xff), (char)(pair >> 8));
+ } catch (Throwable e) {
+ throw new RuntimeException(e);
}
}This is equivalent to |
|
Yes, explicitly calling into It might also be possible to evolve the decision to compile all of java.base with |
|
/reviewers 2 reviewer |
cl4es
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for simplifying. It seems that even regular StringBuilder-based concat gives numbers that are competitive with the baseline this way. I'll file an RFE to examine if we can enable ISC for this.
I've requested another reviewer for this, since I don't have the sufficient floating point domain expertise to review these changes alone.
Once both #16244 and this has been integrated I want to revisit this. The effect of changing |
|
@cl4es I found that StringConcatFactory.makeConcatWithConstants will become slower when using recipe("\1.\1", long.class, long.class). If we treat the prefix of length 1 as char, it will become faster. In this branch https://github.com/wenshao/jdk/tree/optim_decimal_to_string_x2 , if we remove the code that treats the prefix of length 1 as char in the preparer(String prefix, Class<?> cl) method, it will become slower. package java.lang.invoke;
class StringConcatFactory {
static MethodHandle prepender(String prefix, Class<?> cl) {
if (prefix == null || prefix.isEmpty()) {
return noPrefixPrepender(cl);
} else {
// remove this will slower
if (prefix.length() == 1
&& (cl == char.class || cl == int.class || cl == long.class)) {
char ch = prefix.charAt(0);
MethodHandle prepend = JLA.stringConcatHelper(
"prepend",
methodType(long.class, long.class, byte[].class, cl, char.class)).rebind();
return MethodHandles.insertArguments(prepend, 3, ch);
}
return MethodHandles.insertArguments(
prepender(cl), 3, prefix);
}
}
}Here are the performance numbers running on the MaxBook M1 Max: -Benchmark Mode Cnt Score Error Units (String prefix)
-BigDecimals.smallScale3ToPlainString avgt 15 28.898 ? 0.527 ns/op
+Benchmark Mode Cnt Score Error Units (char prefix)
+BigDecimals.smallScale3ToPlainString avgt 15 23.182 ? 0.477 ns/opMaybe you should submit another PR to improve StringConcatFactory.makeConcatWithConstants |
|
@rgiulietti It is currently a stable version, please help me start the review. Here are the performance numbers running on the MaxBook M1 Pro: -Benchmark Mode Cnt Score Error Units (baseline)
-BigDecimals.hugeEngineeringToString avgt 15 199.683 ? 7.457 ns/op
-BigDecimals.hugeLayoutCharsToString avgt 15 205.366 ? 2.673 ns/op
-BigDecimals.hugePlainToString avgt 15 219.066 ? 3.334 ns/op
-BigDecimals.largeScale2EngineeringToString avgt 15 55.303 ? 0.606 ns/op
-BigDecimals.largeScale2LayoutCharsToString avgt 15 55.284 ? 0.512 ns/op
-BigDecimals.largeScale2PlainToString avgt 15 22.836 ? 0.128 ns/op
-BigDecimals.largeScale3EngineeringToString avgt 15 57.332 ? 0.851 ns/op
-BigDecimals.largeScale3LayoutCharsToString avgt 15 57.535 ? 1.221 ns/op
-BigDecimals.largeScale3PlainToString avgt 15 23.967 ? 0.055 ns/op
-BigDecimals.smallScale2EngineeringToString avgt 15 17.063 ? 0.444 ns/op
-BigDecimals.smallScale2LayoutCharsToString avgt 15 16.999 ? 0.072 ns/op
-BigDecimals.smallScale2PlainToString avgt 15 24.069 ? 0.511 ns/op
-BigDecimals.smallScale3EngineeringToString avgt 15 53.697 ? 0.872 ns/op
-BigDecimals.smallScale3LayoutCharsToString avgt 15 53.754 ? 1.990 ns/op
-BigDecimals.smallScale3PlainToString avgt 15 24.596 ? 0.100 ns/op
+Benchmark Mode Cnt Score Error Units (51b1041)
+BigDecimals.hugeEngineeringToString avgt 15 194.288 ? 7.294 ns/op (+2.78)
+BigDecimals.hugeLayoutCharsToString avgt 15 192.121 ? 2.079 ns/op (+6.90)
+BigDecimals.hugePlainToString avgt 15 201.692 ? 4.527 ns/op (+8.62)
+BigDecimals.largeScale2EngineeringToString avgt 15 20.128 ? 0.266 ns/op (+174.76)
+BigDecimals.largeScale2LayoutCharsToString avgt 15 20.681 ? 0.508 ns/op (+167.32)
+BigDecimals.largeScale2PlainToString avgt 15 19.950 ? 0.273 ns/op (+14.47)
+BigDecimals.largeScale3EngineeringToString avgt 15 23.564 ? 0.542 ns/op (+143.31)
+BigDecimals.largeScale3LayoutCharsToString avgt 15 23.424 ? 0.057 ns/op (+145.63)
+BigDecimals.largeScale3PlainToString avgt 15 22.991 ? 0.048 ns/op (+4.25)
+BigDecimals.smallScale2EngineeringToString avgt 15 11.641 ? 0.181 ns/op (+46.58)
+BigDecimals.smallScale2LayoutCharsToString avgt 15 11.589 ? 0.031 ns/op (+46.69)
+BigDecimals.smallScale2PlainToString avgt 15 11.594 ? 0.032 ns/op (+107.60)
+BigDecimals.smallScale3EngineeringToString avgt 15 24.684 ? 0.313 ns/op (+117.54)
+BigDecimals.smallScale3LayoutCharsToString avgt 15 25.058 ? 0.151 ns/op (+114.52)
+BigDecimals.smallScale3PlainToString avgt 15 24.194 ? 0.217 ns/op (+1.67) |
|
Can I add scale2 method to StringLatin1? Like this: package java.lang;
class StringLatin1 {
static String scale2(long value) {
long valueAbs = Math.abs(value);
long highInt = valueAbs / 100;
byte[] buf = new byte[Long.stringSize(highInt) + (value < 0 ? 4 : 3)];
int p = buf.length - 2;
writeDigitPair(buf, p, (int) (valueAbs % 100));
buf[--p] = '.';
getChars(highInt, p, buf);
if (value < 0) {
buf[0] = '-';
}
if (String.COMPACT_STRINGS) {
return new String(buf, LATIN1);
}
return new String(inflate(buf, 0, buf.length), UTF16);
}
}
class System {
private static void setJavaLangAccess() {
SharedSecrets.setJavaLangAccess(new JavaLangAccess() {
public String scale2(long i) {
return StringLatin1.scale2(i);
}
}
}
}I think this is more readable than using MethodHandle. |
I'd strongly advice against this.
View the |
…string_x1 # Conflicts: # src/java.base/share/classes/java/math/BigDecimal.java
|
@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
|
@wenshao This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the |
|
/open |
|
@wenshao This pull request is now open |
|
@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
|
❗ This change is not yet ready to be integrated. |
|
@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
|
@wenshao This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the |
I submitted PR #15555 before, and there were too many changes. I split it into multiple PRs with small changes. This one is one of them.
this PR removed the duplicate code for getChars in BigDecimal#StringBuilderHelper, i also make performance faster.
Please review and don't hesitate to critique my approach and patch.
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16006/head:pull/16006$ git checkout pull/16006Update a local copy of the PR:
$ git checkout pull/16006$ git pull https://git.openjdk.org/jdk.git pull/16006/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 16006View PR using the GUI difftool:
$ git pr show -t 16006Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16006.diff
Webrev
Link to Webrev Comment