Skip to content

Conversation

@wenshao
Copy link
Contributor

@wenshao wenshao commented Jun 8, 2024

After PR #16245, C2 optimizes stores into primitive arrays by combining values ​​into larger stores. In the UUID.toString method, ByteArrayLittleEndian can be removed, making the code more elegant and faster.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8333833: Remove the use of ByteArrayLittleEndian from UUID::toString (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19610/head:pull/19610
$ git checkout pull/19610

Update a local copy of the PR:
$ git checkout pull/19610
$ git pull https://git.openjdk.org/jdk.git pull/19610/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19610

View PR using the GUI difftool:
$ git pr show -t 19610

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19610.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jun 8, 2024

👋 Welcome back wenshao! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@wenshao wenshao changed the title optimization for UUID toString UUID toString removes the use of ByteArrayLittleEndian Jun 8, 2024
@openjdk
Copy link

openjdk bot commented Jun 8, 2024

@wenshao This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8333833: Remove the use of ByteArrayLittleEndian from UUID::toString

Reviewed-by: liach, redestad

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 19 new commits pushed to the master branch:

  • de55db2: 8333522: JFR SwapSpace event might read wrong free swap space size
  • a941397: 8329031: CPUID feature detection for Advanced Performance Extensions (Intel® APX)
  • 8d2f9e5: 8333749: Consolidate ConstantDesc conversion in java.base
  • a6fc2f8: 8333412: [s390x] Add support for branch on count instruction
  • cf677c9: 8333823: Update --release 23 symbol information for JDK 23 build 26
  • 18e7d7b: 8333716: Shenandoah: Check for disarmed method before taking the nmethod lock
  • c37d02a: 8312412: Uninitialized klassVtable::_verify_count field
  • 17bd483: 8333680: com/sun/tools/attach/BasicTests.java fails with "SocketException: Permission denied: connect"
  • 512b2b4: 8330420: Inverted use of DisplayVMOutputToStderr in ostream_exit
  • 8e72d7c: 8320448: Accelerate IndexOf using AVX2
  • ... and 9 more: https://git.openjdk.org/jdk/compare/8ffc35d117846a7a2aa08afed662273d2f887770...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@cl4es) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk
Copy link

openjdk bot commented Jun 8, 2024

@wenshao The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@wenshao
Copy link
Contributor Author

wenshao commented Jun 8, 2024

The performance numbers under MacBookPro M1 Max are as follows:

-Benchmark           (size)   Mode  Cnt    Score   Error   Units (#master 8ffc35d117846a7a2aa08afed662273d2f887770 )
-UUIDBench.toString   20000  thrpt   15  103.904 ? 0.772  ops/us

+Benchmark           (size)   Mode  Cnt    Score   Error   Units (# current 30373b81fddbf7e82340e466cf6425a5252399d2 )
+UUIDBench.toString   20000  thrpt   15  109.529 ? 1.156  ops/us + 5.41%

@wenshao wenshao changed the title UUID toString removes the use of ByteArrayLittleEndian 8327791: UUID toString removes the use of ByteArrayLittleEndian Jun 8, 2024
@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 8, 2024
@mlbridge
Copy link

mlbridge bot commented Jun 8, 2024

Copy link
Member

@liach liach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean bug ID 8333833, right?

@wenshao wenshao changed the title 8327791: UUID toString removes the use of ByteArrayLittleEndian 8333833: UUID toString removes the use of ByteArrayLittleEndian Jun 8, 2024
@sunmisc
Copy link

sunmisc commented Jun 8, 2024

As far as I know, ByteArrayLittleEndian uses the VarHandle mechanism, which more efficiently writes different primitives into the array, unlike the basic

@liach
Copy link
Member

liach commented Jun 8, 2024

@sunmisc BALE uses byte array view VH which still uses Unsafe:

Please take a look at #16245; you will see that C2 now JIT compiles these compatible "different primitives" like Unsafe would do, yet there's a bit of requirement on code shape. Thus I recommended the comment for wenshao, so future changes won't accidentally destroy the code shape and the optimization.

@wenshao
Copy link
Contributor Author

wenshao commented Jun 8, 2024

The C2 optimization brought by PR #16245 makes many of the previous performance improvement techniques based on VarHandle/ByteArray/Unsafe no longer meaningful, and many optimizations based on this need to be changed back.

@liach
Copy link
Member

liach commented Jun 8, 2024

And in addition, VarHandle is not initialized unless it's necessary; thus, programs that use UUIDs but not VarHandle no longer need to initialize VarHandle. See #15386 where JDK startup has a performance degradation because it had to initialize VarHandle after using BALE.

@Glavo
Copy link
Contributor

Glavo commented Jun 8, 2024

I think we don't need to change them back everywhere, but only need to rewrite ByteArrayLittleEndian and ByteArray so that they no longer use VarHandle.

Maybe I should rewrite #14636 without using Unsafe, so more people might agree with it.

| (DIGITS[b1 & 0xff] << 16)
| (((long) DIGITS[b2 & 0xff]) << 32)
| (((long) DIGITS[b3 & 0xff]) << 48);
public static void putHex(byte[] buffer, int off, int i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be 2 methods - for 2 and 4 bytes respectively?
Does c2 optimize 8 byte writes as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 4-byte unsigned int input for 8-byte write sounds plausible, I personally am fine either with or without it.

Does c2 optimize 8 byte writes as well?

From the first few lines of #16245:

Merging multiple consecutive small stores (e.g. 8 byte stores) into larger stores (e.g. one long store) can lead to speedup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

8-byte writing requires converting int to long. The performance is similar to the current version, but an additional method putHex8 needs to be added. The current version has less code.

The following is the code for writing 8 bytes:

class UUID {
     @Override
    public String toString() {
        byte[] buf = new byte[36];
        HexDigits.putHex8(buf, 0, (int) (mostSigBits >> 32));
        HexDigits.putHex10(buf, 8, (int) mostSigBits);
        HexDigits.putHex10(buf, 18, (int) (leastSigBits >> 32));
        HexDigits.putHex8(buf, 28, (int) leastSigBits);
        try {
            return jla.newStringNoRepl(buf, StandardCharsets.ISO_8859_1);
        } catch (CharacterCodingException cce) {
            throw new AssertionError(cce);
        }
    }
}

class HexDigits {
    public static void putHex8(byte[] bytes, int off, int i) {
        long v =  (((long) DIGITS[(i >> 16) & 0xff]) << 48)
                | (((long) DIGITS[(i >> 24) & 0xff]) << 32)
                | (        DIGITS[ i        & 0xff]  << 16)
                | (        DIGITS[(i >> 8 ) & 0xff]);
        bytes[off]     = (byte)  v;
        bytes[off + 1] = (byte) (v >> 8);
        bytes[off + 2] = (byte) (v >> 16);
        bytes[off + 3] = (byte) (v >> 24);
        bytes[off + 4] = (byte) (v >> 32);
        bytes[off + 5] = (byte) (v >> 40);
        bytes[off + 6] = (byte) (v >> 48);
        bytes[off + 7] = (byte) (v >> 56);
    }

    public static void putHex10(byte[] bytes, int off, int i) {
        int v0 = (DIGITS[(i >> 16) & 0xff] << 16)
                | DIGITS[(i >> 24) & 0xff];
        int v1 = (DIGITS[i         & 0xff] << 16)
                | DIGITS[(i >> 8 ) & 0xff];
        bytes[off]     = '-';
        bytes[off + 1] = (byte)  v0;
        bytes[off + 2] = (byte) (v0 >> 8);
        bytes[off + 3] = (byte) (v0 >> 16);
        bytes[off + 4] = (byte) (v0 >> 24);
        bytes[off + 5] = '-';
        bytes[off + 6] = (byte)  v1;
        bytes[off + 7] = (byte) (v1 >> 8);
        bytes[off + 8] = (byte) (v1 >> 16);
        bytes[off + 9] = (byte) (v1 >> 24);
    }
}

@wenshao
Copy link
Contributor Author

wenshao commented Jun 9, 2024

I think we don't need to change them back everywhere, but only need to rewrite ByteArrayLittleEndian and ByteArray so that they no longer use VarHandle.

Maybe I should rewrite #14636 without using Unsafe, so more people might agree with it.

You are right, ByteArray and ByteArrayLittleEndian have good performance after removing Unsafe. This is similar to the previous version of java.io.Bits

class ByteArrayLittleEndian {
    public static void setInt(byte[] array, int offset, int value) {
        array[offset    ] = (byte)  value;
        array[offset + 1] = (byte) (value >> 8);
        array[offset + 2] = (byte) (value >> 16);
        array[offset + 3] = (byte) (value >> 24);
    }
}

class HexDigits {
    public static void putHex4(byte[] array, int offset, int value) {
        // Prepare an int value so C2 generates a 4-byte write instead of two 2-byte writes
        ByteArrayLittleEndian.setInt(
                array,
                offset,
                (DIGITS[value & 0xff] << 16) | DIGITS[(value >> 8) & 0xff]);
    }
}

@Glavo
Copy link
Contributor

Glavo commented Jun 9, 2024

You are right, ByteArray and ByteArrayLittleEndian have good performance after removing Unsafe. This is similar to the previous version of java.io.Bits

Do you have evidence that VarHandle affects startup time? If there is good evidence to support this, I would go ahead and rewrite #14636.

@wenshao
Copy link
Contributor Author

wenshao commented Jun 9, 2024

You are right, ByteArray and ByteArrayLittleEndian have good performance after removing Unsafe. This is similar to the previous version of java.io.Bits

Do you have evidence that VarHandle affects startup time? If there is good evidence to support this, I would go ahead and rewrite #14636.

@cl4es has fixed startup regression issues, such as this #15836

Copy link
Member

@cl4es cl4es left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to see #16245 in action, enabling simpler code with equal or better performance.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 9, 2024
@wenshao
Copy link
Contributor Author

wenshao commented Jun 10, 2024

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Jun 10, 2024
@openjdk
Copy link

openjdk bot commented Jun 10, 2024

@wenshao
Your change (at version 035a07a) is now ready to be sponsored by a Committer.

@Glavo
Copy link
Contributor

Glavo commented Jun 10, 2024

I opened #19616, let's continue the unfinished discussion in #14636 there.

@openjdk openjdk bot removed the sponsor Pull request is ready to be sponsored label Jun 10, 2024
@cl4es
Copy link
Member

cl4es commented Jun 10, 2024

Thanks. I hate to nitpick, but is it OK if I rename the RFE as "Remove the use of ByteArrayLittleEndian from UUID::toString" (the PR need to follow suit). I think the current name might be read as doing something completely different.

@wenshao wenshao changed the title 8333833: UUID toString removes the use of ByteArrayLittleEndian 8333833: Remove the use of ByteArrayLittleEndian from UUID::toString Jun 10, 2024
@cl4es
Copy link
Member

cl4es commented Jun 10, 2024

Great, go ahead and /integrate again and I'll sponsor.

@wenshao
Copy link
Contributor Author

wenshao commented Jun 10, 2024

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Jun 10, 2024
@openjdk
Copy link

openjdk bot commented Jun 10, 2024

@wenshao
Your change (at version 780abdf) is now ready to be sponsored by a Committer.

@cl4es
Copy link
Member

cl4es commented Jun 10, 2024

/sponsor

@openjdk
Copy link

openjdk bot commented Jun 10, 2024

Going to push as commit 8aa35ca.
Since your change was applied there have been 19 commits pushed to the master branch:

  • de55db2: 8333522: JFR SwapSpace event might read wrong free swap space size
  • a941397: 8329031: CPUID feature detection for Advanced Performance Extensions (Intel® APX)
  • 8d2f9e5: 8333749: Consolidate ConstantDesc conversion in java.base
  • a6fc2f8: 8333412: [s390x] Add support for branch on count instruction
  • cf677c9: 8333823: Update --release 23 symbol information for JDK 23 build 26
  • 18e7d7b: 8333716: Shenandoah: Check for disarmed method before taking the nmethod lock
  • c37d02a: 8312412: Uninitialized klassVtable::_verify_count field
  • 17bd483: 8333680: com/sun/tools/attach/BasicTests.java fails with "SocketException: Permission denied: connect"
  • 512b2b4: 8330420: Inverted use of DisplayVMOutputToStderr in ostream_exit
  • 8e72d7c: 8320448: Accelerate IndexOf using AVX2
  • ... and 9 more: https://git.openjdk.org/jdk/compare/8ffc35d117846a7a2aa08afed662273d2f887770...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jun 10, 2024
@openjdk openjdk bot closed this Jun 10, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Jun 10, 2024
@openjdk
Copy link

openjdk bot commented Jun 10, 2024

@cl4es @wenshao Pushed as commit 8aa35ca.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core-libs [email protected] integrated Pull request has been integrated

Development

Successfully merging this pull request may close these issues.

7 participants