Conversation

@wenshao
Contributor

@wenshao wenshao commented Feb 3, 2025

The following code reproduces the problem: an out-of-bounds write crashes the JVM.

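        // Needs: import java.util.concurrent.CountDownLatch;
        // Run inside a method that declares InterruptedException (for latch.await()).
        StringBuilder buf = new StringBuilder();
        buf.append('中'); // a non-Latin-1 character

        Thread[] threads = new Thread[40];
        final CountDownLatch latch = new CountDownLatch(threads.length);
        // Every thread mutates the same unsynchronized StringBuilder.
        Runnable r = () -> {
            for (int i = 0; i < 1000000; i++) {
                buf.setLength(0);
                buf.trimToSize();
                buf.append(123456789123456789L);
            }
            latch.countDown();
        };

        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(r);
        }
        for (Thread t : threads) {
            t.start();
        }
        latch.await();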

This problem can be avoided by writing directly into the array returned by ensureCapacityInternal.
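A minimal sketch of that pattern, assuming a hypothetical `ToyBuffer` class (not the JDK's `AbstractStringBuilder`): the growth method returns the array it guarantees to be large enough, and the caller writes only into that returned array and a single snapshot of `count`. The sketch uses ordinary bounds-checked array stores, so it can only illustrate the shape of the fix, not the crash itself.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical toy buffer for illustration; NOT the JDK implementation.
final class ToyBuffer {
    private byte[] value = new byte[0];
    private int count;

    // Grows the backing array if needed and returns an array that is
    // guaranteed to hold at least minCapacity bytes.
    private byte[] ensureCapacity(int minCapacity) {
        byte[] local = this.value;
        if (minCapacity > local.length) {
            local = Arrays.copyOf(local, Math.max(minCapacity, local.length * 2 + 2));
            this.value = local;
        }
        return local;
    }

    // Appends the decimal text of n, writing only into the returned array.
    void appendDigits(int n) {
        String digits = Integer.toString(n);
        int start = count;                      // read the field once
        int needed = start + digits.length();
        byte[] target = ensureCapacity(needed); // never re-read this.value below
        for (int i = 0; i < digits.length(); i++) {
            target[start + i] = (byte) digits.charAt(i);
        }
        count = needed;
    }

    String contents() {
        return new String(value, 0, count, StandardCharsets.ISO_8859_1);
    }
}
```

Because `target` has at least `needed` bytes and `start` is read once, every store in the loop stays within `target`, whatever other threads do to the fields in the meantime. The result may still be logically wrong under races, since the class remains unsynchronized.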


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)

Issue

  • JDK-8349241: Fix the concurrent execution JVM crash of StringBuilder::append(int/long) (Bug - P3)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23427/head:pull/23427
$ git checkout pull/23427

Update a local copy of the PR:
$ git checkout pull/23427
$ git pull https://git.openjdk.org/jdk.git pull/23427/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 23427

View PR using the GUI difftool:
$ git pr show -t 23427

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23427.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Feb 3, 2025

👋 Welcome back swen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Feb 3, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@openjdk

openjdk bot commented Feb 3, 2025

@wenshao The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@wenshao wenshao changed the title from "Fix the concurrent execution crash of StringBuilder::append(int/long)" to "Fix the concurrent execution JVM crash of StringBuilder::append(int/long)" Feb 3, 2025
  int spaceNeeded = count + DecimalDigits.stringSize(i);
- ensureCapacityInternal(spaceNeeded);
+ byte[] value = ensureCapacityInternal(spaceNeeded);
  if (isLatin1()) {
Member

This is not safe: ensureCapacityInternal can read coder == LATIN1 and allocate a small array, while the subsequent isLatin1() check can read coder == UTF16 and then write a UTF16 number out of bounds.

Member

@cl4es cl4es Feb 4, 2025

A check that spaceNeeded <= (value.length >> 1) in the else branch would be needed and might be a sufficient safeguard here.
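A rough sketch of the guard suggested here, using a hypothetical `CoderGuard` helper. The coder values 0 (one byte per char) and 1 (two bytes per char) are assumptions of this sketch, chosen so that `arrayLength >> coder` yields the capacity in chars.

```java
// Hypothetical helper for illustration; not part of the JDK.
final class CoderGuard {
    static final byte LATIN1 = 0; // assumed: one byte per char
    static final byte UTF16 = 1;  // assumed: two bytes per char

    // Returns true if an array of arrayLength bytes can hold
    // charCount chars when encoded with the given coder.
    static boolean fits(int charCount, int arrayLength, byte coder) {
        return charCount <= (arrayLength >> coder);
    }
}
```

For the nine digits of 123456789, a 9-byte array passes the check under `LATIN1` but fails it under `UTF16`, which is the mismatch described in the comment above; an 18-byte array passes under both.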

Contributor Author

I made further changes to improve thread safety with respect to the coder by passing the coder into the newCapacity method. I think this should be safe enough.

@wenshao wenshao changed the title from "Fix the concurrent execution JVM crash of StringBuilder::append(int/long)" to "8349241: Fix the concurrent execution JVM crash of StringBuilder::append(int/long)" Feb 3, 2025
@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 3, 2025
@mlbridge

mlbridge bot commented Feb 3, 2025

@wenshao wenshao requested a review from liach February 3, 2025 19:16
@AlanBateman
Contributor

/reviewers 2 reviewer

@openjdk

openjdk bot commented Feb 4, 2025

@AlanBateman
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).

@tstuefe
Member

tstuefe commented Feb 4, 2025

I am confused about the description. How exactly does the JVM crash? Is it the interpreter or compiled code that crashes?

@wenshao
Contributor Author

wenshao commented Feb 4, 2025

hs_err_pid23348.log

    StringBuilder buf = new StringBuilder();
    buf.append('中');

    final CountDownLatch latch = new CountDownLatch(10);
    Runnable r = () -> {
        for (int i = 0; i < 10000; i++) {
            buf.setLength(0);
            buf.trimToSize();
            buf.append(123456789);
        }
        latch.countDown();
    };
    Thread[] threads = new Thread[10];
    for (int i = 0; i < threads.length; i++) {
        threads[i] = new Thread(r);
    }
    for (Thread t : threads) {
        t.start();
    }
    latch.await();

This causes the JVM to exit immediately with the following error message:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0xa) at pc=0x0000000103b9d23c, pid=23348, tid=33539
#
# JRE version: OpenJDK Runtime Environment (25.0) (build 25-internal-adhoc.wenshao.jdkx)
# Java VM: OpenJDK 64-Bit Server VM (25-internal-adhoc.wenshao.jdkx, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-aarch64)
# Problematic frame:
# [thread 41731 also had an error]
V  [libjvm.dylib+0x41523c]  G1ParScanThreadState::copy_to_survivor_space(G1HeapRegionAttr, oopDesc*, markWord)+0x64
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
[thread 39683 also had an error]
# An error report file with more information is saved as:
# /Users/wenshao/Work/git/jdk_mico_bench/hs_err_pid23348.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#

@tstuefe
Member

tstuefe commented Feb 4, 2025

@wenshao Thank you. This seems to be a GC problem. I adjusted the JBS issue accordingly. You set this to "24" as affected version, but if this is a mainline issue, please add 25 and if possible all other versions this occurs in. If possible, please attach an hs-err file or at least the crash stack.

@wenshao
Contributor Author

wenshao commented Feb 4, 2025

> @wenshao Thank you. This seems to be a GC problem. I adjusted the JBS issue accordingly. You set this to "24" as affected version, but if this is a mainline issue, please add 25 and if possible all other versions this occurs in. If possible, please attach an hs-err file or at least the crash stack.

I added the hs-err file in the reply above. This is not a GC problem. The getChars method uses StringUTF16.putChar, which is equivalent to Unsafe.putChar and performs no bounds check. Under concurrent use, out-of-bounds writes can occur and crash the JVM.
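For contrast, a bounds-checked two-byte store might look like the sketch below. `CheckedUtf16` and `putCharChecked` are hypothetical names, and the big-endian byte order is an arbitrary choice for the example; this is not the JDK's `StringUTF16` code.

```java
// Hypothetical helper for illustration; not the JDK's StringUTF16.
final class CheckedUtf16 {
    // Stores c as two bytes at char index `index`, after an explicit
    // bounds check against the array's capacity in chars.
    static void putCharChecked(byte[] val, int index, int c) {
        int capacityInChars = val.length >> 1;
        if (index < 0 || index >= capacityInChars) {
            throw new IndexOutOfBoundsException(
                "char index " + index + " out of bounds for " + capacityInChars + " chars");
        }
        int offset = index << 1;
        val[offset] = (byte) (c >> 8); // high byte first (arbitrary choice here)
        val[offset + 1] = (byte) c;    // then low byte
    }
}
```

With a check like this, a stale or mismatched capacity surfaces as an `IndexOutOfBoundsException` rather than as a write past the end of the array.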

@liach
Member

liach commented Feb 4, 2025

On a second examination, I find this is caused by #22023, which is not in 24. And I cannot replicate the oob write on 24. I have updated the affected version to 25 as a result, and updated the caused-by link in the JBS issue.

  * greater than (Integer.MAX_VALUE >> coder)
  */
- private int newCapacity(int minCapacity) {
+ private static int newCapacity(int minCapacity, byte[] value, byte coder) {
Contributor

  1. Only the current length is needed and should be the argument.
  2. Add the @param tags for the new arguments.
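One way the two points above could look, as a sketch only: a static method that takes the current array length instead of the array, with `@param` tags for each argument. The `CapacitySketch` class, the parameter name `oldLength`, and the doubling growth policy are assumptions of this example, not the code under review.

```java
// Hypothetical sketch; names and growth policy are assumptions.
final class CapacitySketch {
    /**
     * Returns a capacity, in chars, at least as large as {@code minCapacity}.
     *
     * @param minCapacity the desired minimum capacity, in chars
     * @param oldLength   the current length of the backing array, in bytes
     * @param coder       0 for one byte per char, 1 for two bytes per char
     * @return the new capacity, in chars
     * @throws OutOfMemoryError if {@code minCapacity} is negative or
     *         greater than {@code Integer.MAX_VALUE >> coder}
     */
    static int newCapacity(int minCapacity, int oldLength, byte coder) {
        int limit = Integer.MAX_VALUE >> coder;
        if (minCapacity < 0 || minCapacity > limit) {
            throw new OutOfMemoryError("Required capacity exceeds limit: " + minCapacity);
        }
        int oldCapacity = oldLength >> coder;     // current capacity in chars
        long preferred = 2L * oldCapacity + 2;    // assumed growth policy
        return (int) Math.min(Math.max(preferred, minCapacity), limit);
    }
}
```

Taking only `oldLength` and `coder` means the method reads no instance fields, so its result depends solely on the values the caller observed.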

*/
private byte[] ensureCapacityInternal(int minimumCapacity, byte coder) {
// overflow-conscious code
byte[] value = this.value;
Contributor

Shadowing value is bug-prone; pick a new name.
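A small illustration of why shadowing invites mistakes, using a hypothetical `ShadowDemo` class: when a local variable has the same name as the field, an assignment that looks like a field update silently changes only the local.

```java
import java.util.Arrays;

// Hypothetical class for illustration only.
final class ShadowDemo {
    private byte[] value = new byte[2];

    // Bug-prone: the local `value` shadows the field, so the
    // reassignment below never reaches this.value.
    int growShadowed() {
        byte[] value = this.value;
        value = Arrays.copyOf(value, 8); // updates the local only
        return this.value.length;        // field is unchanged
    }

    // Clearer: distinct names make the field update explicit.
    int growExplicit() {
        byte[] snapshot = this.value;
        byte[] grown = Arrays.copyOf(snapshot, 8);
        this.value = grown;
        return this.value.length;
    }
}
```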

* synchronized.
* If {@code minimumCapacity} is non positive due to numeric
* overflow, this method throws {@code OutOfMemoryError}.
*/
Contributor

Complete the javadoc with the @param tags and descriptions.

@wenshao
Contributor Author

wenshao commented Feb 4, 2025

Thanks @RogerRiggs, your suggestions are great. I have addressed them; please review again.

@tstuefe
Member

tstuefe commented Feb 5, 2025

> > @wenshao Thank you. This seems to be a GC problem. I adjusted the JBS issue accordingly. You set this to "24" as affected version, but if this is a mainline issue, please add 25 and if possible all other versions this occurs in. If possible, please attach an hs-err file or at least the crash stack.
>
> I added the hs-err file in the reply above. This is not a GC problem. The getChars method uses StringUTF16.putChar, which is equivalent to Unsafe.putChar and performs no bounds check. Under concurrent use, out-of-bounds writes can occur and crash the JVM.

@wenshao I see. Yes, you are right. Interesting - I was not aware of JDK code using unsafe-like put calls internally.

@bridgekeeper

bridgekeeper bot commented Mar 5, 2025

@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@wenshao
Contributor Author

wenshao commented Mar 9, 2025

keep alive

@bridgekeeper

bridgekeeper bot commented Apr 6, 2025

@wenshao This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper

bridgekeeper bot commented May 4, 2025

@wenshao This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this May 4, 2025

Labels

  • core-libs
  • rfr (Pull request is ready for review)

6 participants