Skip to content

Conversation

@sunny868
Copy link
Contributor

@sunny868 sunny868 commented Aug 19, 2022

make images run-test TEST=tools/javac/lambda/T8031967.java will fails with StackOverflowError when I use args JTREG="VM_OPTIONS=-XX:TieredStopAtLevel=3" on aarch64 and LoongArch. So the stack size -Xss10m needs to be increased to improve the robustness of the testcase.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8292647: javac/lambda/T8031967.java fails with StackOverflowError when use -XX:TieredStopAtLevel=3 on aarch64 and LoongArch

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/9934/head:pull/9934
$ git checkout pull/9934

Update a local copy of the PR:
$ git checkout pull/9934
$ git pull https://git.openjdk.org/jdk pull/9934/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 9934

View PR using the GUI difftool:
$ git pr show -t 9934

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/9934.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Aug 19, 2022

👋 Welcome back sunny868! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Aug 19, 2022

@sunny868 The following label will be automatically applied to this pull request:

  • compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

…n use -XX:TieredStopAtLevel=3 on aarch64 and LoongArch
@sunny868
Copy link
Contributor Author

RFR

@sunny868 sunny868 changed the title JDK-8292647:javac/lambda/T8031967.java fails with StackOverflowError when use -XX:TieredStopAtLevel=3 on aarch64 and LoongArch 8292647:javac/lambda/T8031967.java fails with StackOverflowError when use -XX:TieredStopAtLevel=3 on aarch64 and LoongArch Aug 20, 2022
@sunny868 sunny868 changed the title 8292647:javac/lambda/T8031967.java fails with StackOverflowError when use -XX:TieredStopAtLevel=3 on aarch64 and LoongArch 8292647: javac/lambda/T8031967.java fails with StackOverflowError when use -XX:TieredStopAtLevel=3 on aarch64 and LoongArch Aug 20, 2022
@openjdk openjdk bot added the rfr Pull request is ready for review label Aug 20, 2022
@mlbridge
Copy link

mlbridge bot commented Aug 20, 2022

Webrevs

@theRealAph
Copy link
Contributor

I'm not at all sure about this one.
It may be a bug. Before committing this patch we should find out why AArch64 and LoongArch are consuming more stack than others. 10M of stack is huge.

@sunny868
Copy link
Contributor Author

sunny868 commented Aug 22, 2022

I'm not at all sure about this one. It may be a bug. Before committing this patch we should find out why AArch64 and LoongArch are consuming more stack than others. 10M of stack is huge.

I do not have a particularly good idear to locate this problem find out why AArch64 and LoongArch are consuming more stack than others at present. From the test results, AArch64 require more stack space than X86_64. And on the same platform, tier3 -XX:TieredStopAtLevel=3 consumes more stack space than tier4-XX:TieredStopAtLevel=4.

On the x64 platform, tier4 needs at least set -Xss700k and tier3 needs at least set -Xss1m. On the aarch64 platform, tier4 needs at least set -Xss900k and tier3 needs at least set -Xss1228k.

For safety, I set to -Xss10m. Maybe it's more reasonable to change it to -Xss2m?

@nick-arm
Copy link
Contributor

Have you tried with -XX:+AdjustStackSizeForTLS to see how that affects it? Glibc will subtract an unknown amount of memory from your thread's stack to use for static TLS so the usable stack size will actually be slightly less than the -Xss value, and that amount varies between platforms. See here https://bugs.openjdk.org/browse/JDK-8227417 and here https://sourceware.org/bugzilla/show_bug.cgi?id=11787

@theRealAph
Copy link
Contributor

For safety, I set to -Xss10m. Maybe it's more reasonable to change it to -Xss2m?

I don't think so. If we were making automobiles and our model consumed 40% more fuel than our competitor's, our response would not be to solve the problem by fitting a larger fuel tank. The stack layout of AArch64 is, as close as we can make it, identical to that of x86-64. So, this difference in memory use is not expected and should be investigated.

@sunny868
Copy link
Contributor Author

Have you tried with -XX:+AdjustStackSizeForTLS to see how that affects it? Glibc will subtract an unknown amount of memory from your thread's stack to use for static TLS so the usable stack size will actually be slightly less than the -Xss value, and that amount varies between platforms. See here https://bugs.openjdk.org/browse/JDK-8227417 and here https://sourceware.org/bugzilla/show_bug.cgi?id=11787

Thanks. I had tried with -XX:+AdjustStackSizeForTLS, but test result is FAILED also. As theRealAph said,it may be necessary to find out why aarch64 needs more stack space than x64.

@sunny868
Copy link
Contributor Author

@theRealAph Do you have any good ways to observe the use of X86 and AArch64 stack space?

@theRealAph
Copy link
Contributor

@theRealAph Do you have any good ways to observe the use of X86 and AArch64 stack space?

If it were me:
I'd run the test case in GDB. At the point the StackOverflowException happened, I'd trigger an infinite loop, then dump the stack using the pfl() call. I'd also use GDB's bt to view information about the native stack. I'd do that for both x86 and AArch64, to see what's causing the difference, and to see how significant it is.

@bridgekeeper
Copy link

bridgekeeper bot commented Sep 20, 2022

@sunny868 This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 19, 2022

@sunny868 This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Oct 19, 2022
@sunny868
Copy link
Contributor Author

sunny868 commented Feb 8, 2023

I'm not at all sure about this one. It may be a bug. Before committing this patch we should find out why AArch64 and LoongArch are consuming more stack than others. 10M of stack is huge.

@theRealAph I tested and found that some large methods use more spill code on LoongArch64 architectures than x86_64 architectures, which results in larger stack space requests.


// src/hotspot/share/c1/c1_FrameMap.cpp

279 ByteSize FrameMap::sp_offset_for_monitor_base(const int index) const {          
280   int end_of_spills = align_up(first_available_sp_in_frame + _reserved_argument_area_size, (int)sizeof(double)) +
281     _num_spills * spill_slot_size_in_bytes;                                     
282   int offset = align_up(end_of_spills, HeapWordSize) + index * (int)sizeof(BasicObjectLock);
283   return in_ByteSize(offset);                                                   
284 }  

Here, var _num_spills is sometimes larger on loongarch64 than x86_64, then framesize is larger.
So I think using more stack space on loongarch64 is the best solution. I'm not sure if other architectures(s390/risc-v/ppc) have the same issue

@sunny868
Copy link
Contributor Author

sunny868 commented Feb 8, 2023

/open

@openjdk openjdk bot reopened this Feb 8, 2023
@openjdk
Copy link

openjdk bot commented Feb 8, 2023

@sunny868 This pull request is now open

…n use -XX:TieredStopAtLevel=3 on aarch64 and LoongArch
@openjdk
Copy link

openjdk bot commented Feb 8, 2023

@sunny868 Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@sunny868
Copy link
Contributor Author

sunny868 commented Feb 9, 2023

Please help review this patch again, thank you all.

@nick-arm
Copy link
Contributor

nick-arm commented Feb 9, 2023

How can I reproduce this? I tried make test TEST=tools/javac/lambda/T8031967.java JTREG="VM_OPTIONS=-XX:TieredStopAtLevel=3" several times on a fastdebug build but it didn't fail.

@sunny868
Copy link
Contributor Author

sunny868 commented Feb 10, 2023

I don't remember on which version I tested it for aarch64, I'll confirm again.

@lgxbslgx
Copy link
Member

Seems like an issue about the JIT compiler. So it is good to CC hotspot-compiler mailing list.

/cc hotspot-compiler

@openjdk
Copy link

openjdk bot commented Feb 10, 2023

@lgxbslgx
The hotspot-compiler label was successfully added.

@sunny868
Copy link
Contributor Author

Sorry, I can't confirm which version reproduces the issue on aarch64. But the conclusion that aarch64 requires more stack space than x86_64 holds. i.e. java.lang.Object::<init> has a 0x30 stack size on x86_64 platforms, but 0x40 stack size on aarch64.

For x86_64


----------------------------------- Assembly -----------------------------------
                                                                                
Compiled method (c1)      70    1       3       java.lang.Object:: (1 bytes)
...
[Verified Entry Point]                                                          
  0x00007fb8191cc740:   mov    %eax,-0x18000(%rsp)                              
  0x00007fb8191cc747:   push   %rbp                                             
  0x00007fb8191cc748:   sub    $0x30,%rsp    // stack size is 0x30

For aarch64


----------------------------------- Assembly -----------------------------------

Compiled method (c1)      64    1       3       java.lang.Object:: (1 bytes)
...
[Verified Entry Point]
  0x0000ffff9c22b980:   nop
  0x0000ffff9c22b984:   sub	x9, sp, #0x19, lsl #12
  0x0000ffff9c22b988:   str	xzr, [x9]
  0x0000ffff9c22b98c:   sub	sp, sp, #0x40          //stack size is 0x40, larger than x86_64
  0x0000ffff9c22b990:   stp	x29, x30, [sp, #48]

@sunny868
Copy link
Contributor Author

And I found the above difference comes from the different ways in which framesize is calculated.


// src/hotspot/cpu/aarch64/c1_LIRAssembler_aarch64.cpp

int LIR_Assembler::initial_frame_size_in_bytes() const {                        
  return in_bytes(frame_map()->framesize_in_bytes());                           
}

// src/hotspot/cpu/x86/c1_LIRAssembler_x86.cpp

int LIR_Assembler::initial_frame_size_in_bytes() const {                        
  return (frame_map()->framesize() - (2*VMRegImpl::slots_per_word))  * VMRegImpl::stack_slot_size;
}  

@nick-arm
Copy link
Contributor

I think the - (2*VMRegImpl::slots_per_word) in the x86 version is accounting for the stack space consumed by push %rbp and the call instruction which pushes the return address to the stack. So in both cases the stack frame is 0x40 bytes in total.

@sunny868
Copy link
Contributor Author

After #12548, this test can be passed.

@sunny868
Copy link
Contributor Author

After #12548, this test can be passed.

Thanks @theRealAph @nick-arm @lgxbslgx for review. As #12548 had merged, I close this PR.

@sunny868 sunny868 closed this Feb 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants