-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8364766: C2: Improve Value() of DivI and DivL for non-constant inputs #26143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…er_interger_div_type # Conflicts: # src/hotspot/share/opto/divnode.cpp
…nterger_div_type
|
👋 Welcome back ichttt! A progress list of the required criteria for merging this PR into |
|
❗ This change is not yet ready to be integrated. |
|
@ichttt This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for contributing this enhancement, @ichttt! Your code is very well commented and thus easy to follow, and you tested your changes thoroughly. Nice work!
I mainly nitpicked on a few details below. But I have one question: You went through great pains to only calculate the necessary "corners". Would it not be much easier to calculate all four possible corners and let the min and max functions deal with the duplicates in case the i1 or i2 range is a singleton? The result should be the same (if I did not forget about a corner case) and it would be easier to follow.
What kind of testing did you run on your side? I kicked off a tier1 through 5 and will keep you posted on the results.
You still have a title mismatch between the issue and the PR (the PR is missing "C2:").
test/hotspot/jtreg/compiler/c2/irTests/IntegerDivValueTests.java
Outdated
Show resolved
Hide resolved
|
Testing is all green. |
|
Thanks for the fast review! The main reason for all the if cases is that min_int / (-1) is undefined behavior in C++, as it overflows. All code has to be careful that this special case can't happen in C++ code, and that's the main motivation behind all the ifs. I've added a comment that describes that. Regarding testing: I only ran tier1 tests on my machine and GHA |
Then I would suggest restructuring the code to express that intent. Reading it currently, gives me the impression that you care about handling the four corners carefully when you really only care about the UB case. The following pseudocode uses ifs to only avoid the corner case. It's only slightly different from your version, but I find it a bit clearer to guess its intent. What do you think? |
benoitmaillard
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this change, I think this an important optimization opportunity that was previously missing. The code is very clear. I only have one nit.
eme64
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a nice idea, thanks for the work!
I started reading through a part of it and have a few questions / suggestions.
A more general suggestion (could also be a future RFE):
You could also use the unsigned bounds here. And you could also address UDIvI/L.
Now that we have unsigned bounds we could use them: #17508
You can read up in type.hpp, that an integer type has one or two simple intervals.
708 * 4. Either _lo == jint(_ulo) and _hi == jint(_uhi), or each element of a
709 * TypeInt lies in either interval [_lo, jint(_uhi)] or [jint(_ulo), _hi]
710 * (note that these intervals are disjoint in this case).
I think with this, you could do an even more powerful optimization.
Also: where ever I see these optimizations that work on ranges, and not just constants: we should do some more rigorous testing on those resulting ranges.
I see that you already have some concrete examples. It would be good to extend those with a completely randomized version. See for inspiration:
src/hotspot/share/opto/divnode.cpp
Outdated
| template<typename IntegerType> | ||
| static const IntegerType* compute_generic_div_type(const IntegerType* i1, const IntegerType* i2, int widen) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the generic in the name? The template already suggests that it can be used for different types, right?
Also: I'm wondering if we can somehow extend this for UDivI and UDIvL.
I suppose you would have to use the _ulo and _uhi instead of _lo and _hi.
I'm not saying this all has to be done in this PR, but we could at least anticipate the extension to unsigned division.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've adjusted it to compute_signed_div_type.
I would not worry about unsigned division now and would leave this for a later RFE. Would you perhaps create a issue for that? I have no permission for the JBS
test/hotspot/jtreg/compiler/c2/irTests/IntegerDivValueTests.java
Outdated
Show resolved
Hide resolved
test/hotspot/jtreg/compiler/c2/irTests/IntegerDivValueTests.java
Outdated
Show resolved
Hide resolved
|
@ichttt This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a |
|
@ichttt, are you still working on this? 🙂 |
|
Thanks for all your reviews! |
SirYwell
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've got a few comments
src/hotspot/share/opto/divnode.cpp
Outdated
| // Either input is BOTTOM ==> the result is the local BOTTOM | ||
| const Type *bot = bottom_type(); | ||
| if( (t1 == bot) || (t2 == bot) || | ||
| (t1 == Type::BOTTOM) || (t2 == Type::BOTTOM) ) | ||
| return bot; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be removed - and in cases where one side is the local bottom (i.e., TypeInt::INT) and the other is more restricted, the result should even more precise after removing. Could you also add tests for such cases? For example dividing TypeInt::INT by some interval with a lower bound of 2, the resulting range can be narrowed. Similarly, dividing some small interval [lo, hi] by TypeInt::INT should result in a similar interval with bounds adjusted to deal with sign changes. If I didn't miss something, your code should already be able to deal with this, it's just this early return here preventing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are correct. The only part where I am not sure is if every instance where i1/i2 can be Type::BOTTOM, i1/i2 can be cast to TypeInt.
Can someone please confirm the removal is safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type::BOTTOM should not happen, since we are in a method DivI/L and thus know that the inputs should be some particular type. I think removing this check is fine, since you are doing t1->is_int() below, which would assert for t1 == Type::BOTTOM.
…ed DivI Nodes that did not constant fold
mhaessig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for addressing all the comments and no worries about the delay.
This looks good to me now. I just kicked off testing and will report back with the results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, testing revealed a problem in one of our internal tests, that I am unable to share. The Test is run with -Xomp -XX:-TieredCompilation and the error is
STDOUT:
485 ConvI2L === _ 486 [[ 483 ]] #long:minint..maxint, 0u..maxulong, widen: 3 !orig=153
152 ConL === 0 [[ 169 164 483 568 ]] #long:minlong
489 IfTrue === 488 [[ 483 490 ]] #1 !orig=161
483 DivL === 489 152 485 [[ 482 504 ]] !orig=169
told = long:maxint..2305843009213693952, widen: 3
tnew = long:minlong..9007199254740992, 0u..maxulong, widen: 3
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/src/jdk-pr-26143/open/src/hotspot/share/opto/phaseX.cpp:2763), pid=108119, tid=108147
# fatal error: Not monotonic
#
# JRE version: Java(TM) SE Runtime Environment (26.0) (fastdebug build 26-internal-mhassig.open)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 26-internal.open, compiled mode, sharing, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x185c479] PhaseCCP::verify_type(Node*, Type const*, Type const*)+0x169
Stack: [0x00007f7e669eb000,0x00007f7e66aeb000], sp=0x00007f7e66ae5f10, free space=1003k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x185c479] PhaseCCP::verify_type(Node*, Type const*, Type const*)+0x169 (phaseX.cpp:2763)
V [libjvm.so+0x1864f0d] PhaseCCP::analyze()+0x3bd (phaseX.cpp:2806)
V [libjvm.so+0xb86554] Compile::Optimize()+0x964 (compile.cpp:2489)
V [libjvm.so+0xb89a63] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x2023 (compile.cpp:860)
V [libjvm.so+0x9a3a33] C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x4a3 (c2compiler.cpp:147)
V [libjvm.so+0xb98f70] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x780 (compileBroker.cpp:2345)
V [libjvm.so+0xb9a7d0] CompileBroker::compiler_thread_loop()+0x530 (compileBroker.cpp:1989)
V [libjvm.so+0x10f2c4b] JavaThread::thread_main_inner()+0x13b (javaThread.cpp:771)
V [libjvm.so+0x1b63ce6] Thread::call_run()+0xb6 (thread.cpp:243)
V [libjvm.so+0x17d1188] thread_native_entry(Thread*)+0x128 (os_linux.cpp:883)
…nterger_div_type
|
@mhaessig I've added some new asserts to try and detect where it went wrong and merged the latest upstream. Can you run the failing test again please? |
|
Thank you @ichttt, I will kick off a run. I have been trying to extract a reproducer for you, but was not able to reproduce the failure again. Perhaps the asserts will help. |
mhaessig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing tier1 up to tier6 passed. If you move the test out of the irTest directory, this will be good to go.
mhaessig
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for all your work, @ichttt. Looks good to me now.
eme64
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work @ichttt ! I like all your comments, and thanks for all the test cases, including the randomized ones!
I just have a few minor suggestions :)
| @IR(failOn = {IRNode.DIV_I, IRNode.DIV_L, IRNode.URSHIFT_I, IRNode.URSHIFT_L, IRNode.RSHIFT_I, IRNode.RSHIFT_L, IRNode.MUL_I, IRNode.MUL_L, IRNode.ADD_I, IRNode.ADD_L, IRNode.SUB_I, IRNode.SUB_L, IRNode.AND_I, IRNode.AND_L}) | ||
| public int testIntConstantFoldingSpecialCase() { | ||
| // All constants available during parsing | ||
| return getIntConstant(Integer.MIN_VALUE) / getIntConstant(-1); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not add an IR rule that the div is still present after parsing? It seems you have already had the possible issue that javac optimized the div away, right? So this would ensure the optimization really does happen in C2, and that you are checking for the right kinds of nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider doing the same in other places in this file ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it is already optimized during local GVN during parsing, so after parsing, these nodes are already gone
| //@IR(failOn = {IRNode.DIV_I, IRNode.DIV_L, IRNode.URSHIFT_I, IRNode.URSHIFT_L, IRNode.RSHIFT_I, IRNode.RSHIFT_L, IRNode.MUL_I, IRNode.MUL_L, IRNode.ADD_I, IRNode.ADD_L, IRNode.SUB_I, IRNode.SUB_L, IRNode.AND_I, IRNode.AND_L}) | ||
| // This results in a series of nodes due to DivLNode::Ideal and in particular transform_long_divide, which operates on non-constant divisors. | ||
| // transform_long_divide splits up the division into multiple other nodes, such as MulHiLNode, which does not have a good Value() implemantion. | ||
| // When JDK-8366815 is fixed, these rules should be reenabled | ||
| // Alternatively, a better MulHiLNode::Value() implemantion should also lead to constant folding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you have some temporary IR rule that now passes, but fails once JDK-8366815 is fixed? Otherwise, I'm afraid we will miss these comments here, and they will never be cleaned up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same elsewhere in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it is not really possible for this method. Depending on the random constants that have been chosen, a different sequence of nodes will be emitted by transform_long_divide, and sometimes a MulHiLNode will not be emitted and the expression may constant fold after all.
So I fear we just have to remember this, but as #27886 looks pretty close to being done as well, I guess this will not be too hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about it some more, we can add IR validation to testLongRange, though. This one is not dependent on random constants, so it should always work. That way, we should notice the other rule as well

This PR improves the value of interger division nodes.
Currently, we only emit a good type if either input is constant. But we can also cover the generic case. It does that by finding the four corners of the division. This is guranteed to find the extrema that we can use for min/max. Some special logic is required for MIN_INT / -1, though, as this is a special case
We also need some special logic to handle ranges that cross zero, but in this case, we just need to check for the negative and positive range once.
This also cleans up and unifies the code paths for DivINode and DivLNode.
I've added some tests to validate the optimization. Without the changes, some of these tests fail.
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/26143/head:pull/26143$ git checkout pull/26143Update a local copy of the PR:
$ git checkout pull/26143$ git pull https://git.openjdk.org/jdk.git pull/26143/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 26143View PR using the GUI difftool:
$ git pr show -t 26143Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/26143.diff
Using Webrev
Link to Webrev Comment