-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8351091: Shenandoah: global marking context completeness is not accurately maintained #23886
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…neration when use complete marking context
|
👋 Welcome back xpeng! A progress list of the required criteria for merging this PR into |
|
@pengxiaolong This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 151 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@earthling-amzn, @ysramakrishna) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
|
@pengxiaolong The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command. |
…turn false if old gen marking is not complete
src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.cpp
Outdated
Show resolved
Hide resolved
Webrevs
|
ysramakrishna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a few questions and comments inline. I'll take a closer look once you have responded to those.
Thank you for finding this probably long-standing incorrectness/fuzziness and fixing it properly!
| virtual void set_mark_complete(); | ||
| virtual void set_mark_incomplete(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these declared virtual?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see that ShenandoahGlobalGeneration forces the state of ShenandoahOdGeneration and ShenandoahYoungGeneration, but is that our intention? I am seeing (see comment elsewhere) that we are always either using global generation's marking context explicitly, or using a region to index into the appropriate containing generation's marking context. If so, can we dispense with the forcing of global context's state into the contexts for the two generations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If so, can we dispense with the forcing of global context's state into the contexts for the two generations?
I think we can do that if we deprecated the classical mode and only support generational Shenandoah, in classical mode, there is only global generation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I follow. In the legacy (non-generational mode) we shouldn't care about the marking state of the old and young generations, just that of the GlobalGeneration. In the generational case, we explicitly track the marking state of the old and young generations explicitly. It sounds to me as if forcing the Old and Young marking states to the state of that of the GlobalGeneration must be exactly for the case where we are using Generational Shenandoah, and we are doing a Global collection? Indeed:
void ShenandoahGlobalGeneration::set_mark_complete() {
ShenandoahGeneration::set_mark_complete();
if (ShenandoahHeap::heap()->mode()->is_generational()) {
ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap();
heap->young_generation()->set_mark_complete();
heap->old_generation()->set_mark_complete();
}
}
I am saying that each of Old, Young, and Global generations maintain their own mark completion state and use that to determine what they pass back in response to complete_marking_context(). This completely localizes all state rather than unnecessarily and confusingly coupling the states of these generations.
So, you remove the part in the if branch in the code above, which reduces to the default (or rather only) implementation in the base class, not requiring the over-ride of the Global generation's method for the generational case.
void ShenandoahGeneration::set_mark_complete() {
_is_marking_complete.set();
}
It is possible that I am still missing the actual structure here that requires the override for GlobalGeneration for the generational case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I misunderstood your original proposal, I thought you meant to suggest to remove the flag from ShenandoahGlobalGeneration, instead the set_mark_complete/is_mark_complete will more like view/delegation layer like:
void ShenandoahGlobalGeneration::set_mark_complete() {
ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap();
heap->young_generation()->set_mark_complete();
heap->old_generation()->set_mark_complete();
}
bool ShenandoahGlobalGeneration::is_mark_complete() {
ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap();
return heap->young_generation()->is_mark_complete() && heap->old_generation()->is_mark_complete();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You proposal will make the impl of the set_mark_complete/is_mark_complete of ShenandoahGeneration cleaner, but the thing is it will change current design and behavior, we may have to update the code where there methods is called, e.g. when we call set_mark_complete of gc_generation/active_generation, if it is global generation, we may have to explicitly call the same methods of ShenandoahYoungGeneration and ShenandoahOldGeneration to fan out the status.
How about I follow up it in a separate task and update the implementation if necessary? I want to limit the changes involved in this PR, and only fix the bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The young and old generations are only instantiated in the generational mode, so using them without checking the mode will result in SEGV in non-generational modes.
Global collections have a lot of overlap with old collections. I think what Ramki is saying, is that if we change all the code that makes assertions about the completion status of young/old marking to use the active_generation field instead, then we wouldn't need to update the completion status of young/old during a global collection. The difficulty here is that we need assurances that the old generation mark bitmap is valid in collections subsequent to a global collection. So, I don't think we can rely on completion status of active_generation when it was global, in following collections where it may now refer to young or old.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Yes, that makes sense to me, thanks William.
It would then be the case for the global generation that if is_mark_complete() then in the generational case that's also the case for both of its constituent generations. May be we can assert that when we fetch that at line 204 (and find it's true) for the case where we make the method is_mark_complete() also virtual :-) ?
May be I am being paranoid, but the assert would make me feel confident that the state maintenance isn't going awry/askew.
|
|
||
| #ifdef ASSERT | ||
| ShenandoahMarkingContext* const ctx = _heap->complete_marking_context(); | ||
| ShenandoahMarkingContext* const ctx = _heap->marking_context(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not this instead?
ShenandoahMarkingContext* const ctx = _heap->marking_context(r);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically there is only one global marking context for Shenandoah, even in generational mode, passing the region to marking_context doesn't make any difference.
But in the method complete_marking_context(r), it checks if the affiliated generation has complete marking, it is a more convenient version of complete_marking_context(affiliation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, yes, that makes sense. Why not then use both ShenandoahHeap::[complete_]marking_context() as synonyms for ShehandoahHeap::active_generation()->[complete_]marking_context(). See other related comments in this review round.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel using henandoahHeap::complete_marking_context() as synonyms for ShehandoahHeap::active_generation()->[complete_]marking_context() may cause more confusion, just read from the name it seems that it indicates the marking is complete for the whole heap, not just the active generation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, that makes sense.
src/hotspot/share/gc/shenandoah/shenandoahReferenceProcessor.cpp
Outdated
Show resolved
Hide resolved
…erence processing
Thanks, I'll update PR to address your comments. |
earthling-amzn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we always get the complete marking context directly through the generation, we can delete ShenandoahHeap::complete_marking_context.
src/hotspot/share/gc/shenandoah/shenandoahGenerationalEvacuationTask.cpp
Outdated
Show resolved
Hide resolved
src/hotspot/share/gc/shenandoah/heuristics/shenandoahHeuristics.cpp
Outdated
Show resolved
Hide resolved
This reverts commit 2004973.
True, we don't really need it anymore, I'll update the PR and remove it. |
earthling-amzn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for cleaning this up.
ysramakrishna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more comments, mostly pertaining to global gen's "complete" marking context semantics and usage, as well as SH::[*_]marking_context delegating to its active_generation()'s method.
This should be my last round of comments. Thank you for your patience...
| if (_heap->gc_generation()->is_global()) { | ||
| return _heap->marking_context(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand the point of this change in behavior. What purpose does a partial marking context serve? Why not just leave the behavior as was before and return a non-null marking context only when marking is complete and null otherwise. When the client uses the context, it does so to skip over unmarked objects (which are dead if marking is complete), which might end up being too weak if we are still in the midst of marking.
I realize that you may not be maintaining a global mark completion so you are returning the marking context irrespective of the state of completion of the marking, but I wonder if that is really the bahavior you want. I would rather, as necessary, we maintain a flag for completion of global marking for the case where we are doing a global gc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is confusing that it looks like a behavior change, but actually there is no behavior change in this method, all the change here is to make the behavior of this method to be exactly same a before.
The old impl always return the the marking context, regardless the completeness status of marking, because the old _heap->complete_marking_context() always return w/o assert error due to inaccurate completeness marking status in the marking context, we are fixing the issue in this PR which breaks the old impl of this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method get_marking_context_for_old is called at line 1363 in method verify_rem_set_before_mark, as the name indicates it could be called before mark.
If current gc generation is global gc, the old marking flag should have set to false when the global flag is set, it is a bit wired I'm not sure if we should change it now, but I think we will have to correct/update the impl of this method later when update the design of completeness flags of global/young/old generations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's my thinking:
The clients of this method do not want to use an incomplete marking context. We either want to look at all the objects (when marking information is incomplete) or we want complete marking context in which case we will skip over dead objects. Hence my reservation about this change.
| // contained herein. | ||
| void ShenandoahGenerationalEvacuationTask::promote_in_place(ShenandoahHeapRegion* region) { | ||
| ShenandoahMarkingContext* const marking_context = _heap->complete_marking_context(); | ||
| ShenandoahMarkingContext* const marking_context = _heap->young_generation()->complete_marking_context(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For clarity, you might assert the following before line 172:
assert(gc_generation() == _heap->young_generation(), "Sanity check");
Even though it might seem somewhat tautological.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: Does Shenandoah promote region in global cycles? the gc_generation might be global if so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I don't see any reason promotions should be verboten in global cycles. cc @earthling-amzn ?
If that is indeed the case, a clean separation and maintenance of completeness of marking for global generation, and use of _heap->gc_generation() would make sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the confirmation, I added assert as below since it gc_generation could be global :
assert(!_heap->gc_generation()->is_old(), "Sanity check");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assert may be fine, but the treatment of completeness of the marking context seems very brittle to me and apt to cause problems in the future. I would prefer a cleaner separation of these. May be we can sync up separately to discuss this along with @earthling-amzn .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, regions may be promoted during a global cycle. Completing the mark for a global cycle also completes the mark for the young and old generations.
| virtual void set_mark_complete(); | ||
| virtual void set_mark_incomplete(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I follow. In the legacy (non-generational mode) we shouldn't care about the marking state of the old and young generations, just that of the GlobalGeneration. In the generational case, we explicitly track the marking state of the old and young generations explicitly. It sounds to me as if forcing the Old and Young marking states to the state of that of the GlobalGeneration must be exactly for the case where we are using Generational Shenandoah, and we are doing a Global collection? Indeed:
void ShenandoahGlobalGeneration::set_mark_complete() {
ShenandoahGeneration::set_mark_complete();
if (ShenandoahHeap::heap()->mode()->is_generational()) {
ShenandoahGenerationalHeap* heap = ShenandoahGenerationalHeap::heap();
heap->young_generation()->set_mark_complete();
heap->old_generation()->set_mark_complete();
}
}
I am saying that each of Old, Young, and Global generations maintain their own mark completion state and use that to determine what they pass back in response to complete_marking_context(). This completely localizes all state rather than unnecessarily and confusingly coupling the states of these generations.
So, you remove the part in the if branch in the code above, which reduces to the default (or rather only) implementation in the base class, not requiring the over-ride of the Global generation's method for the generational case.
void ShenandoahGeneration::set_mark_complete() {
_is_marking_complete.set();
}
It is possible that I am still missing the actual structure here that requires the override for GlobalGeneration for the generational case.
| #ifdef ASSERT | ||
| bool reg_live = region->has_live(); | ||
| bool bm_live = ctx->is_marked(cast_to_oop(region->bottom())); | ||
| bool bm_live = heap->active_generation()->complete_marking_context()->is_marked(cast_to_oop(region->bottom())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apropos of another comment, if we really want to keep a delegating method in ShenandoahHeap, why not use heap->complete_marking_context() as a synonym for heap->active_generation()->complete_marking_context() ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me.
|
|
||
| #ifdef ASSERT | ||
| ShenandoahMarkingContext* const ctx = _heap->complete_marking_context(); | ||
| ShenandoahMarkingContext* const ctx = _heap->marking_context(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, yes, that makes sense. Why not then use both ShenandoahHeap::[complete_]marking_context() as synonyms for ShehandoahHeap::active_generation()->[complete_]marking_context(). See other related comments in this review round.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See previous suggestion/comments on SH::[complete_]marking_context() as delegating to that method of its gc_generation().
What you have here sounds fine too, but a uniform usage of either keeping SH::[complete_]marking_context() or not at all makes more sense to me, and seems cleaner to me.
Thanks for much for the reviews, I'll probably not add method like |
ysramakrishna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the files that changed since the last review only, but can look over all of it once again if necessary (just let me know).
This looks good; just a few small comments, and in particular a somewhat formalistic and pedantic distinction between the use of gc_generation() and active_generation() to fetch the marking context (and the use of global_generation()).
Otherwise looks good to me.
| ShenandoahHeapRegion* region = _regions.next(); | ||
| ShenandoahHeap* heap = ShenandoahHeap::heap(); | ||
| ShenandoahMarkingContext* const ctx = heap->complete_marking_context(); | ||
| ShenandoahMarkingContext* const ctx = heap->global_generation()->complete_marking_context(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as at line 778.
ysramakrishna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for your patience!
🚢
|
thanks all for the reviews and suggestions! /integrate |
|
@pengxiaolong |
|
/sponsor |
|
Going to push as commit aec1fe0.
Your commit was automatically rebased without conflicts. |
|
@ysramakrishna @pengxiaolong Pushed as commit aec1fe0. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
With the JEP 404: Generational Shenandoah implementation, there are generation specific marking completeness flags introduced, and the global marking context completeness flag is not updated at all after initialization, hence the global marking context completeness is not accurate anymore. This may cause expected behavior: ShenandoahHeap::complete_marking_context() should throw assert error if the global marking context completeness flag is false, but now it always return the marking context even it marking is not complete, this may hide bugs where we expect the global/generational marking to be completed.
This change PR fix the bug in global marking context completeness flag, and update all the places using
ShenandoahHeap::complete_marking_context()to use proper API.Test
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23886/head:pull/23886$ git checkout pull/23886Update a local copy of the PR:
$ git checkout pull/23886$ git pull https://git.openjdk.org/jdk.git pull/23886/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 23886View PR using the GUI difftool:
$ git pr show -t 23886Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23886.diff
Using Webrev
Link to Webrev Comment