Skip to content

Conversation

@wks
Copy link
Contributor

@wks wks commented Sep 18, 2021

This PR fixes the race bug during stop-the-world.

This PR has two associated PRs:

In short, the bug exists because the MMTk coordinator calls SafepointSynchronnize::begin() directly. This races with the real VM thread.

Fixing this involves introducing a dedicated "companion thread". This thread receives stop-the-world requests from the MMTk core. It will let the VM thread execute a VM Operation which does nothing but waiting for another start-the-world request from the MMTk. In this way, mmtk-core can request stop-the-world, then continue to do GC, then request start-the-world.

Concretely, the changes include:

  • openjdk: The SafepointSynchronize::begin() method no longer reports which threads have stopped, because this method is used in all VMOperations that needs STW, not just GC.
  • openjdk: Added a new VMOperation identifier VMOp_ThirdPartyHeapOperation.
  • mmtk-openjdk: Introduce a MMTkVMCompanionThread. This is a new non-Java thread. The only thing it does, upon request from MMTk core, is to create a VM_MMTkSTWOperation and call VMThread::execute to execute it.
    The VM Thread then stops the world, executes the VM_MMTkSTWOperation. It finishes when the mmtk-core requests start-the-world. Then the VM Thread will start the world and unblock the companion thread
    The reason why we introduce a companion thread is that VMThread::execute will block the caller until the VM thread finished executing the VM operation. There is an alternative. We could use "asynchronous" (non-blocking) VM operations, but this feature is removed in a later OpenJDK version.
  • mmtk-core: Because now the only thread that calls SafepointSynchronize::begin() and SafepointSynchronize::end() is the real VM thread, we no longer need to let the coordinator request stop/start-the-world.

There could be some performance impact because of the companion thread we introduced, so performance tests are needed. Some preliminary tests show no obvious performance regression.

Merging strategy

All three repositories should be updated at the same time.

If this is impossible, we should update mmtk-core and openjdk first. Then apply code changes to mmtk-openjdk and, at the same time, update its dependencies so that it depends on the updated mmtk-core and openjdk.

@wks wks force-pushed the fix-safepoint-sync2 branch from 1224056 to e618940 Compare September 22, 2021 13:05
@wks wks marked this pull request as ready for review September 22, 2021 13:06
Copy link
Member

@qinsoon qinsoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Previously the MMTk coordinator thread pretents to be the VM thread and
calls SafepointSynchronize::begin() directly.  This makes it race with
the real VM thread.

This commit introduces a dedicated "VM companion thread". The companion
therad requests the VM thread to execute a stop-the-world VM operation
(a VMOp_ThirdPartyHeapOperation), and the VM operation waits for the
stop-the-world GC to finish and lets the VM thread start the world
again.  This ensures the real VM thread is the only thread that calls
SafepointSynchronize::begin().  The MMTk binding initiates stack
scanning during stop-the-world.

Also made minor fixes to MMTKCollectorThread and MMTKContextThread

-   Removed is_VM_thread and is_GC_thread methods.
    -   It is not the VM thread.
    -   The `is_GC_thread` method is never used in the real VM thread,
        and is removed in the upstream in a later revision.
-   Initialize their native thread names on start.
    -   The native names are visible in GDB when debugging.
    -   Prepended  "MMTk" to their names to make them more obvious.
@wks wks force-pushed the fix-safepoint-sync2 branch from e618940 to 7b4d068 Compare September 24, 2021 05:24
@wks
Copy link
Contributor Author

wks commented Sep 24, 2021

I made some updates. Added comments, removed stale code, and added "MMTk" before the names of the context and the collector threads.

@qinsoon qinsoon merged commit f92358d into mmtk:master Sep 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants