Design: thread shutdown handlers for proxied work #18631

@tlively

Problem: The current machinery for proxying work to threads does not allow for thread shutdown to be handled gracefully. If a message arrives after thread shutdown, there is no way to notify the sender because at that point we do not know who the sender was. Even if we did know the identity of the sender, there would be no good way to notify them of the shutdown. We could postMessage the sender, but they may be synchronously waiting for their original message to be processed, so they may never receive the cancellation message. We could not notify the sender with Atomics.notify either, because we would not know whether they are waiting for such a notification, nor what address and value to use for it if they were.

We currently handle this lack of shutdown notifications by calling it user error if a thread dies while work is being proxied to it. This is sufficient for systems like WasmFS that can guarantee that the proxy target will not unexpectedly die, for example because it never runs user code. This is insufficient, however, for systems that need to proxy work to arbitrary user threads, such as the dynamic loader.

Requirements: Since senders may be waiting for their messages to be processed in different ways, they need to be able to specify how they should be notified of thread shutdown. This implies that each message should provide arbitrary code to be run if the target thread starts shutting down before the message can be processed, which means that the C stack still needs to exist on the target thread when the messages' shutdown handlers are run. It should not be possible for new messages to arrive later in shutdown and remain unhandled.

Design: The following is a sketch of a design that adds graceful shutdown handling as a layer underneath the current proxying system.

  1. A pointer to an em_task_queue is added to struct pthread. This task queue is the pthread's "mail box." The mail box is "closed" if the pointer is null and "open" if it points to a valid em_task_queue.

  2. To send a message to a thread, atomically check that its mail box is open and enqueue a message on it. If the mail box is closed, the target thread is shutting down or has already shut down, so return an error immediately. If the enqueue succeeds, notify the thread with a postMessage via the messageRelay or with an Atomics.notify if the target thread is waiting with Atomics.waitAsync.

  3. A message consists of the usual function pointer and void* user data, plus an additional function pointer that will be called to handle a shutdown that begins before the message can be processed. The normal message function pointer will always be called directly from the JS event loop, but the shutdown handler may be called with arbitrary user code on the stack, so it should only perform trivial work like sending a postMessage, setting a flag, or notifying a waiter.

  4. On thread shutdown, the mail box is atomically closed. This ensures that no further messages will be enqueued. The thread then dequeues the messages already in its closed mail box and calls their shutdown handlers.

  5. (Optional) To prevent TOCTOU bugs where messages may be proxied to threads that have already finished shutting down, pthread structs could be placed into quarantine and reused after a quarantine size threshold has been reached rather than immediately freed. This would make it possible for the proxying code to find the closed mail box and return an error immediately without ever dereferencing unallocated memory.
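Steps 1 through 4 could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed implementation: `message`, `task_queue`, `thread_state`, `mailbox_send`, `mailbox_shutdown`, and `mark_cancelled` are all hypothetical names, the fixed-size ring buffer stands in for em_task_queue, a mutex stands in for whatever mechanism makes "check open and enqueue" atomic, and the notification step (postMessage or Atomics.notify) is omitted. The quarantine from step 5 is not modeled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical message: work function, shutdown handler, and user data.
typedef struct message {
  void (*run)(void* arg);          // called from the event loop in the normal case
  void (*on_shutdown)(void* arg);  // called if shutdown begins first
  void* arg;
} message;

#define MAILBOX_CAPACITY 16

// Hypothetical stand-in for em_task_queue: a fixed-size ring buffer.
typedef struct task_queue {
  message items[MAILBOX_CAPACITY];
  size_t head, count;
} task_queue;

// Hypothetical stand-in for the relevant fields of struct pthread.
typedef struct thread_state {
  pthread_mutex_t lock;  // makes "check open + enqueue" a single atomic step
  task_queue* mailbox;   // NULL means the mail box is closed
} thread_state;

// Returns true if the message was enqueued. Returns false if the mail box is
// closed (or, in this sketch only, full), so the sender gets an immediate error.
bool mailbox_send(thread_state* t, message msg) {
  bool ok = false;
  pthread_mutex_lock(&t->lock);
  task_queue* q = t->mailbox;
  if (q && q->count < MAILBOX_CAPACITY) {
    q->items[(q->head + q->count) % MAILBOX_CAPACITY] = msg;
    q->count++;
    ok = true;
  }
  pthread_mutex_unlock(&t->lock);
  // A real implementation would notify the target thread here.
  return ok;
}

// Called on the target thread at shutdown, while its C stack still exists.
void mailbox_shutdown(thread_state* t) {
  assert(t != NULL);
  pthread_mutex_lock(&t->lock);
  task_queue* q = t->mailbox;
  t->mailbox = NULL;  // closed: no further messages can be enqueued
  pthread_mutex_unlock(&t->lock);
  if (!q) return;
  // Drain the messages that arrived before closing and run their handlers.
  while (q->count > 0) {
    message m = q->items[q->head];
    q->head = (q->head + 1) % MAILBOX_CAPACITY;
    q->count--;
    if (m.on_shutdown) m.on_shutdown(m.arg);
  }
}

// Example of a trivial shutdown handler: set a flag the sender can check.
void mark_cancelled(void* arg) { *(bool*)arg = true; }
```

Because the mail box pointer is cleared under the same lock that guards enqueueing, every message is either rejected at send time or seen by the drain loop; none can arrive later in shutdown and remain unhandled.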

cc @sbc100 how does this sound? I plan to add new error handling and reporting to the proxying.h API building on top of the functionality described here because that sounds generally useful. Do you think you'd want to continue building on the proxying.h API for the dynamic loader or would you want to build directly on this lower-level API?
