Skip to content

Conversation

@sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Dec 14, 2022

Fixes: #18345

@sbc100 sbc100 changed the base branch from main to dynlink_logging December 14, 2022 23:58
@sbc100 sbc100 force-pushed the dlopen_sync branch 2 times, most recently from b3ad242 to 9dc6f07 Compare December 15, 2022 00:09
Base automatically changed from dynlink_logging to main December 15, 2022 00:36
@sbc100 sbc100 requested a review from tlively December 15, 2022 00:46
@sbc100 sbc100 requested review from dschuff and kripken December 15, 2022 00:50
@sbc100 sbc100 force-pushed the dlopen_sync branch 3 times, most recently from 10a3684 to e679d51 Compare December 15, 2022 01:02
@dschuff
Copy link
Member

dschuff commented Dec 15, 2022

I wonder if there is a way to test the case where multiple dlopens come from different threads before other threads are synced up. That could also happen coming from the same thread, when using the async dlopen.

@sbc100
Copy link
Collaborator Author

sbc100 commented Dec 15, 2022

I wonder if there is a way to test the case where multiple dlopens come from different threads before other threads are synced up. That could also happen coming from the same thread, when using the async dlopen.

Good idea. it might be hard to setup such a test but certainly worth having. Would be good to prove that such cases don't deadlock.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a high level, why do we need different syncing code for the main thread vs others? (A dlopen can appear on any thread, so I'd think the situation is symmetric?)

@sbc100 sbc100 force-pushed the dlopen_sync branch 2 times, most recently from 4215213 to 7ed9187 Compare December 15, 2022 20:55
@sbc100 sbc100 force-pushed the dlopen_sync branch 2 times, most recently from 0889437 to 82f3eb6 Compare December 16, 2022 17:28
sbc100 added a commit that referenced this pull request Jan 26, 2023
Prior to this change we were tracking the sequence of `dlopen` events
and replaying them on each thread during `dlsync`.

Now we also track `dlsym` events.  A new data structure is used here to
abstract over these two events (`dlevent`).  During dlsync we then reply
both the `dlopen` and the `dlsym` events.  For `dlsym` events we
serialized them as "dso+index", i.e. which dso did the symbol come from
and what is the index of the symbol within that dso's symbol table.

This is important for #18376.
sbc100 added a commit that referenced this pull request Jan 26, 2023
…18590)

Prior to this change we were tracking the sequence of `dlopen` events
and replaying them on each thread during `dlsync`.

Now we also track `dlsym` events.  A new data structure is used here to
abstract over these two events (`dlevent`).  During dlsync we then reply
both the `dlopen` and the `dlsym` events.  For `dlsym` events we
serialized them as "dso+index", i.e. which dso did the symbol come from
and what is the index of the symbol within that dso's symbol table.

This is important for #18376.
@sbc100 sbc100 force-pushed the dlopen_sync branch 2 times, most recently from 2b9b728 to 944c432 Compare January 28, 2023 05:35
@sbc100
Copy link
Collaborator Author

sbc100 commented Jan 28, 2023

I wonder if there is a way to test the case where multiple dlopens come from different threads before other threads are synced up. That could also happen coming from the same thread, when using the async dlopen.

Since we have single write lock that is held my dlopen, we don't allow any other thread to begin a dlopen of dlsym, whicle there is one outstange. Even async dlopen will still block until is has the write lock. We could try to relax that restriction once we take a another look at the async dlopen interface.

@sbc100 sbc100 force-pushed the dlopen_sync branch 4 times, most recently from f4f10ee to 5d93cc2 Compare January 31, 2023 08:26
@sbc100
Copy link
Collaborator Author

sbc100 commented Feb 3, 2023

I've updated this PR with some extra documentation in Dynamic-Linking.rst.

There are two specific issues that came up when discussed possible deadlock risks.

  1. Possible deadlock due to threads calling atomics.wait directly rather than using our futex wrapper. For this case I've simply updated the docs to describe this caveat, and how to deal with it. We can look into some kind of auto-detection or re-writing solution as a followup and before we move out of the experimental.

  2. Possible deadlock due to malloc taking a inner lock which might block and trigger a dlsync, which itself might trigger a TLS allocation and recurse into malloc. I've convinced myself that, at least to dlmalloc and emmalloc this won't be an issue. The deadlock that @tlively found was due to the call to malloc that removed in dylink: Remove dynamic linking startup code and allocation #18643. The only remaining allocation that can happen during dlsync is the TLS allocation here:

    tls_block = emscripten_builtin_memalign(__builtin_wasm_tls_align(), tls_size);
    . Note that this uses emscripten_builtin_malloc which bypasses things like asan and lsan. The deadlock that thomas found was becuase lsan was taking a lock internally. The non-lsan version of malloc that we have contains only a single outer lock. This means that once that outer malloc lock is taken we don't do any blocking operations until malloc is done. We could looking into trying to detect these possible deadlocks, but I think we can leave that to as followup, especially since this deadlock issue is not introduced by this PR, since Thomas noticed it already in the tree.

Copy link
Member

@tlively tlively left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proofreading docs

Changes to the table are protected by a mutex, and before any thread returns
from ``dlopen`` or ``dlsym`` it will wait until all other threads are sync. In
order to make this synchronization as seamless as possible we hook into the
low level primitives of `emscripten_futex_wait` and `emscirpten_yeild`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
low level primitives of `emscripten_futex_wait` and `emscirpten_yeild`.
low level primitives of `emscripten_futex_wait` and `emscripten_yield`.

While load-time dynamic linking works without any complications, runtime dynamic
linking via ``dlopen``/``dlsym`` can require some extra consideration. The
reason for this is that keeping the indirection function pointer table in sync
between threads has to be done my emscripten library code. Each time a new
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
between threads has to be done my emscripten library code. Each time a new
between threads has to be done by emscripten library code. Each time a new

linking via ``dlopen``/``dlsym`` can require some extra consideration. The
reason for this is that keeping the indirection function pointer table in sync
between threads has to be done my emscripten library code. Each time a new
library is loaded of a new symbol is requested via ``dlsym``, table slots can be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
library is loaded of a new symbol is requested via ``dlsym``, table slots can be
library is loaded or a new symbol is requested via ``dlsym``, table slots can be


Changes to the table are protected by a mutex, and before any thread returns
from ``dlopen`` or ``dlsym`` it will wait until all other threads are sync. In
order to make this synchronization as seamless as possible we hook into the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
order to make this synchronization as seamless as possible we hook into the
order to make this synchronization as seamless as possible, we hook into the

@sbc100 sbc100 merged commit 44b2c2a into main Feb 4, 2023
@sbc100 sbc100 deleted the dlopen_sync branch February 4, 2023 01:32
sbc100 added a commit that referenced this pull request Jan 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

In pthreads builds dlopen should not return until all threads have loaded the library

6 participants