Block in dlopen until all threads have loaded the module #18376
I wonder if there is a way to test the case where multiple dlopens come from different threads before other threads are synced up. That could also happen coming from the same thread, when using the async dlopen.
Good idea. It might be hard to set up such a test, but it is certainly worth having. It would be good to prove that such cases don't deadlock.
kripken
left a comment
At a high level, why do we need different syncing code for the main thread vs others? (A dlopen can appear on any thread, so I'd think the situation is symmetric?)
…18590) Prior to this change we were tracking the sequence of `dlopen` events and replaying them on each thread during `dlsync`. Now we also track `dlsym` events. A new data structure (`dlevent`) abstracts over these two kinds of event. During `dlsync` we then replay both the `dlopen` and the `dlsym` events. `dlsym` events are serialized as "dso+index", i.e. which dso the symbol came from and the index of the symbol within that dso's symbol table. This is important for #18376.
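To make the event log idea concrete, here is a rough sketch of how a combined `dlopen`/`dlsym` log and its per-thread replay might look. This is not the code from the PR: the field names, `MAX_EVENTS`, `record_event`, `struct thread_state` and the counters are all assumptions made for illustration, and the real replay would load modules and fill table slots rather than bump counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event kinds; the PR only names the abstraction "dlevent". */
enum dlevent_kind { DLEVENT_OPEN, DLEVENT_SYM };

/* One entry in the shared, append-only event log. */
struct dlevent {
  enum dlevent_kind kind;
  int dso;       /* id of the dso that was opened, or that owns the symbol */
  int sym_index; /* index into that dso's symbol table (DLEVENT_SYM only) */
};

#define MAX_EVENTS 64

/* Shared log. In real code, appends would happen under the dlopen lock. */
static struct dlevent event_log[MAX_EVENTS];
static size_t event_count;

/* Per-thread state: how far into the log this thread has replayed. */
struct thread_state {
  size_t replayed;
  int modules_loaded;   /* stand-in for "module is loaded on this thread" */
  int symbols_resolved; /* stand-in for "table slot filled on this thread" */
};

/* Append an event to the log (assumed to be called by the thread doing dlopen/dlsym). */
void record_event(enum dlevent_kind kind, int dso, int sym_index) {
  assert(event_count < MAX_EVENTS);
  event_log[event_count++] = (struct dlevent){kind, dso, sym_index};
}

/* Replay every event this thread has not yet seen, in log order. */
void dlsync(struct thread_state *t) {
  while (t->replayed < event_count) {
    const struct dlevent *e = &event_log[t->replayed++];
    if (e->kind == DLEVENT_OPEN) {
      t->modules_loaded++;   /* real code would load dso e->dso here */
    } else {
      t->symbols_resolved++; /* real code would resolve (e->dso, e->sym_index) */
    }
  }
}
```

Because each thread only remembers how far it has replayed, calling `dlsync` a second time with no new events does nothing, which is what makes it cheap to call from many places.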
Since we have a single write lock that is held by dlopen, we don't allow any other thread to begin a dlopen or dlsym while there is one outstanding. Even async dlopen will still block until it has the write lock. We could try to relax that restriction once we take another look at the async dlopen interface.
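As a small illustration of the restriction described above, the sketch below serializes every dlopen/dlsym-style operation behind one lock and checks that no two are ever in flight at once. It uses a plain pthread mutex as a stand-in; `dl_write_lock`, `locked_dl_operation`, `run_workers` and the counters are hypothetical names, not identifiers from the PR.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical lock; the comment above only says there is a single write lock. */
static pthread_mutex_t dl_write_lock = PTHREAD_MUTEX_INITIALIZER;
static int outstanding;     /* operations currently in flight */
static int max_outstanding; /* highest value of `outstanding` ever observed */

/* Stand-in for the body of dlopen or dlsym: runs entirely under the lock. */
void locked_dl_operation(void) {
  pthread_mutex_lock(&dl_write_lock);
  outstanding++;
  if (outstanding > max_outstanding) max_outstanding = outstanding;
  /* ... load the module or resolve the symbol, then sync other threads ... */
  outstanding--;
  pthread_mutex_unlock(&dl_write_lock);
}

static void *worker(void *arg) {
  (void)arg;
  for (int i = 0; i < 1000; i++) locked_dl_operation();
  return NULL;
}

/* Run up to 8 threads and return the largest number of overlapping operations. */
int run_workers(int n) {
  pthread_t threads[8];
  if (n > 8) n = 8;
  for (int i = 0; i < n; i++) {
    int rc = pthread_create(&threads[i], NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;
  }
  for (int i = 0; i < n; i++) pthread_join(threads[i], NULL);
  return max_outstanding;
}
```

With the lock in place, `run_workers(4)` never observes more than one operation in flight, which is the "only one outstanding dlopen or dlsym" behaviour described above.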
I've updated this PR with some extra documentation. There are two specific issues that came up when discussing possible deadlock risks.
tlively
left a comment
proofreading docs
  Changes to the table are protected by a mutex, and before any thread returns
  from ``dlopen`` or ``dlsym`` it will wait until all other threads are sync. In
  order to make this synchronization as seamless as possible we hook into the
  low level primitives of `emscripten_futex_wait` and `emscirpten_yeild`.
- low level primitives of `emscripten_futex_wait` and `emscirpten_yeild`.
+ low level primitives of `emscripten_futex_wait` and `emscripten_yield`.
  While load-time dynamic linking works without any complications, runtime dynamic
  linking via ``dlopen``/``dlsym`` can require some extra consideration. The
  reason for this is that keeping the indirection function pointer table in sync
  between threads has to be done my emscripten library code. Each time a new
- between threads has to be done my emscripten library code. Each time a new
+ between threads has to be done by emscripten library code. Each time a new
  linking via ``dlopen``/``dlsym`` can require some extra consideration. The
  reason for this is that keeping the indirection function pointer table in sync
  between threads has to be done my emscripten library code. Each time a new
  library is loaded of a new symbol is requested via ``dlsym``, table slots can be
- library is loaded of a new symbol is requested via ``dlsym``, table slots can be
+ library is loaded or a new symbol is requested via ``dlsym``, table slots can be
  Changes to the table are protected by a mutex, and before any thread returns
  from ``dlopen`` or ``dlsym`` it will wait until all other threads are sync. In
  order to make this synchronization as seamless as possible we hook into the
- order to make this synchronization as seamless as possible we hook into the
+ order to make this synchronization as seamless as possible, we hook into the
This was noticed while working on #18376
Fixes: #18345