[mono] Deadlock on TypeInitializationLock in mono_runtime_class_init_full

We've seen a deadlock in roslyn when performing a source-build of rc2 natively on ppc64le (using the Mono runtime). In a nutshell, it seems the logic behind this block of code in `mono_runtime_class_init_full` is not thread-safe:
```
                /* see if the thread doing the initialization is already blocked on this thread */
                gboolean is_blocked = TRUE;
                blocked = GUINT_TO_POINTER (MONO_NATIVE_THREAD_ID_TO_UINT (lock->initializing_tid));
                while ((pending_lock = (TypeInitializationLock*) g_hash_table_lookup (blocked_thread_hash, blocked))) {
                        if (mono_native_thread_id_equals (pending_lock->initializing_tid, tid)) {
                                if (!pending_lock->done) {
                                        mono_type_initialization_unlock ();
                                        goto return_true;
                                } else {
                                        /* the thread doing the initialization is blocked on this thread,
                                           but on a lock that has already been freed. It just hasn't got
                                           time to awake */
                                        is_blocked = FALSE;
                                        break;
                                }
                        }
                        blocked = GUINT_TO_POINTER (MONO_NATIVE_THREAD_ID_TO_UINT (pending_lock->initializing_tid));
                }
```

To trigger the race, two threads must each be in `mono_runtime_class_init_full` twice. This can happen because `mono_runtime_class_init_full` calls `mono_runtime_try_invoke`, which executes arbitrary managed code, which in turn can trigger a recursive `mono_runtime_class_init_full` call. If both threads try to initialize the same two classes X and Y, but in reverse order (i.e. thread A accesses Y in the .cctor of X while thread B accesses X in the .cctor of Y), we may end up in a deadlock.

To trigger the deadlock, it seems that a third class Z also has to be involved. That can lead to erroneous logic in the code above: assume thread A is working on both X and (recursively) Z, and after Z is done tries to work on Y. But thread B has already started working on Y and is now recursively blocked on Z, before then also requiring X. Since thread A sees that thread B is blocked on Z, which thread A itself has already completed, the code above triggers:
```
                                        /* the thread doing the initialization is blocked on this thread,
                                           but on a lock that has already been freed. It just hasn't got
                                           time to awake */
```
However, while it is true that thread B will indeed get woken up from Z's lock, that does not mean it will successfully complete initialization of Y, because Y's .cctor also needs X, where thread B will get blocked on thread A again after all.
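To make the reasoning above easier to check, here is a minimal single-threaded sketch of the chain walk in C. It is not the Mono implementation: `ModelLock`, `ModelBlockedTable`, `ModelDecision` and `model_decide` are hypothetical names, a plain array stands in for `blocked_thread_hash`, and the sketch assumes that a waiter taking the `done` branch still goes on to wait but is not recorded in the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thread ids for the model; not Mono's native thread ids. */
enum { THREAD_A = 0, THREAD_B = 1, MODEL_MAX_THREADS = 2 };

/* Simplified stand-in for TypeInitializationLock. */
typedef struct {
    int  initializing_tid; /* thread running the .cctor */
    bool done;             /* .cctor finished; waiters may not be awake yet */
} ModelLock;

/* Stand-in for blocked_thread_hash: blocked_on[tid] is the lock that
   thread tid is recorded as waiting on, or NULL. */
typedef struct {
    const ModelLock *blocked_on[MODEL_MAX_THREADS];
} ModelBlockedTable;

typedef enum {
    MODEL_RETURN_WITHOUT_WAITING, /* cycle detected, like goto return_true */
    MODEL_WAIT_RECORDED,          /* waits and is entered in the table */
    MODEL_WAIT_UNRECORDED         /* waits, but (assumed) is not entered */
} ModelDecision;

/* Mirrors the quoted loop: thread tid wants a class whose lock is held
   by another thread and decides whether it is safe to wait. */
static ModelDecision
model_decide (ModelBlockedTable *table, const ModelLock *lock, int tid)
{
    assert (tid >= 0 && tid < MODEL_MAX_THREADS);
    assert (lock->initializing_tid != tid);

    int blocked = lock->initializing_tid;
    const ModelLock *pending;

    while ((pending = table->blocked_on[blocked]) != NULL) {
        if (pending->initializing_tid == tid) {
            if (!pending->done)
                return MODEL_RETURN_WITHOUT_WAITING;
            /* "already freed" branch: the owner is assumed to make
               progress once it wakes up, so the caller waits. */
            return MODEL_WAIT_UNRECORDED;
        }
        blocked = pending->initializing_tid;
    }

    table->blocked_on[tid] = lock;
    return MODEL_WAIT_RECORDED;
}
```

With LX and LY held by A and B, LZ held by A with `done` set, and B recorded as blocked on LZ, `model_decide (&table, &ly, THREAD_A)` yields `MODEL_WAIT_UNRECORDED`. Once B's entry is removed and B asks for X, the table has no entry for A, so B's walk ends immediately and B waits as well.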

In the example we're seeing, those nested initializations happen on various instantiations of `Microsoft.CodeAnalysis.VisualBasic.Symbols.OverrideHidingHelper`, which has this .cctor:
```
        Shared Sub New()
            OverrideHidingHelper(Of MethodSymbol).s_runtimeSignatureComparer = MethodSignatureComparer.RuntimeMethodSignatureComparer
            OverrideHidingHelper(Of PropertySymbol).s_runtimeSignatureComparer = PropertySignatureComparer.RuntimePropertySignatureComparer
            OverrideHidingHelper(Of EventSymbol).s_runtimeSignatureComparer = EventSignatureComparer.RuntimeEventSignatureComparer
        End Sub
```

A detailed timeline for a possible deadlock is as follows. (I cannot guarantee that this timeline is *exactly* what happened since I can only examine the deadlocked state. But it is one way we could have gotten there.)

```
Thread A                                             Thread B

  mono_runtime_class_init_full (MonoVTable X)          mono_runtime_class_init_full (MonoVTable Y)
   mono_type_initialization_lock
   lookup X in type_initialization_hash: n/a
   allocate TypeInitializationLock LX
   insert into type_initialization_hash: X -> LX
   mono_type_initialization_unlock
                                                         mono_type_initialization_lock
                                                         lookup Y in type_initialization_hash: n/a
                                                         allocate TypeInitializationLock LY
                                                         insert into type_initialization_hash: Y -> LY
                                                         mono_type_initialization_unlock

   mono_runtime_try_invoke (X .cctor)                    mono_runtime_try_invoke (Y .cctor)
     ... managed code ...                                   ... managed code ...
     mono_runtime_class_init_full (MonoVTable Z)            mono_runtime_class_init_full (MonoVTable Z)
       mono_type_initialization_lock
       lookup Z in type_initialization_hash: n/a
       allocate TypeInitializationLock LZ
       insert into type_initialization_hash: Z -> LZ
       mono_type_initialization_unlock
                                                              mono_type_initialization_lock
                                                              lookup Z in type_initialization_hash: LZ
                                                              LZ->initializing_tid is thread A
                                                              lookup A in blocked_thread_hash: n/a
                                                              insert into blocked_thread_hash: B -> LZ
                                                              mono_type_initialization_unlock
                                                              wait on LZ
       [... complete initialization of Z ...]
       LZ->done = TRUE;
       wakeup waiters on LZ
     mono_runtime_class_init_full (MonoVTable Z) returns
     ... managed code ...
     mono_runtime_class_init_full (MonoVTable Y)
       mono_type_initialization_lock
       lookup Y in type_initialization_hash: LY
       LY->initializing_tid is thread B
       lookup B in blocked_thread_hash: LZ
       LZ->initializing_tid is thread A
       LZ->done is TRUE
         /* the thread doing the initialization is blocked on this thread,
            but on a lock that has already been freed. It just hasn't got
            time to awake */
       mono_type_initialization_unlock
       wait on LY
                                                              wakeup from LZ
                                                              mono_type_initialization_lock
                                                              remove from blocked_thread_hash: B -> LZ
                                                              remove from type_initialization_hash: Z -> LZ
                                                              mono_type_initialization_unlock
                                                            mono_runtime_class_init_full (MonoVTable Z) returns
                                                            ... managed code ...
                                                            mono_runtime_class_init_full (MonoVTable X)
                                                              mono_type_initialization_lock
                                                              lookup X in type_initialization_hash: LX
                                                              LX->initializing_tid is thread A
                                                              lookup A in blocked_thread_hash: n/a
                                                              insert into blocked_thread_hash: B -> LX
                                                              mono_type_initialization_unlock
                                                              wait on LX
```
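The end state of this timeline can be checked with a small wait-for walk. The following C sketch is only an illustration under assumptions: `WaitForGraph` and `wf_has_cycle_from` are hypothetical names, threads and locks are plain array indices, and nothing here is taken from the Mono sources.

```c
#include <assert.h>
#include <stdbool.h>

enum { WF_NO_LOCK = -1, WF_MAX = 8 };

/* Hypothetical snapshot of who waits on what. */
typedef struct {
    int n_threads;
    int waits_on[WF_MAX]; /* lock index a thread waits on, or WF_NO_LOCK */
    int n_locks;
    int owner[WF_MAX];    /* thread index initializing each lock */
} WaitForGraph;

/* Follows thread -> awaited lock -> owner of that lock, starting at
   `start`. Returns true only if the walk comes back to `start`. */
static bool
wf_has_cycle_from (const WaitForGraph *g, int start)
{
    assert (start >= 0 && start < g->n_threads);

    int t = start;
    for (int steps = 0; steps < g->n_threads; steps++) {
        int lock = g->waits_on[t];
        if (lock == WF_NO_LOCK)
            return false; /* this thread can still make progress */
        assert (lock >= 0 && lock < g->n_locks);
        t = g->owner[lock];
        if (t == start)
            return true;
    }
    return false;
}
```

With threads A = 0 and B = 1 and locks LX = 0 and LY = 1, the actual waits at the end of the timeline (A on LY, B on LX) form a cycle from either thread. If A's entry is set to `WF_NO_LOCK`, which corresponds to what `blocked_thread_hash` contains according to the timeline, the walk from B finds no cycle. That would be consistent with thread B going to sleep on LX.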

CC - @directhex @lambdageek @vargaz @akoeplinger
FYI - @giritrivedi @alhad-deshpande @janani66 @omajid @tmds

[mono] Deadlock on TypeInitializationLock in mono_runtime_class_init_full #93778