gh-108987: Fix _thread.start_new_thread() race condition #109135
Fix _thread.start_new_thread() race condition. If a thread is created during Python finalization, the newly spawned thread now exits immediately instead of trying to access freed memory and crashing.

thread_run() calls PyEval_AcquireThread(), which checks whether the thread must exit. The problem was that tstate was dereferenced earlier, in _PyThreadState_Bind(), which led to a crash most of the time.

Move _PyThreadState_CheckConsistency() from thread_run() to _PyThreadState_Bind().
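The ordering problem described above can be sketched with a toy model in Python. This is only an illustration, not CPython's code: `ToyThreadState`, `run_bind_first`, `run_check_first` and `simulate` are hypothetical names, and "freed memory" is modeled by a flag that makes `bind()` raise instead of crashing.

```python
class ToyThreadState:
    """Hypothetical stand-in for a thread state; `freed` models memory released at finalization."""

    def __init__(self):
        self.freed = False
        self.bound = False

    def bind(self):
        # In C, touching a freed state is undefined behavior; here it raises instead.
        if self.freed:
            raise RuntimeError("use of freed thread state")
        self.bound = True


def run_bind_first(tstate, finalizing):
    """Old ordering: the state is dereferenced before the 'must exit' check."""
    tstate.bind()
    if finalizing():
        return "exit"
    return "run"


def run_check_first(tstate, finalizing):
    """New ordering: decide whether to exit before touching the state."""
    if finalizing():
        return "exit"
    tstate.bind()
    return "run"


def simulate(entry_point):
    """Run an entry point as if finalization had completed before the thread started."""
    tstate = ToyThreadState()
    tstate.freed = True
    try:
        return entry_point(tstate, lambda: True)
    except RuntimeError as exc:
        return f"crash: {exc}"


print(simulate(run_bind_first))
print(simulate(run_check_first))
```

In this model, only the entry point that checks first reaches the "exit" path without touching the freed state.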
    | I'm now trying to confirm manually that my change fixes #108987: this bug is not easy to reproduce reliably. It seems more likely on 32-bit Windows, for some reason. | 
| I'm trying to make the crash more likely by adding a sleep in 

diff --git a/Modules/_threadmodule.c b/Modules/_threadmodule.c
index 05bb49756c..39ee06f2f7 100644
--- a/Modules/_threadmodule.c
+++ b/Modules/_threadmodule.c
@@ -1076,6 +1076,13 @@ thread_run(void *boot_raw)
     struct bootstate *boot = (struct bootstate *) boot_raw;
     PyThreadState *tstate = boot->tstate;
 
+    {
+        struct timeval tv;
+        tv.tv_sec = 0;
+        tv.tv_usec = 100 * 1000;
+        select(0, NULL, NULL, NULL, &tv);
+    }
+
     // gh-104690: If Python is being finalized and PyInterpreterState_Delete()
     // was called, tstate becomes a dangling pointer.
     assert(_PyThreadState_CheckConsistency(tstate));

On Linux, I'm running these two commands in two different terminals: 

Also, to fix the test_default_timeout() test, I wrote this local fix:

diff --git a/Lib/test/lock_tests.py b/Lib/test/lock_tests.py
index a4f52cb20a..1949991fe6 100644
--- a/Lib/test/lock_tests.py
+++ b/Lib/test/lock_tests.py
@@ -1016,13 +1016,17 @@ def test_default_timeout(self):
         """
         # create a barrier with a low default timeout
         barrier = self.barriertype(self.N, timeout=0.3)
-        def f():
-            i = barrier.wait()
+        def thread_run():
+            try:
+                i = barrier.wait()
+            except threading.BrokenBarrierError:
+                return
+
             if i == self.N // 2:
                 # One thread is later than the default timeout of 0.3s.
                 time.sleep(1.0)
             self.assertRaises(threading.BrokenBarrierError, barrier.wait)
-        self.run_threads(f)
+        self.run_threads(thread_run)
 
     def test_single_thread(self):
         b = self.barriertype(1) | 
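Why the first `barrier.wait()` may need the `try`/`except` can be seen in a small standalone sketch. It assumes, as an illustration only, that one thread reaches the barrier later than the 0.3 s default timeout (for instance because thread startup was slowed down); `run_barrier_demo` and `worker` are hypothetical names.

```python
import threading
import time


def run_barrier_demo(n=4, timeout=0.3, late_delay=1.0):
    """Start n threads on one barrier; the last thread arrives after the timeout."""
    barrier = threading.Barrier(n, timeout=timeout)
    outcomes = []
    lock = threading.Lock()

    def worker(delay):
        time.sleep(delay)
        try:
            barrier.wait()
            outcome = "passed"
        except threading.BrokenBarrierError:
            # Raised in the thread whose wait timed out, in every other waiter,
            # and in any thread that arrives once the barrier is already broken.
            outcome = "broken"
        with lock:
            outcomes.append(outcome)

    delays = [0.0] * (n - 1) + [late_delay]
    threads = [threading.Thread(target=worker, args=(d,)) for d in delays]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcomes


print(run_barrier_demo())
```

Every thread sees `BrokenBarrierError` on its first wait in this scenario, so a test body that does not catch it fails before reaching its own assertion.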
| Additional patch to attempt to make the crash more likely:

diff --git a/Python/pystate.c b/Python/pystate.c
index 09c3538ad7..788c4b2a62 100644
--- a/Python/pystate.c
+++ b/Python/pystate.c
@@ -1034,6 +1034,11 @@ PyInterpreterState_Delete(PyInterpreterState *interp)
     _PyObject_FiniState(interp);
 
     free_interpreter(interp);
+
+    struct timeval tv;
+    tv.tv_sec = 0;
+    tv.tv_usec = 100 * 1000;
+    select(0, NULL, NULL, NULL, &tv);
 }
 
  | 
| OK, I managed to reproduce the crash on Linux in a reliable way. Apply this patch:

diff --git a/Modules/_threadmodule.c b/Modules/_threadmodule.c
index 05bb49756c..be40fb4bbf 100644
--- a/Modules/_threadmodule.c
+++ b/Modules/_threadmodule.c
@@ -1076,6 +1076,15 @@ thread_run(void *boot_raw)
     struct bootstate *boot = (struct bootstate *) boot_raw;
     PyThreadState *tstate = boot->tstate;
 
+    {
+        fprintf(stderr, "-- sleep 100 ms before PyEval_AcquireThread() --\n");
+
+        struct timeval tv;
+        tv.tv_sec = 0;
+        tv.tv_usec = 100 * 1000;
+        select(0, NULL, NULL, NULL, &tv);
+    }
+
     // gh-104690: If Python is being finalized and PyInterpreterState_Delete()
     // was called, tstate becomes a dangling pointer.
     assert(_PyThreadState_CheckConsistency(tstate));
diff --git a/Python/pystate.c b/Python/pystate.c
index 09c3538ad7..6392164f17 100644
--- a/Python/pystate.c
+++ b/Python/pystate.c
@@ -1034,6 +1034,13 @@ PyInterpreterState_Delete(PyInterpreterState *interp)
     _PyObject_FiniState(interp);
 
     free_interpreter(interp);
+
+    fprintf(stderr, "-- sleep 1 sec after free_interpreter() --\n");
+
+    struct timeval tv;
+    tv.tv_sec = 1;
+    tv.tv_usec = 0;
+    select(0, NULL, NULL, NULL, &tv);
 }
 
Run this script:

import _thread
import os
import sys

NTHREAD = 250

def create_thread():
    _thread.start_new_thread(os.write, (1, b'.'))

threads = [create_thread()
           for _ in range(NTHREAD)]
print("exit")

Output on the main branch with Python built in debug mode (

This race condition is now quite obvious to me. I'm not sure why nobody found it earlier. Maybe in the past, threads exited while trying to acquire the GIL, in  | 
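One way to run a reproducer like the script above repeatedly is to launch it in a child interpreter and inspect its exit status. The following is a sketch under assumptions: `run_reproducer` is a hypothetical helper, the thread count is passed on the command line, and on an interpreter built without the sleep patches the child is expected to exit normally most of the time.

```python
import subprocess
import sys

# Same idea as the script above, with the thread count taken from argv.
REPRO = """
import _thread
import os
import sys

NTHREAD = int(sys.argv[1])

def create_thread():
    _thread.start_new_thread(os.write, (1, b'.'))

threads = [create_thread() for _ in range(NTHREAD)]
print("exit")
"""


def run_reproducer(nthread=250):
    """Run the reproducer in a child process and return (returncode, stdout, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", REPRO, str(nthread)],
        capture_output=True,
        timeout=60,
    )
    # On POSIX, a negative return code means the child was killed by a signal
    # (for example -11 for SIGSEGV), which is how a crash would show up here.
    return proc.returncode, proc.stdout, proc.stderr


returncode, out, err = run_reproducer(50)
print("return code:", returncode)
```

Running the child in a loop and counting non-zero return codes gives a rough crash rate without taking down the calling process.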
| I can reproduce the crash on Python 3.12 using my select() patch above. For this test, I used PR #109133 (not merged into 3.12 yet) to get a more accurate error message. I added the sleep after _PyThreadState_CheckConsistency() in thread_run().  | 
| Now I can also reproduce the bug on Linux with Python 3.11 using the sleep() patch: The important part is to add the sleep at the right place: 
 | 
| @serhiy-storchaka @gpshead: Would you mind reviewing this fix for an old race condition in 

I'm not sure about leaking references on purpose :-( Maybe I can add a basic sync primitive between the 

The caller must hold the GIL to call 

Avoiding reference leaks with threads... is hard. This change uses 

Maybe we can start with this "simple" approach and later, if it becomes a real issue, consider designing an "even better" approach. I suppose that right now, the most important thing is to prevent Python from crashing. Leaking a few references sounds "less bad" than crashing the whole process.

Python has the old habit of cleaning up the resources of a thread from another thread: the thread calling Py_Finalize() is responsible for clearing all resources of all other threads. I'm thinking about 

In a perfect world, a thread would clear its own resources. But there are technical challenges. We may want to be able to complete Py_Finalize() as soon as possible, while a daemon thread can be waiting for something unrelated to Python. The current semantics is that daemon threads survive Python finalization, but "must exit" if they attempt to acquire the GIL after Python finalization.

I noticed a similar problem with sub-interpreters: it seems that the main interpreter calling Py_Finalize() cleans up the resources of other interpreters. Resource ownership does not strictly belong to each interpreter. The main interpreter seems to like to clean up resources of other interpreters. It makes me feel uncomfortable. But again, there are technical challenges.

Put {sub-interpreters, nogil, multithreaded applications, signals, processes, multiprocessing, concurrent.futures} in a shaker with ice cubes, shake it for 2 min, serve in a glass, take a sip, and then cry. LOL.

cc @colesbury @ericsnowcurrently who may love such crazy bugs as well. | 
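The daemon-thread semantics mentioned above can be observed from the outside with a small sketch. It assumes only documented `threading` behavior: interpreter shutdown does not wait for daemon threads. `run_daemon_demo` and the child script are hypothetical, written for illustration.

```python
import subprocess
import sys

CHILD = """
import threading
import time

def worker():
    # Still sleeping when the main thread finishes.
    time.sleep(2.0)
    print("late", flush=True)

threading.Thread(target=worker, daemon=True).start()
print("main done", flush=True)
"""


def run_daemon_demo():
    """Run the child script and return everything it wrote to stdout."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        timeout=60,
    )
    return proc.stdout


output = run_daemon_demo()
print(repr(output))
```

The child prints "main done" and the process ends without the daemon thread ever printing "late": finalization does not wait for it, and the thread never gets to run Python code again.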
| Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.11. | 
| Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12. | 
| Sorry, @vstinner, I could not cleanly backport this to  | 
| Sorry, @vstinner, I could not cleanly backport this to  | 
| Oh, the backport to 3.12 should wait for #109133 to be merged. | 
…n#109135) Fix _thread.start_new_thread() race condition. If a thread is created during Python finalization, the newly spawned thread now exits immediately instead of trying to access freed memory and lead to a crash. thread_run() calls PyEval_AcquireThread() which checks if the thread must exit. The problem was that tstate was dereferenced earlier in _PyThreadState_Bind() which leads to a crash most of the time. Move _PyThreadState_CheckConsistency() from thread_run() to _PyThreadState_Bind(). (cherry picked from commit 517cd82)
| GH-109272 is a backport of this pull request to the 3.11 branch. | 
) (#109272) gh-108987: Fix _thread.start_new_thread() race condition (#109135) Fix _thread.start_new_thread() race condition. If a thread is created during Python finalization, the newly spawned thread now exits immediately instead of trying to access freed memory and lead to a crash. thread_run() calls PyEval_AcquireThread() which checks if the thread must exit. The problem was that tstate was dereferenced earlier in _PyThreadState_Bind() which leads to a crash most of the time. Move _PyThreadState_CheckConsistency() from thread_run() to _PyThreadState_Bind(). (cherry picked from commit 517cd82)
| GH-110342 is a backport of this pull request to the 3.12 branch. | 
) (#110342)
* gh-108987: Fix _thread.start_new_thread() race condition (#109135)
  Fix _thread.start_new_thread() race condition. If a thread is created during Python finalization, the newly spawned thread now exits immediately instead of trying to access freed memory and lead to a crash. thread_run() calls PyEval_AcquireThread() which checks if the thread must exit. The problem was that tstate was dereferenced earlier in _PyThreadState_Bind() which leads to a crash most of the time. Move _PyThreadState_CheckConsistency() from thread_run() to _PyThreadState_Bind().
  (cherry picked from commit 517cd82)
* gh-109795: `_thread.start_new_thread`: allocate thread bootstate using raw memory allocator (#109808)
  (cherry picked from commit 1b8f236)
---------
Co-authored-by: Radislav Chugunov <[email protected]>
| Thanks for this. At first glance I believe it to be correct (it is hard to ever say for sure with this kind of thing =). | 
| Sadly, this design "check before use" still has a "short" race condition: #110052 (comment) | 
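The general shape of a "check before use" race can be shown with a deterministic toy example. This is a generic illustration and not the specific scenario from the linked comment: `ToyState`, `check_then_use` and `force_bad_interleaving` are hypothetical names, and two events are used to force the unlucky interleaving that would otherwise happen only rarely.

```python
import threading


class ToyState:
    """Hypothetical shared state; `freed` models memory released by another thread."""

    def __init__(self):
        self.freed = False


def check_then_use(state, checked, resume, result):
    # Step 1: the check passes, the state still looks valid.
    if state.freed:
        result.append("exit")
        return
    checked.set()   # signal that the check has been done
    resume.wait()   # window in which another thread may free the state
    # Step 2: the use happens after the window, with no lock covering both steps.
    result.append("use after free" if state.freed else "ok")


def force_bad_interleaving():
    state = ToyState()
    checked = threading.Event()
    resume = threading.Event()
    result = []
    worker = threading.Thread(
        target=check_then_use, args=(state, checked, resume, result)
    )
    worker.start()
    checked.wait()       # the worker has passed its check
    state.freed = True   # "finalization" frees the state inside the window
    resume.set()
    worker.join()
    return result[0]


print(force_bad_interleaving())
```

Moving the check closer to the use shortens the window but does not remove it; closing it would need the check and the use to happen under one lock or some equivalent form of synchronization.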