gh-75880: Deadlocks in concurrent.futures.ProcessPoolExecutor with unpickling error
#4256
The windows failures are unrelated to this PR ( |
ogrisel
left a comment
I don't know what the best solution is, but this PR should not be merged until the following regression is addressed.
            work_id, exception=_ExceptionWithTraceback(e, e.__traceback__)
        ))
    result_queue._put_bytes(work_id.to_bytes(WORK_ID_SIZE, WORK_ID_ENC) +
                            serialize_res)
This bytes concatenation triggers a potentially very large memory copy which is a regression compared to master.
There are two possible solutions to this problem:
- Add a send_byte_chunks API down to the Connection that would make it possible to send a tuple of bytes without materializing the full buffer. We don't know if this is possible on Windows.
- Implement a new Queue subclass to send and receive 2 messages at a time. This would introduce a small call overhead, potentially critical for small messages.
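The second option can be illustrated with a minimal sketch that sends the work id and the serialized payload as two separate messages, so the two buffers are never concatenated. The helpers send_item and recv_item are hypothetical names, the values chosen for WORK_ID_SIZE and WORK_ID_ENC are assumptions, and a bare multiprocessing Connection stands in for the Queue subclass; this is not the code from the PR.

```python
import pickle
from multiprocessing import Pipe

# Assumed values: the diff references these names but does not show them.
WORK_ID_SIZE = 8
WORK_ID_ENC = "little"


def send_item(conn, work_id, payload):
    """Hypothetical helper: send the header and the payload as two messages."""
    conn.send_bytes(work_id.to_bytes(WORK_ID_SIZE, WORK_ID_ENC))
    conn.send_bytes(payload)  # the payload buffer is sent as-is, no concatenation


def recv_item(conn):
    """Hypothetical helper: read the two messages back in the same order."""
    work_id = int.from_bytes(conn.recv_bytes(), WORK_ID_ENC)
    payload = conn.recv_bytes()
    return work_id, payload


# Small in-process round trip over a one-way pipe.
reader, writer = Pipe(duplex=False)
send_item(writer, 42, pickle.dumps([1, 2, 3]))
work_id, payload = recv_item(reader)
result = pickle.loads(payload)
```

With several writers sharing one connection, the two sends would need to happen under a lock so that each header stays adjacent to its payload. The extra send and receive call per item is the overhead the comment above refers to.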
https://bugs.python.org/issue31699 is closed. What is the status of this PR?
This PR is stale because it has been open for 30 days with no activity.
When using concurrent.futures.ProcessPoolExecutor with objects that cannot be pickled or unpickled, several situations result in a deadlock, with the interpreter frozen.

This happens in several scenarios; see, for instance, https://gist.github.com/tomMoral/cc27a938d669edcf0286c57516942369. This PR is a follow-up to #3895 and specifically fixes the unpickling behavior. With this PR, unpickling failures no longer result in broken executors. This is done by ensuring that the _CallItem and the _ResultItem are sent through the queues along with the work_id using a specific communication protocol, which sends the work_id as bytes, followed by the object serialized by pickle.

To this end, we introduce a private API in multiprocessing.Queue to allow modified serialization protocols.

Overall, the goal is to make concurrent.futures.ProcessPoolExecutor more robust to faulty user code.

This work was done as part of tommoral/loky#48, with the intent to re-use the executor in multiple independent parts of the program, in collaboration with @ogrisel. See #1013 for more details.
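The framing described above can be sketched as follows: because the work_id travels as raw bytes ahead of the pickled object, the receiver can recover it even when unpickling fails and attach the error to the right work item. The functions encode_item and decode_item and the class FailsOnUnpickle are hypothetical, and the values of WORK_ID_SIZE and WORK_ID_ENC are assumptions; the PR implements this inside private queue methods, which are not reproduced here.

```python
import pickle

# Assumed values: the diff references these names but does not show them.
WORK_ID_SIZE = 8
WORK_ID_ENC = "little"


def encode_item(work_id, obj):
    """Hypothetical helper: fixed-size work_id header followed by the pickle."""
    header = work_id.to_bytes(WORK_ID_SIZE, WORK_ID_ENC)
    return header + pickle.dumps(obj)


def decode_item(message):
    """Hypothetical helper: return (work_id, obj, error).

    The work_id is decoded first, so it is known even if unpickling raises.
    """
    work_id = int.from_bytes(message[:WORK_ID_SIZE], WORK_ID_ENC)
    try:
        return work_id, pickle.loads(message[WORK_ID_SIZE:]), None
    except Exception as exc:
        return work_id, None, exc


class FailsOnUnpickle:
    """Pickles fine, but unpickling calls int("not a number") and raises."""

    def __reduce__(self):
        return int, ("not a number",)


ok = decode_item(encode_item(1, {"x": 1}))
bad = decode_item(encode_item(2, FailsOnUnpickle()))
```

In this sketch, bad still carries work_id 2 together with the exception, which is what lets an executor fail only the affected future instead of hanging or marking the whole pool as broken.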
https://bugs.python.org/issue31699
concurrent.futures.ProcessPoolExecutor with pickling error #75880