Skip to content

Commit e41786c

Browse files
Davies LiuJoshRosen
authored andcommitted
[SPARK-4088] [PySpark] Python worker should exit after socket is closed by JVM
In case of take() or exception in Python, python worker may exit before JVM read() all the response, then the write thread may raise "Connection reset" exception. Python should always wait JVM to close the socket first. cc JoshRosen This is a warm fix, or the tests will be flaky, sorry for that. Author: Davies Liu <[email protected]> Closes #2941 from davies/fix_exit and squashes the following commits: 9d4d21e [Davies Liu] fix race
1 parent 9530316 commit e41786c

File tree

1 file changed

+7
-5
lines changed

1 file changed

+7
-5
lines changed

python/pyspark/daemon.py

Lines changed: 7 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -62,8 +62,7 @@ def worker(sock):
6262
exit_code = compute_real_exit_code(exc.code)
6363
finally:
6464
outfile.flush()
65-
if exit_code:
66-
os._exit(exit_code)
65+
return exit_code
6766

6867

6968
# Cleanup zombie children
@@ -160,10 +159,13 @@ def handle_sigterm(*args):
160159
outfile.flush()
161160
outfile.close()
162161
while True:
163-
worker(sock)
164-
if not reuse:
162+
code = worker(sock)
163+
if not reuse or code:
165164
# wait for closing
166-
while sock.recv(1024):
165+
try:
166+
while sock.recv(1024):
167+
pass
168+
except Exception:
167169
pass
168170
break
169171
gc.collect()

0 commit comments

Comments
 (0)