Description
Environment data
- debugpy version: 1.8.8
- OS and version: A k8s pod running an Ubuntu 20.04.6-based container
- Python version (& distribution if applicable, e.g. Anaconda): 3.9
- Using VS Code or Visual Studio: VS Code
Actual behavior
I'm using the Ray Distributed Debugger (their code here) with Ray on k8s. It calls debugpy.listen, but when I check the port it is supposed to be listening on, nothing is bound to it (sudo lsof -i :$LISTEN_PORT). I set DEBUGPY_LOG_DIR to get more detailed logs, and I noticed that debugpy.pydevd.NNNN.log contains this near the end, indicating that it indeed crashed:
Traceback (most recent call last):
File "/my_app/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_comm.py", line 422, in _on_run
cmd.send(self.sock)
File "/my_app/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_net_command.py", line 109, in send
sock.sendall(as_bytes)
BrokenPipeError: [Errno 32] Broken pipe
I looked in the issue trackers of debugpy, pydevd, and Ray, and did some googling, but unfortunately couldn't find much. The only lead I found is that this error may mean the connection between the local components broke (running debugpy on the application side seems to involve a client, a server, a "debug server", and some incoming client (?)). I found this snippet in debugpy.adapter.NNNN.log:
I+00000.071: Listening for incoming Client connections on 10.40.0.130:51507...
I+00000.071: Listening for incoming Server connections on 127.0.0.1:39415...
I+00000.071: Sending endpoints info to debug server at localhost:60997:
{
"client": {
"host": "10.40.0.130",
"port": 51507
},
"server": {
"host": "127.0.0.1",
"port": 39415
}
}
I+00000.076: Accepted incoming Server connection from 127.0.0.1:43864.
Lastly, I noticed this in debugpy.{adapter,server}.NNNN.log, but it seems to be benign, as I also see it in healthy local runs:
I+00000.049: Error while enumerating installed packages.
Traceback (most recent call last):
File "/my_app/debugpy/adapter/../../debugpy/common/log.py", line 362, in get_environment_description
report(" {0}=={1}\n", pkg.name, pkg.version)
AttributeError: 'PathDistribution' object has no attribute 'name'
Stack where logged:
File "/my_app/python3_x86_64/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/my_app/python3_x86_64/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/my_app/debugpy/adapter/__main__.py", line 227, in <module>
main(_parse_argv(sys.argv))
File "/my_app/debugpy/adapter/__main__.py", line 50, in main
log.describe_environment("debugpy.adapter startup environment:")
File "/my_app/debugpy/adapter/../../debugpy/common/log.py", line 372, in describe_environment
info("{0}", get_environment_description(header))
File "/my_app/debugpy/adapter/../../debugpy/common/log.py", line 364, in get_environment_description
swallow_exception(
File "/my_app/debugpy/adapter/../../debugpy/common/log.py", line 215, in swallow_exception
_exception(format_string, *args, **kwargs)
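Side note: this secondary error looks like a Python 3.9 incompatibility rather than the root cause. As far as I can tell, Distribution.name was only added in Python 3.10, while the metadata mapping works on 3.9 as well. A minimal sketch (my own, not debugpy's code) of the difference:
from importlib.metadata import distributions

# On Python 3.9, a distribution object has no `.name` attribute (that
# property was added in Python 3.10), but its metadata mapping carries
# the same information on both versions.
for dist in distributions():
    name = dist.metadata["Name"]  # `dist.name` would require Python >= 3.10
    print(name, dist.version)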
All of this crashes before I even try connecting to the debugger.
I was also able to reproduce this without the Ray Distributed Debugger. I just connect to the k8s pod and create a small Python script:
import debugpy

# Listen for an incoming DAP client connection on port 5678.
debugpy.listen(5678)

print("before wait_for_client")
debugpy.wait_for_client()  # blocks until a client attaches
print("after wait_for_client")

print("before breakpoint")
debugpy.breakpoint()  # should pause here while a client is attached
print("after breakpoint")
Running it and checking the log files shows the same crash (BrokenPipeError: [Errno 32] Broken pipe) in the pydevd logs.
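For what it's worth, instead of the DEBUGPY_LOG_DIR environment variable, logging can also be enabled from code via debugpy.log_to(); the directory path below is just an example:
import debugpy

# Equivalent of DEBUGPY_LOG_DIR, enabled from code. Must be called
# before listen(); the directory path is only an example.
debugpy.log_to("/tmp/debugpy_logs")
debugpy.listen(5678)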
When I run all of this locally, everything works fine. When running with Ray on k8s, I run into this issue.
These are the full, lightly redacted logs:
Questions:
- Is there a way to detect a crashed listen from code? If so, how? (I sketched the kind of check I mean below.)
- Any ideas on what makes this crash?
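The crude check I have in mind (my own sketch, not a debugpy API): after listen() returns the endpoint, try to bind it ourselves; EADDRINUSE means the listener is alive, while a successful bind means nothing is actually listening.
import errno
import socket

import debugpy

# listen() returns the (host, port) the adapter claims to listen on.
host, port = debugpy.listen(5678)

# Crude diagnostic, not a debugpy API: try to bind the same endpoint.
# EADDRINUSE means something (hopefully the adapter) holds the port;
# a successful bind means nothing is listening and listen() silently failed.
probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    probe.bind((host, port))
except OSError as exc:
    if exc.errno != errno.EADDRINUSE:
        raise
    print(f"listener alive on {host}:{port}")
else:
    raise RuntimeError(f"nothing bound to {host}:{port}")
finally:
    probe.close()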
Expected behavior
Accepting incoming connections on the debugpy.listen endpoint.
Steps to reproduce:
I'm afraid it will be hard to reproduce this outside our "Ray on k8s" setup, but the details are in the "Actual behavior" section.