Kernel launch failure when address is in use #131

@parente


On rare occasions, the jupyter/kernel_gateway test suite fails with the following exception, caused by a race condition between two kernels trying to bind the same port:

Traceback (most recent call last):
  File "/opt/python/3.3.5/lib/python3.3/runpy.py", line 160, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/opt/python/3.3.5/lib/python3.3/runpy.py", line 73, in _run_code
    exec(code, run_globals)
  File "/home/travis/virtualenv/python3.3.5/lib/python3.3/site-packages/ipykernel/__main__.py", line 3, in <module>
    app.launch_new_instance()
  File "/home/travis/virtualenv/python3.3.5/lib/python3.3/site-packages/traitlets/config/application.py", line 595, in launch_instance
    app.initialize(argv)
  File "<decorator-gen-122>", line 2, in initialize
  File "/home/travis/virtualenv/python3.3.5/lib/python3.3/site-packages/traitlets/config/application.py", line 74, in catch_config_error
    return method(app, *args, **kwargs)
  File "/home/travis/virtualenv/python3.3.5/lib/python3.3/site-packages/ipykernel/kernelapp.py", line 412, in initialize
    self.init_sockets()
  File "/home/travis/virtualenv/python3.3.5/lib/python3.3/site-packages/ipykernel/kernelapp.py", line 245, in init_sockets
    self.init_iopub(context)
  File "/home/travis/virtualenv/python3.3.5/lib/python3.3/site-packages/ipykernel/kernelapp.py", line 250, in init_iopub
    self.iopub_port = self._bind_socket(self.iopub_socket, self.iopub_port)
  File "/home/travis/virtualenv/python3.3.5/lib/python3.3/site-packages/ipykernel/kernelapp.py", line 174, in _bind_socket
    s.bind("tcp://%s:%i" % (self.ip, port))
  File "zmq/backend/cython/socket.pyx", line 487, in zmq.backend.cython.socket.Socket.bind (zmq/backend/cython/socket.c:5156)
  File "zmq/backend/cython/checkrc.pxd", line 25, in zmq.backend.cython.checkrc._check_rc (zmq/backend/cython/socket.c:7535)
zmq.error.ZMQError: Address already in use

IIRC, the port assignment is requested by a client (e.g., via a connection file). Having the kernel retry on another port means the client won't be able to communicate with that kernel.
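
To make the suspected race concrete, here is a minimal sketch using plain TCP sockets from the standard library rather than ZeroMQ. It assumes the client picks a port by binding to port 0 and releasing it before the kernel starts, which is an assumption on my part and not confirmed from the jupyter_client or ipykernel source. The helper names (`pick_free_port`, `try_bind`, `simulate_race`) are hypothetical.

```python
import errno
import socket


def pick_free_port(ip="127.0.0.1"):
    """Ask the OS for an unused port, then release it (assumed client-side step)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind((ip, 0))  # port 0 lets the OS choose a free port
        return probe.getsockname()[1]
    # The port is free again here, so any other process may claim it.


def try_bind(ip, port):
    """Try to bind and listen on (ip, port); return (socket, None) or (None, errno)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((ip, port))
        sock.listen(1)
        return sock, None
    except OSError as exc:
        sock.close()
        return None, exc.errno


def simulate_race(ip="127.0.0.1"):
    """Two 'kernels' are handed the same port; only the first bind can succeed."""
    port = pick_free_port(ip)                    # client writes this port into a connection file
    first, first_err = try_bind(ip, port)        # kernel A binds first and wins
    second, second_err = try_bind(ip, port)      # kernel B is told to use the same port
    result = {
        "port": port,
        "first_bound": first is not None,
        "second_errno": second_err,
    }
    for sock in (first, second):
        if sock is not None:
            sock.close()
    return result


if __name__ == "__main__":
    outcome = simulate_race()
    print(outcome["first_bound"], outcome["second_errno"] == errno.EADDRINUSE)
```

The second bind fails with `EADDRINUSE`, which is the same condition the `zmq.error.ZMQError: Address already in use` above reports. The window between releasing the probe socket and the kernel's own bind is where another kernel (or any other process) can take the port.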

Is this behavior spec'ed somewhere? Is it the client's responsibility to retry (e.g., is this why I occasionally see the "kernel has died, restarting" flow when first opening a notebook)? Can the behavior be improved?

/cc @minrk
