process_exited(p) returns false for <defunct> zombie object on macos #54145

@NHDaly

I'm not sure how I've ended up with a zombie process; this probably indicates a bug somewhere else in our package, but it does seem like this shouldn't be possible.

However, we have a loop that looks like this, and it never terminates:

        while !process_exited(proc)
            Core.println(getpid(proc))
            Core.println(process_exited(proc))
            line = readline(proc)
            # ... do stuff with line ...
        end

I can see that it's stuck there, thanks to ^T (SIGINFO):

======================================================================================
Information request received. A stacktrace will print followed by a 1.0 second profile
======================================================================================

signal (29): Information request: 29
ijl_array_to_string at /Users/nathandaly/builds/julia-1.10/src/array.c:486
#readline#443 at ./array.jl:0
readline at ./io.jl:560 [inlined]
redirect_worker_output at /Users/nathandaly/.julia/dev/ReTestItems/src/workers.jl:216
#8 at /Users/nathandaly/.julia/dev/ReTestItems/src/workers.jl:190
unknown function (ip: 0x119764097)
_jl_invoke at /Users/nathandaly/builds/julia-1.10/src/gf.c:0 [inlined]
ijl_apply_generic at /Users/nathandaly/builds/julia-1.10/src/gf.c:3076
jl_apply at /Users/nathandaly/builds/julia-1.10/src/./julia.h:1982 [inlined]
start_task at /Users/nathandaly/builds/julia-1.10/src/task.c:1238
unknown function (ip: 0x0)

and indeed, with the newly added Core.printlns, it's printing this:

61300
false
61300
false
...

The child process that it's reading from is dead, though, and has somehow wound up as a zombie:

nathandaly       61300   0.0  0.0        0      0   ??  Z     5:53PM   0:00.00 <defunct>

It printed this message before it called exit(0):

  Worker 61300:  ┌ Debug: Shutting down worker 61300
  Worker 61300:  └ @ ReTestItems.Workers ~/.julia/dev/ReTestItems/src/workers.jl:321

I'm not sure why it's a zombie, since the parent process should be checking its return status. And I'm not sure why neither process_exited(proc) nor readline(proc) is detecting that the process has ended.
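
For comparison, here is a minimal sketch of a read loop that keys off end-of-file on the pipe instead of process_exited. It assumes the child's stdout is closed when the child exits; the child command is a hypothetical stand-in for the worker, not the ReTestItems code:

```julia
# Hypothetical stand-in for the worker: a child that prints two lines and exits.
code = "println(\"hello\"); println(\"world\"); exit(0)"
proc = open(`$(Base.julia_cmd()) --startup-file=no -e $code`, "r")

lines = String[]
# eof blocks until data is available or the pipe is closed, so the loop ends
# when the child's stdout closes rather than when the exit status is recorded.
while !eof(proc)
    push!(lines, readline(proc))
end

wait(proc)  # wait for the exit notification before inspecting the status
println(lines, " exited=", process_exited(proc))
```

This does not explain the zombie; it only avoids depending on process_exited as the loop condition.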
Annoyingly, sticking an @info in the loop jostles whatever race condition is happening, and it doesn't get stuck.
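
One way to probe this, sketched under the unconfirmed assumption that the loop just needs to yield so the exit notification can be processed, is to poll process_exited through timedwait, which sleeps between checks. The child command and the timeout values below are hypothetical:

```julia
# Hypothetical stand-in for the worker: a child that exits immediately.
code = "exit(0)"
proc = run(`$(Base.julia_cmd()) --startup-file=no -e $code`; wait=false)

# timedwait polls the callback on a timer, so the task yields between checks
# and the event loop gets a chance to process the child's exit notification.
status = timedwait(() -> process_exited(proc), 60.0; pollint=0.05)

if status === :ok
    println("child exited with code ", proc.exitcode)
else
    println("child still not marked as exited after the timeout")
end
```

If the real loop still reports false under this kind of polling, the problem is more likely in how the exit status is collected than in the loop never yielding.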


System report:

julia> versioninfo()
Julia Version 1.10.2
Commit bd47eca2c8 (2024-03-01 10:14 UTC)
Build Info:

    Note: This is an unofficial build, please report bugs to the project
    responsible for this build and not to the Julia project unless you can
    reproduce the issue using official builds available at https://julialang.org/downloads

Platform Info:
  OS: macOS (arm64-apple-darwin23.4.0)
  CPU: 12 × Apple M2 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Environment:
  JULIA_SSL_CA_ROOTS_PATH = 

Labels: io (Involving the I/O subsystem: libuv, read, write, etc.), system:mac (Affects only macOS)
