Distributed IOError: write: bad address in system call argument (EFAULT) #33899

@oxinabox

Sending a closure that captures a large array to workers via pmap sometimes fails with EFAULT.
I don't have a truly minimal reproducer, but the example below is heavily cut down from the original code.

MVE

My full Project.toml and Manifest.toml can be found in this gist.

using Distributed
addprocs(3)
@everywhere using Pkg
@everywhere Pkg.activate(".")
@everywhere using Dates, TimeZones, Intervals

const dts = DateTime(2000,11,10) .+ Hour.(0:24*60)
const zdts = ZonedDateTime.(dts, localzone())
const ivals = Interval.(zdts, zdts.+Hour(1))
const many = rand(ivals, 297_944)  # Need quite a few of these

mkfun(data) =  ii->(ii, Distributed.myid(), sizeof(data))
const fun2 = mkfun(many)

# One of these will likely fail
pmap(fun2, CachingPool(workers()), 1:10)
pmap(fun2, CachingPool(workers()), 1:10)
pmap(fun2, CachingPool(workers()), 1:10)
pmap(fun2, CachingPool(workers()), 1:10)
pmap(fun2, CachingPool(workers()), 1:10)
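Because the failure is intermittent, it can help to run the call in a loop and record which attempts throw, rather than repeating the line by hand. The following is only a sketch: `collect_failures` is a hypothetical helper, not part of Distributed or of the original report.

```julia
# Hypothetical helper (not from the original report): call `f` up to `n` times
# and collect (attempt number, exception) pairs for every attempt that throws.
function collect_failures(f, n::Integer)
    failures = Tuple{Int,Any}[]
    for attempt in 1:n
        try
            f()
        catch err
            # Record the failure and keep going so later attempts still run.
            push!(failures, (attempt, err))
        end
    end
    return failures
end

# Usage with the definitions from the MVE above (requires `using Distributed`,
# the worker setup, and `fun2`):
# failures = collect_failures(() -> pmap(fun2, CachingPool(workers()), 1:10), 5)
# foreach(println, failures)
```

An empty result does not show the bug is absent; it only means none of the attempts happened to fail.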

Error:

ERROR: IOError: write: bad address in system call argument (EFAULT)
Stacktrace:
 [1] (::getfield(Base, Symbol("##684#686")))(::Task) at ./asyncmap.jl:178
 [2] foreach(::getfield(Base, Symbol("##684#686")), ::Array{Any,1}) at ./abstractarray.jl:1835
 [3] maptwice(::Function, ::Channel{Any}, ::Array{Any,1}, ::UnitRange{Int64}) at ./asyncmap.jl:178
 [4] #async_usemap#669 at ./asyncmap.jl:154 [inlined]
 [5] #async_usemap at ./none:0 [inlined]
 [6] #asyncmap#668 at ./asyncmap.jl:81 [inlined]
 [7] #asyncmap at ./none:0 [inlined]
 [8] #pmap#213(::Bool, ::Int64, ::Nothing, ::Array{Any,1}, ::Nothing, ::Function, ::Function, ::CachingPool, ::UnitRange{Int64}) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.0/Distributed/src/pmap.jl:126
 [9] pmap(::Function, ::CachingPool, ::UnitRange{Int64}) at /Users/julia/buildbot/worker/package_macos64/build/usr/share/julia/stdlib/v1.0/Distributed/src/pmap.jl:101
 [10] top-level scope at none:0

Versions

So far I have only reproduced this on the LTS release.
It might also happen on other versions.
I have seen it on both macOS and Linux.

julia> versioninfo()
Julia Version 1.0.5
Commit 3af96bcefc (2019-09-09 19:06 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin18.6.0)
  CPU: Intel(R) Core(TM) i7-8559U CPU @ 2.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)

Labels: io (Involving the I/O subsystem: libuv, read, write, etc.), parallelism (Parallel or distributed computation)