Description
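Benchmarking a function that returns a tuple of NTuple{N,Core.VecElement{T}} vectors causes a segmentation fault when N * sizeof(T) == 64.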
I am showing results on a (4-day-old) master below, but I see the same behaviour on Julia 1.0.3:
julia> versioninfo()
Julia Version 1.2.0-DEV.12
Commit 77a7d92e91 (2018-12-13 21:20 UTC)
Platform Info:
OS: Linux (x86_64-redhat-linux)
CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
WORD_SIZE: 64
LIBM: libimf
LLVM: libLLVM-6.0.1 (ORCJIT, skylake)
julia> bigvec() = ntuple(i -> Core.VecElement(1.0), Val(16))
bigvec (generic function with 1 method)
julia> twovecs() = (ntuple(i -> Core.VecElement(1.0), Val(8)),ntuple(i -> Core.VecElement(1.0), Val(8)))
twovecs (generic function with 1 method)
julia> bigvec()
(VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0))
julia> twovecs()
((VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0)), (VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0), VecElement{Float64}(1.0)))
julia> bigvec() |> typeof |> Base.datatype_alignment
16
julia> twovecs() |> typeof |> Base.datatype_alignment
16
julia> bigvec() |> typeof |> sizeof
128
julia> twovecs() |> typeof |> sizeof
128
julia> using BenchmarkTools
julia> @benchmark bigvec()
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 4607182418800017408
--------------
minimum time: 0.015 ns (0.00% GC)
median time: 0.017 ns (0.00% GC)
mean time: 0.018 ns (0.00% GC)
maximum time: 2.991 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark twovecs()
signal (11): Segmentation fault
in expression starting at no file:0
#_run#18 at /home/chriselrod/.julia/packages/BenchmarkTools/dtwnm/src/execution.jl:336
unknown function (ip: 0x7f3f05708efb)
jl_fptr_trampoline at /home/chriselrod/Documents/languages/jdev/src/gf.c:1854
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
inner at ./none:0
jl_fptr_trampoline at /home/chriselrod/Documents/languages/jdev/src/gf.c:1854
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
jl_apply at /home/chriselrod/Documents/languages/jdev/src/julia.h:1571 [inlined]
jl_f__apply at /home/chriselrod/Documents/languages/jdev/src/builtins.c:556
jl_f__apply_latest at /home/chriselrod/Documents/languages/jdev/src/builtins.c:594
#invokelatest#1 at ./essentials.jl:746 [inlined]
#invokelatest at ./none:0 [inlined]
#run_result#16 at /home/chriselrod/.julia/packages/BenchmarkTools/dtwnm/src/execution.jl:32 [inlined]
#run_result at ./none:0 [inlined]
#run#18 at /home/chriselrod/.julia/packages/BenchmarkTools/dtwnm/src/execution.jl:46
jl_fptr_trampoline at /home/chriselrod/Documents/languages/jdev/src/gf.c:1854
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
#run at ./none:0 [inlined]
#run at ./none:0 [inlined]
#warmup#21 at /home/chriselrod/.julia/packages/BenchmarkTools/dtwnm/src/execution.jl:79 [inlined]
warmup at /home/chriselrod/.julia/packages/BenchmarkTools/dtwnm/src/execution.jl:79
jl_fptr_trampoline at /home/chriselrod/Documents/languages/jdev/src/gf.c:1854
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
do_call at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:323
eval_value at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:411
eval_stmt_value at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:362 [inlined]
eval_body at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:759
jl_interpret_toplevel_thunk_callback at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:885
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7f3f162af98f)
unknown function (ip: 0x7)
jl_interpret_toplevel_thunk at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:894
jl_toplevel_eval_flex at /home/chriselrod/Documents/languages/jdev/src/toplevel.c:764
jl_toplevel_eval_in at /home/chriselrod/Documents/languages/jdev/src/toplevel.c:793
eval at ./boot.jl:328
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
eval_user_input at /home/chriselrod/Documents/languages/jdev/usr/share/julia/stdlib/v1.2/REPL/src/REPL.jl:85
run_backend at /home/chriselrod/.julia/packages/Revise/gStbk/src/Revise.jl:771
#58 at ./task.jl:259
jl_fptr_trampoline at /home/chriselrod/Documents/languages/jdev/src/gf.c:1854
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2209
jl_apply at /home/chriselrod/Documents/languages/jdev/src/julia.h:1571 [inlined]
start_task at /home/chriselrod/Documents/languages/jdev/src/task.c:572
unknown function (ip: 0xffffffffffffffff)
Allocations: 25427851 (Pool: 25422691; Big: 5160); GC: 64
Segmentation fault (core dumped)
When I use a tuple of two 8-element vectors (64 bytes each), I get a segmentation fault.
When I use a tuple of two 4-element vectors (32 bytes each), I do not.
It is the number of bytes per vector that matters. With Float32, NTuple{32,...} does not segfault (although the allocation estimate is still reported incorrectly), but NTuple{2,NTuple{16,...}} does.
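To make the byte arithmetic concrete, here is a minimal sketch that computes N * sizeof(T) for a vector type. The helper name vecbytes is hypothetical (it is not part of Julia or of the session above), and it only does the multiplication; it does not inspect the actual memory layout.

```julia
# Hypothetical helper: payload size in bytes of an NTuple{N,VecElement{T}},
# computed as N * sizeof(T). This is plain arithmetic, not a layout query.
vecbytes(::Type{NTuple{N,Core.VecElement{T}}}) where {N,T} = N * sizeof(T)

vecbytes(NTuple{8,Core.VecElement{Float64}})   # 64: tuples of these segfault
vecbytes(NTuple{16,Core.VecElement{Float32}})  # 64: tuples of these segfault
vecbytes(NTuple{4,Core.VecElement{Float64}})   # 32: tuples of these are fine
```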
I can reproduce the segmentation faults on both Ryzen and Skylake-X. I can try Haswell later.
This is obviously of more interest on Skylake-X (and other AVX-512) architectures. Elsewhere, because tuples of 32-byte vectors don't cause segfaults, I would simply not construct tuples of 64-byte vectors.
The workaround -- concatenating and then sub-setting larger vectors -- is a little awkward.
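For illustration, a rough sketch of what that workaround could look like. The names Vec, concat and subvec are made up for this example, and this is only one way to do it: the two 64-byte vectors are stored as a single wider vector (which did not segfault above) and split again when needed.

```julia
# Hypothetical alias and helpers, for illustration only.
const Vec{N,T} = NTuple{N,Core.VecElement{T}}

# Concatenate two vectors into one wider vector instead of a tuple of vectors.
concat(a::Vec{N,T}, b::Vec{M,T}) where {N,M,T} = (a..., b...)

# Extract the W-element sub-vector starting at 1-based index `start`.
subvec(v::Vec{N,T}, ::Val{W}, start::Int) where {N,T,W} =
    ntuple(i -> v[start + i - 1], Val(W))

lo = ntuple(i -> Core.VecElement(Float64(i)), Val(8))
hi = ntuple(i -> Core.VecElement(Float64(i + 8)), Val(8))

wide = concat(lo, hi)            # one 16-element vector, not a tuple of two
lo2  = subvec(wide, Val(8), 1)   # elements 1:8
hi2  = subvec(wide, Val(8), 9)   # elements 9:16
```

Whether the concatenation and sub-setting compile down to cheap shuffles depends on the compiler; this sketch makes no claim about performance.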