Bug: server (at least) crashes using VULKAN #7769

@metal3d

Description

What happened?

Hi,
I compiled the server using the Vulkan backend (since OpenCL support was removed :sad:). I can start the server with a model, but as soon as an inference is requested, I get an error message.

PS: Vulkan is theoretically not intended to replace OpenCL
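For reference, the build and launch steps were roughly as follows (a sketch only: the make flag, binary name, and model path are assumptions based on llama.cpp builds from that period, not taken verbatim from the report):

```shell
# Build llama.cpp with the Vulkan backend enabled
make LLAMA_VULKAN=1

# Start the server with a model; the assert fires on the first inference request
./server -m ./models/model.gguf --host 127.0.0.1 --port 8080
```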

GGML_ASSERT: /home/metal3d/Projects/ML/llama.cpp/ggml-vulkan.cpp:4069: d_D != nullptr
[New LWP 787893]
[New LWP 787894]
[New LWP 787895]
[New LWP 787897]
[New LWP 787900]
[New LWP 787917]
[New LWP 787918]
[New LWP 787919]
[New LWP 787924]
[New LWP 787925]
[New LWP 787926]
[New LWP 787927]
[New LWP 787928]
[New LWP 787929]
[New LWP 787930]
[New LWP 787931]

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f5864430e03 in wait4 () from /lib64/libc.so.6
#0  0x00007f5864430e03 in wait4 () from /lib64/libc.so.6
#1  0x00000000005d811b in ggml_print_backtrace ()
#2  0x000000000067dafc in void ggml_vk_op_f32<vk_op_unary_push_constants>(ggml_backend_vk_context*, vk_context*, ggml_tensor const*, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, ggml_op, vk_op_unary_push_constants const&&) [clone .constprop.0] ()
#3  0x00000000006a1507 in ggml_backend_vk_graph_compute(ggml_backend*, ggml_cgraph*) ()
#4  0x000000000062aad4 in ggml_backend_sched_graph_compute_async ()
#5  0x000000000056acb9 in llama_decode_internal(llama_context&, llama_batch) [clone .isra.0] ()
#6  0x000000000056c949 in llama_decode ()
#7  0x00000000004e0171 in server_context::update_slots() ()
#8  0x00000000004b1978 in server_queue::start_loop() ()
#9  0x00000000004552cc in main ()
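For readers unfamiliar with GGML_ASSERT: the failing check guards a device buffer pointer, meaning the destination tensor's Vulkan buffer (`d_D`) was never resolved, so the op has nowhere to write its result. A simplified sketch of this pattern (illustrative only; `MY_GGML_ASSERT`, `vk_buffer_like`, and `op_f32_sketch` are stand-ins, not the actual llama.cpp source):

```cpp
#include <cstdio>
#include <cstdlib>

// Illustrative only: GGML_ASSERT aborts with file/line/condition when the
// check fails, similar in spirit to this simplified macro.
#define MY_GGML_ASSERT(x)                                        \
    do {                                                         \
        if (!(x)) {                                              \
            fprintf(stderr, "GGML_ASSERT: %s:%d: %s\n",          \
                    __FILE__, __LINE__, #x);                     \
            abort();                                             \
        }                                                        \
    } while (0)

// Stand-in for the real Vulkan device buffer handle.
struct vk_buffer_like {};

// If the destination tensor's Vulkan buffer was never allocated or looked
// up, d_D is null and the compute shader cannot write its output.
void op_f32_sketch(vk_buffer_like * d_D) {
    MY_GGML_ASSERT(d_D != nullptr);
    // ... dispatch the compute shader writing into d_D ...
}
```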

Name and Version

version adc9ff3

Correction: 2b33896

What operating system are you seeing the problem on?

Linux Fedora 40

Relevant log output

(Same GGML_ASSERT failure and backtrace as shown above.)

Metadata

    Labels
    bug-unconfirmed; high severity (used to report high-severity bugs in llama.cpp: malfunctioning hinders an important workflow)
