Closed
Labels
bug-unconfirmed, high severity (used to report high severity bugs in llama.cpp; malfunctioning hinders important workflow)
Description
What happened?
Hi,
I compiled the server with the Vulkan backend (since OpenCL was removed :sad:). I can start the server with a model, but as soon as an inference is requested, I get the error below (full backtrace under "Relevant log output").
PS: Vulkan is theoretically not intended to replace OpenCL
GGML_ASSERT: /home/metal3d/Projects/ML/llama.cpp/ggml-vulkan.cpp:4069: d_D != nullptr
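For reference, a minimal sketch of the build and launch steps that lead to this crash; the model path, port, and prompt are placeholders, and older checkouts may spell the CMake option `LLAMA_VULKAN` instead of `GGML_VULKAN`:

```sh
# Assumed reproduction steps; paths and flags are illustrative.
cmake -B build -DGGML_VULKAN=ON        # older trees: -DLLAMA_VULKAN=ON
cmake --build build --config Release -j

# Launch the server with any GGUF model (the binary was named 'server'
# before the mid-2024 rename), then send a single completion request;
# the assertion fires as soon as inference starts.
./build/bin/llama-server -m models/model.gguf -ngl 99
curl http://localhost:8080/completion -d '{"prompt": "Hello"}'
```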
Name and Version
version adc9ff3
Sorry, correction: 2b33896
What operating system are you seeing the problem on?
Linux Fedora 40
Relevant log output
GGML_ASSERT: /home/metal3d/Projects/ML/llama.cpp/ggml-vulkan.cpp:4069: d_D != nullptr
[New LWP 787893]
[New LWP 787894]
[New LWP 787895]
[New LWP 787897]
[New LWP 787900]
[New LWP 787917]
[New LWP 787918]
[New LWP 787919]
[New LWP 787924]
[New LWP 787925]
[New LWP 787926]
[New LWP 787927]
[New LWP 787928]
[New LWP 787929]
[New LWP 787930]
[New LWP 787931]
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f5864430e03 in wait4 () from /lib64/libc.so.6
#0 0x00007f5864430e03 in wait4 () from /lib64/libc.so.6
#1 0x00000000005d811b in ggml_print_backtrace ()
#2 0x000000000067dafc in void ggml_vk_op_f32<vk_op_unary_push_constants>(ggml_backend_vk_context*, vk_context*, ggml_tensor const*, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, ggml_op, vk_op_unary_push_constants const&&) [clone .constprop.0] ()
#3 0x00000000006a1507 in ggml_backend_vk_graph_compute(ggml_backend*, ggml_cgraph*) ()
#4 0x000000000062aad4 in ggml_backend_sched_graph_compute_async ()
#5 0x000000000056acb9 in llama_decode_internal(llama_context&, llama_batch) [clone .isra.0] ()
#6 0x000000000056c949 in llama_decode ()
#7 0x00000000004e0171 in server_context::update_slots() ()
#8 0x00000000004b1978 in server_queue::start_loop() ()
#9 0x00000000004552cc in main ()