This repository was archived by the owner on Jul 4, 2025. It is now read-only.
Open
2 changes: 1 addition & 1 deletion llama.cpp
Submodule llama.cpp updated 99 files
+91 −0 .github/workflows/build-linux-cross.yml
+2 −0 common/CMakeLists.txt
+125 −105 common/chat.cpp
+2 −0 common/chat.h
+19 −0 common/common.cpp
+4 −4 common/common.h
+9 −5 common/minja/chat-template.hpp
+69 −36 common/minja/minja.hpp
+204 −0 common/regex-partial.cpp
+56 −0 common/regex-partial.h
+23 −2 convert_hf_to_gguf.py
+2 −0 docs/backend/SYCL.md
+1 −1 docs/multimodal.md
+1 −0 ggml/CMakeLists.txt
+2 −2 ggml/src/ggml-cpu/CMakeLists.txt
+195 −0 ggml/src/ggml-cpu/ggml-cpu-quants.c
+4 −0 ggml/src/ggml-cpu/ggml-cpu.c
+1 −0 ggml/src/ggml-cpu/kleidiai/kernels.h
+2 −0 ggml/src/ggml-cpu/kleidiai/kleidiai.cpp
+13 −5 ggml/src/ggml-cuda/fattn-common.cuh
+260 −63 ggml/src/ggml-cuda/fattn-mma-f16.cuh
+2 −1 ggml/src/ggml-cuda/fattn.cu
+1 −1 ggml/src/ggml-cuda/ggml-cuda.cu
+2 −0 ggml/src/ggml-cuda/mmq.cu
+7 −6 ggml/src/ggml-cuda/quantize.cu
+1 −1 ggml/src/ggml-metal/ggml-metal.m
+5 −0 ggml/src/ggml-metal/ggml-metal.metal
+0 −2 ggml/src/ggml-opencl/ggml-opencl.cpp
+26 −22 ggml/src/ggml-sycl/CMakeLists.txt
+121 −232 ggml/src/ggml-sycl/binbcast.cpp
+29 −2 ggml/src/ggml-sycl/convert.cpp
+59 −21 ggml/src/ggml-sycl/dequantize.hpp
+7 −1 ggml/src/ggml-sycl/dmmv.cpp
+0 −23 ggml/src/ggml-sycl/element_wise.cpp
+37 −8 ggml/src/ggml-sycl/gemm.hpp
+192 −82 ggml/src/ggml-sycl/ggml-sycl.cpp
+29 −2 ggml/src/ggml-sycl/mmvq.cpp
+22 −0 ggml/src/ggml-sycl/quants.hpp
+69 −43 ggml/src/ggml-sycl/vecdotq.hpp
+78 −89 ggml/src/ggml-vulkan/CMakeLists.txt
+184 −53 ggml/src/ggml-vulkan/ggml-vulkan.cpp
+17 −0 ggml/src/ggml-vulkan/vulkan-shaders/CMakeLists.txt
+4 −3 ggml/src/ggml-vulkan/vulkan-shaders/flash_attn.comp
+506 −0 ggml/src/ggml-vulkan/vulkan-shaders/flash_attn_cm1.comp
+12 −1 ggml/src/ggml-vulkan/vulkan-shaders/vulkan-shaders-gen.cpp
+33 −33 ggml/src/gguf.cpp
+3 −0 gguf-py/gguf/constants.py
+7 −3 gguf-py/gguf/scripts/gguf_editor_gui.py
+30 −29 gguf-py/gguf/tensor_mapping.py
+1 −1 include/llama.h
+329 −121 scripts/compare-llama-bench.py
+1 −1 scripts/sync-ggml.last
+3 −0 src/llama-arch.cpp
+4 −2 src/llama-context.cpp
+8 −0 src/llama-kv-cache.cpp
+4 −10 src/llama-kv-cache.h
+12 −7 src/llama-model-loader.cpp
+207 −33 src/llama-model.cpp
+13 −11 src/llama-quant.cpp
+5 −0 src/llama.cpp
+1 −0 tests/CMakeLists.txt
+3 −1 tests/test-chat.cpp
+288 −0 tests/test-regex-partial.cpp
+2 −2 tools/batched-bench/batched-bench.cpp
+4 −3 tools/llama-bench/README.md
+92 −27 tools/llama-bench/llama-bench.cpp
+0 −35 tools/mtmd/CMakeLists.txt
+0 −44 tools/mtmd/README-quantize.md
+4 −3 tools/mtmd/README.md
+0 −53 tools/mtmd/android/adb_run.sh
+0 −8 tools/mtmd/android/build_64.sh
+0 −59 tools/mtmd/clip-quantize-cli.cpp
+0 −156 tools/mtmd/clip.cpp
+45 −81 tools/mtmd/clip.h
+0 −0 tools/mtmd/legacy-models/convert_image_encoder_to_gguf.py
+0 −0 tools/mtmd/legacy-models/glmedge-convert-image-encoder-to-gguf.py
+0 −0 tools/mtmd/legacy-models/glmedge-surgery.py
+0 −0 tools/mtmd/legacy-models/llava_surgery.py
+0 −0 tools/mtmd/legacy-models/llava_surgery_v2.py
+0 −0 tools/mtmd/legacy-models/minicpmv-convert-image-encoder-to-gguf.py
+0 −0 tools/mtmd/legacy-models/minicpmv-surgery.py
+0 −591 tools/mtmd/llava.cpp
+0 −49 tools/mtmd/llava.h
+0 −636 tools/mtmd/qwen2vl-test.cpp
+13 −76 tools/quantize/quantize.cpp
+1 −1 tools/server/README.md
+ tools/server/public/index.html.gz
+24 −13 tools/server/server.cpp
+12 −0 tools/server/tests/unit/test_completion.py
+49 −0 tools/server/tests/unit/test_template.py
+1 −1 tools/server/tests/unit/test_tool_call.py
+13 −1 tools/server/utils.hpp
+280 −6 tools/server/webui/package-lock.json
+3 −1 tools/server/webui/package.json
+4 −0 tools/server/webui/src/Config.ts
+34 −1 tools/server/webui/src/components/ChatScreen.tsx
+14 −11 tools/server/webui/src/components/SettingDialog.tsx
+111 −4 tools/server/webui/src/components/useChatExtraContext.tsx
+5 −4 tools/server/webui/vite.config.ts