Skip to content

Conversation

@ggerganov
Copy link
Member

fix #16983

Perform the check for context shifting only while the slot is generating tokens. Should not perform it during the start or prompt processing phase.

@ggerganov ggerganov requested a review from ngxson as a code owner November 4, 2025 13:51
cchadowitz added a commit to Pixel-Forensics-Inc/llama.cpp that referenced this pull request Nov 4, 2025
@ggerganov ggerganov merged commit 66d8ecc into master Nov 4, 2025
65 of 67 checks passed
@ggerganov ggerganov deleted the gg/server-fix-ctx-shift-check branch November 4, 2025 17:21
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Nov 5, 2025
* origin/master: (21 commits)
vulkan: Fix GGML_VULKAN_CHECK_RESULTS to better handle fusion (ggml-org#16919)
examples(gguf): GGUF example outputs (ggml-org#17025)
mtmd: allow QwenVL to process larger image by default (ggml-org#17020)
server : do not default to multiple slots with speculative decoding (ggml-org#17017)
mtmd: improve struct initialization (ggml-org#16981)
docs: Clarify the endpoint that webui uses (ggml-org#17001)
model : add openPangu-Embedded (ggml-org#16941)
ggml webgpu: minor set rows optimization (ggml-org#16810)
sync : ggml
ggml : fix conv2d_dw SVE path (ggml/1380)
CUDA: update ops.md (ggml-org#17005)
opencl: update doc (ggml-org#17011)
refactor: replace sprintf with snprintf for safer string handling in dump functions (ggml-org#16913)
vulkan: remove the need for the dryrun (ggml-org#16826)
server : do context shift only while generating (ggml-org#17000)
readme : update hot topics (ggml-org#17002)
ggml-cpu : bicubic interpolation (ggml-org#16891)
ci : apply model label to models (ggml-org#16994)
chore : fix models indent after refactor (ggml-org#16992)
Fix garbled output with REPACK at high thread counts (ggml-org#16956)
...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Regression: "context shift is disabled" after filling context

2 participants