Skip to content

Regression: "context shift is disabled" after filling context #16983

@nifgraup

Description

@nifgraup

Name and Version

$ ./build/bin/llama-server --version
version: 6940 (c5023da)
built with cc (Debian 14.2.0-19) 14.2.0 for x86_64-linux-gnu

Operating systems

No response

Which llama.cpp modules do you know to be affected?

No response

Command line

Problem description & steps to reproduce

  • start llama-server with any model and a small context for testing
  • make a new conversation at http://localhost:8080/ and chat until context is full
  • switch to a new conversation and try to chat

What happens:
UI reports "Server Error
The server responded with an error message. Review the details below.
No response received from server. Please try again."

What should happen:
The second conversation should work.

First Bad Commit

commit b52edd25586fabb70f0c21b274473b307cf14499
Author: Georgi Gerganov <[email protected]>
Date:   Thu Oct 30 18:42:57 2025 +0200

    server : remove n_past (#16818)

Relevant log output

srv  params_from_: Chat format: Content-only
slot get_availabl: id  3 | task -1 | selected slot by LCP similarity, sim_best = 0.333 (> 0.100 thold), f_keep = 0.030
srv  get_availabl: updating prompt cache
srv   prompt_save:  - saving prompt with length 99, total state size = 10.830 MiB
srv         alloc:  - prompt is already in the cache, skipping
srv          load:  - looking for better prompt, base f_keep = 0.030, sim = 0.333
srv        update:  - cache state: 1 prompts, 10.830 MiB (limits: 8192.000 MiB, 100 tokens, 74885 est)
srv        update:    - prompt 0x55ad16cdda50:      99 tokens, checkpoints:  0,    10.830 MiB
srv  get_availabl: prompt cache update took 0.04 ms
slot launch_slot_: id  3 | task 25 | processing task
srv    send_error: task id = 25, error: context shift is disabled
slot      release: id  3 | task 25 | stop processing: n_tokens = 99, truncated = 0
srv  update_slots: no tokens to decode
srv  update_slots: all slots are idle
srv  cancel_tasks: cancel task, id_task = 25
srv  update_slots: all slots are idle

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions