
Conversation

l3utterfly (Contributor)

Currently, saving/loading the KV cache of recurrent memory crashes because layers can be null.

This mainly applies to the new LiquidAI/LFM2 models.

Tested with: https://huggingface.co/LiquidAI/LFM2-350M-GGUF

handle saving/loading null layers in recurrent memory
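
To illustrate the failure mode, here is a minimal sketch with illustrative names (not the exact code in llama-memory-recurrent.cpp): in hybrid architectures such as LFM2, only some layers carry recurrent state, so the per-layer state entries can be null and the save/load loops have to skip them instead of dereferencing them.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative stand-in for a per-layer recurrent state buffer; in hybrid
// models (e.g. LFM2), attention-only layers leave this entry null.
struct layer_state {
    const uint8_t * data = nullptr; // null => this layer has no recurrent state
    size_t          size = 0;
};

// Serialize only the layers that actually hold recurrent state; before the
// fix, dereferencing a null layer in a loop like this is what crashed.
size_t write_recurrent_state(const std::vector<layer_state> & layers, std::vector<uint8_t> & out) {
    size_t n_written = 0;
    for (const layer_state & l : layers) {
        if (l.data == nullptr) {
            continue; // null layer: nothing to save, skip rather than crash
        }
        out.insert(out.end(), l.data, l.data + l.size);
        n_written += l.size;
    }
    return n_written;
}
```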
ggerganov requested a review from compilade on July 14, 2025
compilade (Collaborator) left a comment


Thanks @l3utterfly! I've tested this with a Jamba model and llama-save-load-state; it was indeed failing before and is fixed by this change.

I'll add a test case to #14139 (once I also add variants for hybrid models) to help automatically detecting this kind of regression with hybrid architectures in the future.
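
For context, the round-trip that llama-save-load-state exercises reduces to the public llama_state_* API. A minimal sketch (error handling omitted, ctx assumed to be an initialized llama_context):

```cpp
#include <vector>
#include "llama.h"

// Minimal save/restore round-trip using the public llama_state_* API;
// this is the path that previously crashed when the recurrent memory of a
// hybrid model contained null layers.
void save_and_restore(llama_context * ctx) {
    // serialize the full context state, including the recurrent memory
    std::vector<uint8_t> state(llama_state_get_size(ctx));
    const size_t written = llama_state_get_data(ctx, state.data(), state.size());

    // ... later: restore the saved state into the context
    llama_state_set_data(ctx, state.data(), written);
}
```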

ggerganov merged commit 7233358 into ggml-org:master on Jul 23, 2025
47 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Jul 23, 2025
* origin/master: (49 commits)
ci : correct label refactor->refactoring (ggml-org#14832)
CUDA: fix quantized KV cache + multiple sequences (ggml-org#14822)
tests : add non-cont K,V FA tests
memory : handle saving/loading null layers in recurrent memory (ggml-org#14675)
ggml: fix loongarch quantize_row_q8_1 error (ggml-org#14827)
CANN: weight format to NZ for Ascend310P3 (ggml-org#14407)
CUDA: add fused rms norm (ggml-org#14800)
ggml : model card yaml tab->2xspace (ggml-org#14819)
vulkan: fix rms_norm_mul to handle broadcasting dim0 (ggml-org#14817)
llama : add model type detection for rwkv7 7B&14B (ggml-org#14816)
imatrix: add option to display importance score statistics for a given imatrix file (ggml-org#12718)
Mtmd: add a way to select device for vision encoder (ggml-org#14236)
cuda : implement bf16 cpy ops and enable bf16 cont (ggml-org#14763)
opencl: remove unreachable `return` (ggml-org#14806)
server : allow setting `--reverse-prompt` arg (ggml-org#14799)
cuda: remove linking to cublasLt (ggml-org#14790)
opencl: fix `im2col` when `KW!=KH` (ggml-org#14803)
opencl: add conv2d kernel (ggml-org#14403)
sycl: Fix im2col (ggml-org#14797)
kleidiai: add support for get_rows (ggml-org#14676)
...
taronaeo pushed a commit to taronaeo/llama.cpp-s390x that referenced this pull request Jul 25, 2025
memory : handle saving/loading null layers in recurrent memory (ggml-org#14675)

* Update llama-memory-recurrent.cpp

handle saving/loading null layers in recurrent memory

* fixed styling issues and updated comments

* fix styling issue

Co-authored-by: Sigbjørn Skjæret <[email protected]>

---------

Co-authored-by: Sigbjørn Skjæret <[email protected]>
l3utterfly deleted the rmem-save-load-fix branch on August 24, 2025