It'd be great if we could have the analog of https://github.com/ggerganov/llama.cpp/pull/11117 I'll try to put something together when I get the time if needed.