Description
First became aware of the issue when running the latest KoboldCpp release: previously workable configs started failing, as discussed at LostRuins#1805.
Running llama-bench from llama-b6567-bin-win-vulkan-x64 on an RX 480 8GB with a --ubatch-size of 512 and an --n-prompt value of first 14592, then 14593, the Vulkan0 compute buffer size is 1022.50 MiB and 1024.69 MiB, respectively.
Repeating the same process with llama-b6568-bin-win-vulkan-x64, again with --n-prompt values of 14592 and 14593, the compute buffer size is 1022.50 MiB and 1910.13 MiB, respectively: a sizeable discrepancy.
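For anyone who wants to reproduce this, here is a sketch of the invocations I ran. The model path is a placeholder (substitute whatever GGUF you use); the -p/--n-prompt and -ub/--ubatch-size flags are standard llama-bench options.

```shell
# Run from inside each extracted release (b6567 and b6568) and compare
# the reported Vulkan0 compute buffer sizes between the two builds.
# model.gguf is a placeholder path, not a specific model.

# Just below the threshold: both builds should report ~1022.50 MiB
./llama-bench -m model.gguf -ub 512 -p 14592

# Just above the threshold: b6567 reports ~1024.69 MiB,
# b6568 jumps to ~1910.13 MiB
./llama-bench -m model.gguf -ub 512 -p 14593
```

Watching the "compute buffer size" line in the log output at these two adjacent -p values is enough to see the jump.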
Below n-prompt 14592, both versions behave identically all the way down. Above 14593, b6568 continues accumulating as "normal" until 15040, clocking in at 1963.75 MiB, then at 15041 drops to 1611.48 MiB. For comparison, b6567 at 15041 allocates 1055.31 MiB to the buffer, and grows steadily every step of the way.
Finally, the allocation discrepancy between versions continues to diminish as the context increases, and can be assumed to disappear entirely at some point -- or perhaps even result in memory savings? Nah, that'd be too good, I'm sure. I've only tested it up to 20k, where it's still some 450 MiB above the b6567 "standard."
Tested CLBlast and CUDA (via ZLUDA emulation) to confirm that this is a Vulkan-specific issue, and it is. I don't yet know whether it's reproducible on other hardware and would love it if someone could test this. What changed between b6567 and b6568 that could result in such behavior, and is it expected? I should also mention that the same behavior persists with b6833.