Description
Hello,
First of all, thank you for your work on llamafile; it seems like a great way to simplify model usage.
It seems from the README that, at this stage, llamafile does not support AMD GPUs.
The cuda.c in the llamafile backend seems dedicated to CUDA, while ggml-cuda.h in llama.cpp has a GGML_USE_HIPBLAS option for ROCm support. ROCm is now officially supported by llama.cpp, according to the hipBLAS section of their README.
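
For context, my (possibly incomplete) reading of how the GGML_USE_HIPBLAS path works in llama.cpp's ggml-cuda source: when that macro is defined, the CUDA runtime and cuBLAS symbols are remapped to their HIP/hipBLAS equivalents at compile time, so the same GPU code paths can be built with the ROCm toolchain (hipcc) instead of nvcc. A simplified sketch of that idea (not llamafile code):

```c
/*
 * Simplified sketch of the GGML_USE_HIPBLAS remapping approach used in
 * llama.cpp's CUDA backend (illustrative only, not actual llamafile code).
 */
#ifdef GGML_USE_HIPBLAS
#include <hip/hip_runtime.h>
#include <hipblas/hipblas.h>
/* Remap CUDA runtime / cuBLAS names to their HIP / hipBLAS equivalents. */
#define cudaMalloc             hipMalloc
#define cudaFree               hipFree
#define cudaMemcpy             hipMemcpy
#define cudaMemcpyHostToDevice hipMemcpyHostToDevice
#define cudaStream_t           hipStream_t
#define cublasHandle_t         hipblasHandle_t
#else
#include <cuda_runtime.h>
#include <cublas_v2.h>
#endif
```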
I understand that ROCm support may not be priority #1 for llamafile, but I was wondering whether you had already tried the llama.cpp hipBLAS option and have any insight into the work that would need to be done in llamafile to add this GPU family as a target.
From what I understand, llama.cpp would take care of the GPU side of things, and llamafile would need to be modified to JIT-compile llama.cpp with the correct flags, and might need a specific toolchain for the compilation (at least the ROCm SDK).
Thanks for sharing your experience on this.