
AMD GPU support via llama.cpp HIPBLAS #92

@jeromew


Hello,

First of all, thank you for your work on llamafile; it seems like a great idea for simplifying model usage.

It seems from the README that, at this stage, llamafile does not support AMD GPUs.
The cuda.c in the llamafile backend appears dedicated to CUDA, while ggml-cuda.h in llama.cpp has a GGML_USE_HIPBLAS option for ROCm support. ROCm is now officially supported by llama.cpp via hipBLAS, according to their README.
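
For reference, here is a rough sketch of the hipBLAS build invocations as I understand them from the llama.cpp README (the flag names come from their docs; the GPU target gfx1030 is only an example and would differ per card):

```sh
# Make build, from the llama.cpp tree, following their hipBLAS instructions
make LLAMA_HIPBLAS=1

# CMake build using the ROCm clang toolchain; AMDGPU_TARGETS is an example value
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
  cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release
cmake --build build
```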

I understand that ROCm support may not have been priority #1 for llamafile, but I was wondering whether you have already tried the llama.cpp HIPBLAS option and have any insights into the work that would be needed in llamafile to add this GPU family as a target.

From what I understand, llama.cpp would take care of the GPU side of things, and llamafile would need to be modified to JIT-compile llama.cpp with the correct flags, and it might also need a specific toolchain for the compilation (at least the ROCm SDK); I have sketched what I mean just below.
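
Purely as an illustration (this is a guess, not llamafile's actual mechanism; the file names, output path, and libraries are assumptions on my part), the runtime compile step could resemble invoking ROCm's hipcc on the ggml CUDA source with the hipBLAS define, analogous to the existing nvcc path:

```sh
# Hypothetical sketch only: compile ggml's CUDA backend with hipcc,
# defining GGML_USE_HIPBLAS so the hipBLAS/ROCm code paths are compiled in,
# and linking the ROCm BLAS libraries. Not llamafile's actual build command.
hipcc -O3 -fPIC -shared -DGGML_USE_HIPBLAS \
  ggml-cuda.cu -o ggml-rocm.so \
  -lhipblas -lrocblas
```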

Thanks for sharing your experience on this.
