# llama-cpp
Here are 5 public repositories matching this topic...
Review/Check GGUF files and estimate the memory usage and maximum tokens per second.
Updated Aug 18, 2025 - Go
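The entry above is about inspecting GGUF files before loading them. As a rough illustration of where such a checker starts, here is a minimal Go sketch that reads the fixed GGUF header (magic bytes, version, tensor count, metadata KV count, per the GGUF spec). It is not code from the listed repository; a real estimator would continue by walking the metadata and tensor infos to sum tensor byte sizes and KV-cache requirements.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

// ggufHeader mirrors the fixed-size prefix of a GGUF file:
// magic, version, tensor count, metadata KV count.
type ggufHeader struct {
	Magic       [4]byte
	Version     uint32
	TensorCount uint64
	KVCount     uint64
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: ggufcheck <file.gguf>")
		os.Exit(1)
	}
	f, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	var h ggufHeader
	// GGUF files are little-endian by default.
	if err := binary.Read(f, binary.LittleEndian, &h); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if string(h.Magic[:]) != "GGUF" {
		fmt.Fprintln(os.Stderr, "not a GGUF file")
		os.Exit(1)
	}
	fmt.Printf("GGUF v%d: %d tensors, %d metadata KVs\n",
		h.Version, h.TensorCount, h.KVCount)
	// A real memory estimator would parse the tensor infos that follow
	// the header and add model weights plus KV-cache size per context.
}
```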
Unified management and routing for llama.cpp, MLX, and vLLM models, with a web dashboard.
self-hosted mlx openai-api llm llamacpp llama-cpp vllm llm-inference localllm localllama llama-server llm-router mlx-lm
Updated Oct 19, 2025 - Go
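The core idea of a model router like the one above is to expose a single OpenAI-compatible endpoint and dispatch each request to the backend that hosts the requested model. The Go sketch below shows one way to do this by peeking at the `model` field and reverse-proxying; the model names, ports, and routing table are illustrative assumptions, not taken from any listed project.

```go
package main

import (
	"bytes"
	"encoding/json"
	"io"
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

// backends maps a model name to the base URL of the server hosting it.
// These entries are placeholders for locally running servers.
var backends = map[string]string{
	"llama-3-8b": "http://127.0.0.1:8080", // e.g. a llama-server instance
	"qwen-2.5":   "http://127.0.0.1:8081", // e.g. an MLX or vLLM server
}

func route(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// Peek at the OpenAI-style request body to find the model name.
	var req struct {
		Model string `json:"model"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		http.Error(w, "invalid JSON", http.StatusBadRequest)
		return
	}
	base, ok := backends[req.Model]
	if !ok {
		http.Error(w, "unknown model", http.StatusNotFound)
		return
	}
	target, err := url.Parse(base)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	// Restore the body and forward the request unchanged.
	r.Body = io.NopCloser(bytes.NewReader(body))
	httputil.NewSingleHostReverseProxy(target).ServeHTTP(w, r)
}

func main() {
	http.HandleFunc("/v1/chat/completions", route)
	log.Fatal(http.ListenAndServe(":9000", nil))
}
```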
A DevOps-friendly local LLM proxy.
inference inference-server inference-api openai-api llm openaiapi llamacpp llama-cpp local-llm localllm local-ai llm-proxy llama-api llama-server llm-router language-model-api local-lm local-llm-integration
Updated Oct 18, 2025 - Go
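From a client's perspective, a local proxy like the one above looks like any OpenAI-compatible server (llama.cpp's llama-server exposes the same `/v1/chat/completions` endpoint). Here is a minimal Go client sketch against such an endpoint; the address, port, and model name are placeholders.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Minimal request/response shapes for the OpenAI-compatible chat API.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

type chatResponse struct {
	Choices []struct {
		Message message `json:"message"`
	} `json:"choices"`
}

func main() {
	payload, _ := json.Marshal(chatRequest{
		Model: "local-model", // placeholder model name
		Messages: []message{
			{Role: "user", Content: "Say hello in one sentence."},
		},
	})
	// Placeholder address for a local proxy or llama-server.
	resp, err := http.Post("http://127.0.0.1:9000/v1/chat/completions",
		"application/json", bytes.NewReader(payload))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out chatResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	if len(out.Choices) > 0 {
		fmt.Println(out.Choices[0].Message.Content)
	}
}
```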