epic: llama.cpp is installed by default #1217
Goal
Cortex.cpp should have a super easy UX that is on par with market alternatives:
- User should have a 1-click installer that prioritizes simple UX over size and complexity
- Installer packages (or downloads at install time) the llama.cpp binaries (e.g. up to 1 GB)
- Installer optimizes for "universal" packaging (i.e. it ships every option, then deletes the unnecessary files after install; see the pruning sketch after this list)
  - e.g. a Mac universal installer includes llama.cpp for both Intel and Apple Silicon
  - e.g. a Windows/Nvidia universal installer includes llama.cpp for both CUDA versions
- For this epic, I am open to either:
  - Pre-packaging (preferred)
  - Install-time download of dependencies: https://github.com/janhq/cortex.cpp/pull/1219/files
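
To make the "ship everything, then prune" idea concrete, here is a minimal C++17 sketch of the post-install cleanup step. This is not Cortex code: the `engines` directory layout and the variant names are assumptions for illustration.

```cpp
#include <filesystem>
#include <iostream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

int main() {
  // Assumption: the universal installer has already unpacked every
  // llama.cpp variant into an "engines" directory, and a prior
  // detection step decided which single variant to keep.
  const fs::path engines_dir = "engines";
  const std::string keep = "llama.cpp-mac-arm64";  // hypothetical variant name

  std::vector<fs::path> to_remove;
  for (const auto& entry : fs::directory_iterator(engines_dir)) {
    if (entry.path().filename() != keep) {
      to_remove.push_back(entry.path());
    }
  }
  for (const auto& p : to_remove) {
    std::cout << "Pruning unused variant: " << p << '\n';
    fs::remove_all(p);  // reclaims the disk space the universal package cost
  }
  return 0;
}
```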
Idea
I wonder whether the solution is an optional local lookup as part of `cortex engines install`:
- The installer can look in its own folder to see whether dependencies are available, and only pull from remote if needed (a minimal sketch follows this list)
- We do not need to make any changes to the installer (it still just runs `cortex engines install`)
- This approach is elegant, and gives us flexibility in packaging
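
A minimal C++17 sketch of that lookup-then-fallback flow, assuming a hypothetical `engines/` subfolder next to the installer and hypothetical variant/archive names (the actual `cortex engines install` internals may differ):

```cpp
#include <filesystem>
#include <iostream>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: return the path to a locally bundled engine
// archive if the installer shipped one alongside itself.
std::optional<fs::path> FindBundledEngine(const fs::path& installer_dir,
                                          const std::string& variant) {
  const fs::path candidate = installer_dir / "engines" / (variant + ".tar.gz");
  if (fs::exists(candidate)) {
    return candidate;
  }
  return std::nullopt;
}

int main() {
  const fs::path installer_dir = ".";            // assumption: cwd is the installer folder
  const std::string variant = "llama.cpp-avx2";  // hypothetical variant name

  if (auto local = FindBundledEngine(installer_dir, variant)) {
    std::cout << "Installing from bundled archive: " << *local << '\n';
    // ... extract the archive into the engines directory ...
  } else {
    std::cout << "No local copy found; falling back to remote download\n";
    // ... existing remote download path of `cortex engines install` ...
  }
  return 0;
}
```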
Out-of-scope (future)
- We should offer a "cortex-alpine" installer with a minimal file size
  - Targeted at embedded use cases, and at people who want to use ONNX or TensorRT-LLM without llama.cpp
  - Users will have to download engines as a post-install step
- We should offer "universal" installers that pre-package all potential dependencies
  - e.g. a large installer size, but all dependencies packaged for offline install
Outcomes
- Cortex.cpp installer should install llama.cpp by default
- Cortex.cpp installer should install the correct version of llama.cpp, based on the detected hardware (see the selection sketch below)
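
A minimal sketch of that hardware-based selection, using CPU feature detection only. `__builtin_cpu_supports` is a GCC/Clang builtin that exists only on x86; the variant names are hypothetical, and a real installer would also probe GPUs (e.g. the installed CUDA version) before deciding.

```cpp
#include <iostream>
#include <string>

// Choose a llama.cpp build variant from CPU features (sketch only).
std::string PickLlamaCppVariant() {
#if defined(__x86_64__) || defined(__i386__)
  if (__builtin_cpu_supports("avx512f")) return "llama.cpp-avx512";
  if (__builtin_cpu_supports("avx2"))    return "llama.cpp-avx2";
  if (__builtin_cpu_supports("avx"))     return "llama.cpp-avx";
  return "llama.cpp-noavx";
#else
  // Assumption: a non-x86 build targets ARM (e.g. Apple Silicon).
  return "llama.cpp-arm64";
#endif
}

int main() {
  std::cout << "Selected variant: " << PickLlamaCppVariant() << '\n';
  return 0;
}
```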
Key Questions
- Should we align with llama.cpp's versions (e.g. the Vulkan and SYCL builds)?
Appendix
Why?
Our current cortex.cpp v0.1 onboarding UX is not user-friendly:
- llama.cpp only seems to be downloaded on the first run of Cortex (at least on Windows)
- Download UX is poor (no progress indicator)
- Download is often slow, or drops entirely
- Very often, the llama.cpp engine download fails, resulting in a bad UX
  - "Engine not loaded yet"