This repository was archived by the owner on Jul 4, 2025. It is now read-only.

epic: llama.cpp is installed by default #1217

@dan-menlo

Goal

Cortex.cpp should have a very easy UX, on par with market alternatives

  • User should have a 1-click installer that prioritizes simple UX over installer size and complexity
    • Installer packages (or downloads at install time) llama.cpp binaries (e.g. up to 1 GB)
    • Installer optimizes for "universal" installers (i.e. downloads all options, then deletes all unnecessary files)
    • e.g. Mac Universal includes llama.cpp Mac for Intel and Apple Silicon
    • e.g. Windows and Nvidia Universal includes llama.cpp for both CUDA versions
  • For this epic, I am open to either:

Idea

I wonder whether the solution is an optional local lookup as part of cortex engines install:

  • Installer can look in its installer folder to see if dependencies are available, and only pull from remote if needed
  • We do not need to make any changes to the installer (it still just runs cortex engines install)
  • This approach is elegant, and allows us flexibility in packaging
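
The local-lookup idea above can be sketched roughly as follows. This is a minimal illustration in Python, not Cortex.cpp's actual implementation: the function install_engine, the fetch_remote callback, the directory layout, and the archive names are all hypothetical and exist only to show the "check the installer folder first, pull from remote only if needed" order.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Tuple


def install_engine(
    archive_name: str,
    installer_dir: Path,
    engines_dir: Path,
    fetch_remote: Callable[[str, Path], None],
) -> str:
    """Place an engine archive in engines_dir, preferring a bundled copy.

    Returns "local" if the archive was found next to the installer,
    otherwise "remote" after calling the (hypothetical) download callback.
    """
    engines_dir.mkdir(parents=True, exist_ok=True)
    target = engines_dir / archive_name
    bundled = installer_dir / archive_name
    if bundled.is_file():
        # Dependency shipped with the installer: no network access needed.
        shutil.copy2(bundled, target)
        return "local"
    # Nothing bundled: fall back to the remote download path.
    fetch_remote(archive_name, target)
    return "remote"


def demo() -> Tuple[str, str]:
    """Run one bundled and one non-bundled install in a temp directory."""

    def fake_fetch(name: str, dest: Path) -> None:
        # Stand-in for a real download; writes placeholder bytes.
        dest.write_bytes(b"downloaded")

    with tempfile.TemporaryDirectory() as tmp:
        installer_dir = Path(tmp) / "installer"
        engines_dir = Path(tmp) / "engines"
        installer_dir.mkdir()
        (installer_dir / "llamacpp-bundled.tar.gz").write_bytes(b"bundled")
        first = install_engine(
            "llamacpp-bundled.tar.gz", installer_dir, engines_dir, fake_fetch
        )
        second = install_engine(
            "llamacpp-other.tar.gz", installer_dir, engines_dir, fake_fetch
        )
        return first, second
```

Because the lookup lives inside the engine-install step, a small installer and a large offline installer could run the same command and differ only in which files sit in the installer folder.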

Out-of-scope (future)

  • We should offer a "cortex-alpine" installer which has minimal file size
    • Targeted for embedded use cases, and if people want to use ONNX or TensorRT-LLM without llama.cpp
    • User will have to download engines as a post-install step
  • We should offer "universal" installers that pre-package all potential dependencies
    • e.g. Large installer size, but packages all dependencies for offline install

Outcomes

  • Cortex.cpp installer should install llama.cpp by default
  • Cortex.cpp installer should install the correct version of llama.cpp (based on hardware)
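
As a rough illustration of "correct version based on hardware", the selection could look like the sketch below. The variant labels (such as mac-arm64 or windows-amd64-cuda-12) and the pick_variant function are assumptions made up for this example; they are not llama.cpp's or Cortex.cpp's real artifact names, and detecting the CUDA version is left to the caller.

```python
import platform
from typing import Optional


def pick_variant(system: str, machine: str, cuda_major: Optional[int] = None) -> str:
    """Return a hypothetical llama.cpp build label for the given hardware.

    system and machine follow the values of platform.system() and
    platform.machine(); cuda_major is None when no NVIDIA GPU is present.
    """
    system = system.lower()
    machine = machine.lower()
    arch = "arm64" if machine in ("arm64", "aarch64") else "amd64"

    if system == "darwin":
        # Mac: Apple Silicon vs Intel.
        return f"mac-{arch}"
    if system in ("windows", "linux"):
        if cuda_major is not None:
            # NVIDIA GPU detected: choose the build matching the CUDA major version.
            return f"{system}-{arch}-cuda-{cuda_major}"
        return f"{system}-{arch}-cpu"
    raise ValueError(f"unsupported platform: {system}/{machine}")


if __name__ == "__main__":
    # Label for the machine running this script (no CUDA detection here).
    print(pick_variant(platform.system(), platform.machine()))
```

A universal installer could use the same label to decide which bundled binaries to keep and which to delete after install.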

Key Questions

  • Should we align with llama.cpp's build variants? (e.g. Vulkan, SYCL)

Appendix

Why?

Our current cortex.cpp v0.1 onboarding UX is not user-friendly:

  • llama.cpp only seems to be downloaded on first run of Cortex (at least on Windows)
    • Download UX is not very good (no progress indicator)
    • Download is often slow, or drops

  • Very often, the llama.cpp engine download does not work, resulting in bad UX
    • "Engine not loaded yet"
