This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Commit 1becaff

Authored by Gabrielle Ong
Merge pull request #1628 from janhq/feat/cli-docs-cleanup
chore: CLI docs (remove chat, sidebar, nightly)
2 parents 39509bc + ae7d3b5 commit 1becaff

File tree

20 files changed: +166 −449 lines

docs/docs/capabilities/models/index.mdx

Lines changed: 6 additions & 7 deletions
@@ -30,7 +30,9 @@ For details on each format, see the [Model Formats](/docs/capabilities/models/mo
 :::
 
 ## Built-in Models
-Cortex.cpp offers a range of built-in models that include popular open-source options. These models, hosted on HuggingFace as [Cortex Model Repositories](/docs/hub/cortex-hub), are pre-compiled for different engines, enabling each model to have multiple branches in various formats.
+Cortex offers a range of [Built-in models](/models) that include popular open-source options.
+
+These models are hosted on [Cortex's HuggingFace](https://huggingface.co/cortexso) and are pre-compiled for different engines, enabling each model to have multiple branches in various formats.
 
 ### Built-in Model Variants
 Built-in models are made available across the following variants:
@@ -39,10 +41,7 @@ Built-in models are made available across the following variants:
 - **By Size**: `7b`, `13b`, and more.
 - **By quantizations**: `q4`, `q8`, and more.
 
-:::info
-You can see our full list of Built-in Models [here](/models).
-:::
-
 ## Next steps
-- Cortex requires a `model.yaml` file to run a model. Find out more [here](/docs/capabilities/models/model-yaml).
-- Cortex supports multiple model hubs hosting built-in models. See details [here](/docs/model-sources).
+- See Cortex's list of [Built-in Models](/models).
+- Cortex supports multiple model hubs hosting built-in models. See details [here](/docs/capabilities/models/sources).
+- Cortex requires a `model.yaml` file to run a model. Find out more [here](/docs/capabilities/models/model-yaml).
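The variant naming above (size, format, quantization) composes into model tags such as `llama3.2:3b-gguf-q4-km`. As a purely illustrative sketch — the `model:size-format-quant` shape is an assumption inferred from example tags in these docs, and `variant_tag` is a hypothetical helper, not part of Cortex:

```python
# Hypothetical helper showing how a built-in model variant tag could be
# composed from the dimensions listed above. The "model:size-format-quant"
# shape is an assumption based on tags like "llama3.2:3b-gguf-q4-km".
def variant_tag(model: str, size: str, fmt: str = "gguf", quant: str = "") -> str:
    parts = [p for p in (size, fmt, quant) if p]  # drop empty components
    return f"{model}:{'-'.join(parts)}"

print(variant_tag("llama3.2", "3b", quant="q4-km"))  # llama3.2:3b-gguf-q4-km
print(variant_tag("tinyllama", "1b"))                # tinyllama:1b-gguf
```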

docs/docs/capabilities/models/model-yaml.mdx

Lines changed: 3 additions & 3 deletions
@@ -179,7 +179,7 @@ Model load parameters include the options that control how Cortex.cpp runs the m
 | `prompt_template` | Template for formatting the prompt, including system messages and instructions. | Yes |
 | `engine` | The engine that runs the model; defaults to `llama-cpp` for local models in GGUF format. | Yes |
 
-All parameters from the `model.yml` file are used for running the model via the [CLI chat command](/docs/cli/chat) or [CLI run command](/docs/cli/run). These parameters also act as defaults when using the [model start API](/api-reference#tag/models/post/v1/models/start) through cortex.cpp.
+All parameters from the `model.yml` file are used for running the model via the [CLI run command](/docs/cli/run). These parameters also act as defaults when using the [model start API](/api-reference#tag/models/post/v1/models/start) through cortex.cpp.
 
 ## Runtime parameters

@@ -217,8 +217,8 @@ The API is accessible at the `/v1/chat/completions` URL and accepts all paramete
 
 With the `llama-cpp` engine, cortex.cpp accepts all parameters from the [`model.yml` inference section](#Inference Parameters) and all parameters from the chat completion API.
 
-:::info
+<!-- :::info
 You can download all the supported model formats from the following:
 - [Cortex Model Repos](/docs/hub/cortex-hub)
 - [HuggingFace Model Repos](/docs/hub/hugging-face)
-:::
+::: -->
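The `model.yml` fields discussed above can be sketched as follows. This is an illustrative fragment, not a verbatim Cortex example: only `prompt_template` and `engine` are named in the parameter table in this diff, and the other keys shown are assumptions for illustration.

```yaml
# Illustrative model.yml sketch (assumed field names except prompt_template
# and engine, which appear in the load-parameters table above).
model: tinyllama:1b-gguf   # assumed identifier field
engine: llama-cpp          # default engine for local GGUF models
prompt_template: |
  <|system|>{system_message}<|user|>{prompt}<|assistant|>
```

These values would then act as defaults for `cortex run` and the model start API, per the paragraph above.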
File renamed without changes.
Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+---
+title: Hugging Face
+description: Cortex supports all `GGUF` and `ONNX` models available in Huggingface repositories, providing access to a wide range of models.
+---
+
+import Tabs from "@theme/Tabs";
+import TabItem from "@theme/TabItem";
+
+Cortex.cpp supports all `GGUF` models from the [Hugging Face Hub](https://huggingface.co).
+
+You can pull HuggingFace models via:
+- repository handle: e.g. `author/model_id`
+- direct url: e.g. `https://huggingface.co/QuantFactory/OpenMath2-Llama3.1-8B-GGUF/blob/main/OpenMath2-Llama3.1-8B.Q4_0.gguf`
+
+
+## GGUF
+To view all available `GGUF` models on HuggingFace, select the `GGUF` tag in the Libraries section.
+
+![HF GGUF](/img/docs/gguf.png)
+<Tabs>
+<TabItem value="MacOs/Linux" label="MacOs/Linux">
+```sh
+# Pull the Codestral-22B-v0.1-GGUF model from the bartowski organization
+cortex pull bartowski/Codestral-22B-v0.1-GGUF
+
+# Pull a specific GGUF file via its direct URL
+cortex pull https://huggingface.co/QuantFactory/OpenMath2-Llama3.1-8B-GGUF/blob/main/OpenMath2-Llama3.1-8B.Q4_0.gguf
+```
+</TabItem>
+<TabItem value="Windows" label="Windows">
+```sh
+# Pull the Codestral-22B-v0.1-GGUF model from the bartowski organization
+cortex.exe pull bartowski/Codestral-22B-v0.1-GGUF
+
+# Pull the gemma-7b model from the google organization
+cortex.exe pull google/gemma-7b
+```
+</TabItem>
+</Tabs>
+
+<!-- ## ONNX
+![HF ONNX](/img/docs/onnx.png)
+To view all available `ONNX` models on HuggingFace, select the `ONNX` tag in the Libraries section.
+<Tabs>
+<TabItem value="MacOs/Linux" label="MacOs/Linux">
+```sh
+# Pull the XLM-Roberta-Large-Vit-B-16Plus model from the immich-app organization
+cortex pull immich-app/XLM-Roberta-Large-Vit-B-16Plus
+
+# Pull the mt0-base model from the bigscience organization
+cortex pull bigscience/mt0-base
+```
+</TabItem>
+<TabItem value="Windows" label="Windows">
+```sh
+# Pull the XLM-Roberta-Large-Vit-B-16Plus model from the immich-app organization
+cortex.exe pull immich-app/XLM-Roberta-Large-Vit-B-16Plus
+
+# Pull the mt0-base model from the bigscience organization
+cortex.exe pull bigscience/mt0-base
+```
+</TabItem>
+</Tabs>
+
+## TensorRT-LLM
+We are still working to support all available `TensorRT-LLM` models on HuggingFace. For now, Cortex.cpp only supports built-in `TensorRT-LLM` models, which can be downloaded from the [Cortex Model Repos](/docs/capabilities/models/sources/cortex-hub). -->

docs/docs/hub/index.mdx renamed to docs/docs/capabilities/models/sources/index.mdx

Lines changed: 2 additions & 10 deletions
@@ -1,14 +1,8 @@
 ---
-slug: /model-sources
 title: Model Sources
+description: Model
 ---
 
-import DocCardList from "@theme/DocCardList";
-
-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
-
 # Pulling Models in Cortex
 
 Cortex provides a streamlined way to pull (download) machine learning models from Hugging Face and other third-party sources, as well as import models from local storage. This functionality allows users to easily access a variety of pre-trained models to enhance their applications.
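The pull sources described in this file — repository handles (`author/model_id`), direct URLs, and local imports — could be distinguished roughly as sketched below. This is a hypothetical classifier for illustration only; Cortex's actual source-resolution logic is not shown in this diff.

```python
# Hypothetical sketch of classifying a pull source string; illustrative only,
# not Cortex's real resolution logic.
from urllib.parse import urlparse

def classify_source(src: str) -> str:
    if urlparse(src).scheme in ("http", "https"):
        return "direct-url"   # e.g. a full Hugging Face file URL
    if src.count("/") == 1 and not src.startswith((".", "/")):
        return "repo-handle"  # e.g. author/model_id
    return "local-path"       # e.g. ./models/my-model.gguf

print(classify_source("bartowski/Codestral-22B-v0.1-GGUF"))  # repo-handle
```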
@@ -348,6 +342,4 @@ Response:
 }
 ```
 
-With Cortex, pulling and managing models is simplified, allowing you to focus more on building your applications!
-
-<DocCardList />
+With Cortex, pulling and managing models is simplified, allowing you to focus more on building your applications!
File renamed without changes.

docs/docs/chat-completions.mdx

Lines changed: 1 addition & 1 deletion
@@ -146,5 +146,5 @@ Cortex also acts as an aggregator for remote inference requests from a single en
 :::note
 Learn more about Chat Completions capabilities:
 - [Chat Completions API Reference](/api-reference#tag/inference/post/chat/completions)
-- [Chat Completions CLI command](/docs/cli/chat)
+- [`cortex run` CLI command](/docs/cli/run)
 :::
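A chat completion request against the `/v1/chat/completions` endpoint referenced above can be sketched as below. The payload follows the OpenAI-compatible shape such endpoints conventionally accept; the host, port, and model name are assumptions for illustration, not values stated in this diff.

```python
# Minimal sketch of a chat completion payload; host/port/model are assumed.
import json

url = "http://127.0.0.1:39281/v1/chat/completions"  # assumed local endpoint
payload = {
    "model": "tinyllama:1b-gguf",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,
}
# Against a running server you would POST this, e.g.:
#   requests.post(url, json=payload)
print(json.dumps(payload))
```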

docs/docs/cli/chat.mdx

Lines changed: 0 additions & 71 deletions
This file was deleted.

docs/docs/cli/cortex.mdx

Lines changed: 8 additions & 37 deletions
@@ -7,12 +7,8 @@ slug: /cli
 import Tabs from "@theme/Tabs";
 import TabItem from "@theme/TabItem";
 
-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
-
-# Cortex
-This command list all the available commands within the Cortex.cpp commands.
+# `cortex`
+This command lists all the available commands in the Cortex CLI.
 
 ## Usage
 :::info
@@ -21,48 +17,23 @@ You can use the `--verbose` flag to display more detailed output of the internal
 <Tabs>
 <TabItem value="MacOs/Linux" label="MacOs/Linux">
 ```sh
-# Stable
 cortex
-
-# Beta
-cortex-beta
-
-# Nightly
-cortex-nightly
 ```
 </TabItem>
 <TabItem value="Windows" label="Windows">
 ```sh
-# Stable
 cortex.exe
-
-# Beta
-cortex-beta.exe
-
-# Nightly
-cortex-nightly.exe
 ```
 </TabItem>
 </Tabs>
 
-
-## Command Chaining
-Cortex CLI's command chaining support allows multiple commands to be executed in sequence with a simplified syntax.
-
-For example:
-
-- [cortex run](/docs/cli/run)
-- [cortex chat](/docs/cli/chat)
-
 ## Sub Commands
 
+- [cortex start](/docs/cli/start): Start the Cortex API server (starts automatically with other commands).
+- [cortex run](/docs/cli/run): Shortcut for `cortex models start`. Pull a remote model or start a local model, and start chatting.
+- [cortex pull](/docs/cli/pull): Download a model.
 - [cortex models](/docs/cli/models): Manage and configure models.
-- [cortex chat](/docs/cli/chat): Send a chat request to a model.
 - [cortex ps](/docs/cli/ps): Display active models and their operational status.
-- [cortex embeddings](/docs/cli/embeddings): Create an embedding vector representing the input text.
-- [cortex engines](/docs/cli/engines): Manage Cortex.cpp engines.
-- [cortex pull|download](/docs/cli/pull): Download a model.
-- [cortex run](/docs/cli/run): Shortcut to pull, start and chat with a model.
-- [cortex update](/docs/cli/update): Update the Cortex.cpp version.
-- [cortex start](/docs/cli/start): Start the Cortex.cpp API server.
-- [cortex stop](/docs/cli/stop): Stop the Cortex.cpp API server.
+- [cortex engines](/docs/cli/engines): Manage Cortex engines.
+- [cortex update](/docs/cli/update): Update the Cortex version.
+- [cortex stop](/docs/cli/stop): Stop the Cortex API server.

docs/docs/cli/ps.mdx

Lines changed: 14 additions & 31 deletions
@@ -7,59 +7,42 @@ slug: "ps"
 import Tabs from "@theme/Tabs";
 import TabItem from "@theme/TabItem";
 
-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
-:::
-
 # `cortex ps`
 
-This command shows the running model and its status.
-
-
+This command shows the running model and its status (Engine, RAM, VRAM, and Uptime).
 
 ## Usage
-:::info
-You can use the `--verbose` flag to display more detailed output of the internal processes. To apply this flag, use the following format: `cortex --verbose [subcommand]`.
-:::
 <Tabs>
 <TabItem value="MacOs/Linux" label="MacOs/Linux">
 ```sh
-# Stable
 cortex ps [options]
-
-# Beta
-cortex-beta ps [options]
-
-# Nightly
-cortex-nightly ps [options]
 ```
 </TabItem>
 <TabItem value="Windows" label="Windows">
 ```sh
-# Stable
 cortex.exe ps [options]
-
-# Beta
-cortex-beta.exe ps [options]
-
-# Nightly
-cortex-nightly.exe ps [options]
 ```
 </TabItem>
 </Tabs>
 
-
 For example, it returns the following table:
 
 ```bash
-+----------------+-----------+----------+-----------+-----------+
-| Model          | Engine    | RAM      | VRAM      | Up time   |
-+----------------+-----------+----------+-----------+-----------+
-| tinyllama:gguf | llama-cpp | 35.16 MB | 601.02 MB | 5 seconds |
-+----------------+-----------+----------+-----------+-----------+
+> cortex ps
++------------------------+-----------+-----------+-----------+-------------------------------+
+| Model                  | Engine    | RAM       | VRAM      | Uptime                        |
++------------------------+-----------+-----------+-----------+-------------------------------+
+| llama3.2:3b-gguf-q4-km | llama-cpp | 308.23 MB | 1.87 GB   | 7 seconds                     |
++------------------------+-----------+-----------+-----------+-------------------------------+
+| tinyllama:1b-gguf      | llama-cpp | 35.16 MB  | 636.18 MB | 1 hour, 5 minutes, 45 seconds |
++------------------------+-----------+-----------+-----------+-------------------------------+
 ```
 ## Options
 
 | Option | Description | Required | Default value | Example |
 |----------------|-------------------------------------------|----------|---------------|---------|
-| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
+| `-h`, `--help` | Display help information for the command. | No | - | `-h` |
+
+:::info
+You can use the `--verbose` flag to display more detailed output of the internal processes. To apply this flag, use the following format: `cortex --verbose [subcommand]`.
+:::
