This repository was archived by the owner on Jul 4, 2025. It is now read-only.
bug: Continue downloading of partial file doesn't work #1288
Closed
Labels: category: model management (Model pull, yaml, model state); type: bug (Something isn't working)
Description
Cortex version
v75
Describe the Bug
- cortex pull llama3.1 onnx version
- download times out half way (2.2 GB out of 5+ GB downloaded)
- ctrl-c to force quit download (stuck in that state)
- pull llama3.1 again
(base) PS C:\Users\n\cortexcpp-nightly\models> cortex-nightly.exe pull llama3.1
Select an option
1. 8b-onnx
2. 8b-gguf
3. onnx
4. gguf
Select an option (1-4): 1
Validating download items, please wait..
Start downloading: genai_config.json
genai_config.json is already downloaded!
Re-download? [Y/n]: Y
Re-downloading..
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1740 100 1740 0 0 5134 0 --:--:-- --:--:-- --:--:-- 5147
Start downloading: model.onnx
model.onnx is already downloaded!
Re-download? [Y/n]: Y
Re-downloading..
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 68 100 68 0 0 212 0 --:--:-- --:--:-- --:--:-- 211
- Program correctly notices .data was not fully downloaded.
- I resume download
Start downloading: model.onnx.data
Found unfinished download! Additional 69.00 B need to be downloaded.
Continue download [Y/n]: Y
Resuming download..
** Resuming transfer from byte position -1
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 68 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl_easy_perform() failed: Requested range was not delivered by the server
Start downloading: model.yml
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 893 100 893 0 0 2632 0 --:--:-- --:--:-- --:--:-- 2642
Start downloading: special_tokens_map.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 68 100 68 0 0 215 0 --:--:-- --:--:-- --:--:-- 215
Start downloading: tokenizer.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 68 100 68 0 0 216 0 --:--:-- --:--:-- --:--:-- 217
Start downloading: tokenizer_config.json
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 68 100 68 0 0 213 0 --:--:-- --:--:-- --:--:-- 213
Model llama3.1 downloaded successfully!
- download finishes really quickly, like 2 seconds.
- run the model. doesn't work.
(base) PS C:\Users\n\cortexcpp-nightly\models> cortex-nightly.exe run llama3.1
Starting server ...
20240921 08:25:51.227000 UTC 19540 INFO Host: 127.0.0.1 Port: 3928 - main.cc:32
Server started
gguf_init_from_file: failed to open '': 'Invalid argument'
{"timestamp":1726907151,"level":"ERROR","function":"LoadModel","line":186,"message":"llama.cpp unable to load model","model":""}
Model is not loaded yet!
Ignoring the fact that Cortex tried to run this with llamacpp.
ls \llama3.1 shows model.yml is 2.25 GB.
However, the actual .data binary is 5.31 GB.
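For illustration, the resume log reports "Resuming transfer from byte position -1" and only 69.00 B remaining, although roughly 2.2 GB of a 5+ GB file was on disk. One possible guard is to validate both sizes before attempting a ranged request and fall back to a full re-download when either looks wrong. The sketch below is in Python rather than Cortex's C++, and `DownloadPlan` and `plan_download` are hypothetical names, not part of the Cortex codebase. It only shows the decision logic, not the transfer itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DownloadPlan:
    action: str                 # "skip", "resume", or "restart"
    resume_from: Optional[int]  # byte offset for a ranged request, if resuming


def plan_download(local_size: int, remote_size: int) -> DownloadPlan:
    """Decide how to continue a download from the two file sizes.

    local_size:  bytes already on disk (0 if the file does not exist).
    remote_size: size reported by the server, or a value <= 0 if unknown.
    """
    # Unknown remote size: no trustworthy offset exists, so start over.
    if remote_size <= 0:
        return DownloadPlan("restart", None)
    # Sizes match: nothing left to fetch.
    if local_size == remote_size:
        return DownloadPlan("skip", None)
    # A valid partial file: resume exactly where the local file ends.
    if 0 < local_size < remote_size:
        return DownloadPlan("resume", local_size)
    # Empty, negative, or larger-than-remote local size: start over.
    return DownloadPlan("restart", None)


# Approximate byte counts standing in for the sizes quoted in this report.
print(plan_download(2_250_000_000, 5_310_000_000))
```

With this shape, a negative or unknown size can never reach the transfer layer as a resume offset, and a pull would only be reported as successful once the local and remote sizes agree.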
What is your OS?
- MacOS
- Windows
- Linux
But this should be an OS-agnostic issue. It is expected to occur on macOS as well.
What engine are you running?
- cortex.llamacpp (default)
- cortex.tensorrt-llm (Nvidia GPUs)
- cortex.onnx (NPUs, DirectML)