bug: Continue downloading of partial file doesn't work

### Cortex version

v75

### Describe the Bug

1. Run `cortex pull llama3.1` and select the ONNX version.
2. The download times out halfway (2.2 GB out of 5+ GB downloaded).
3. Press Ctrl-C to force-quit the download, which is stuck in that state.
4. Pull llama3.1 again:
```sh
(base) PS C:\Users\n\cortexcpp-nightly\models> cortex-nightly.exe pull llama3.1
Select an option
1. 8b-onnx
5. 8b-gguf
6. onnx
7. gguf

Select an option (1-4): 1

Validating download items, please wait..
Start downloading: genai_config.json
genai_config.json is already downloaded!
Re-download? [Y/n]: Y
Re-downloading..
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1740  100  1740    0     0   5134      0 --:--:-- --:--:-- --:--:--  5147
Start downloading: model.onnx
model.onnx is already downloaded!
Re-download? [Y/n]: Y
Re-downloading..
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    68  100    68    0     0    212      0 --:--:-- --:--:-- --:--:--   211
```

5. The program correctly notices that `model.onnx.data` was not fully downloaded.
6. I resume the download:
```sh
Start downloading: model.onnx.data
Found unfinished download! Additional 69.00 B need to be downloaded.
Continue download [Y/n]: Y
Resuming download..
** Resuming transfer from byte position -1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0    68    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl_easy_perform() failed: Requested range was not delivered by the server
Start downloading: model.yml
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   893  100   893    0     0   2632      0 --:--:-- --:--:-- --:--:--  2642
Start downloading: special_tokens_map.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    68  100    68    0     0    215      0 --:--:-- --:--:-- --:--:--   215
Start downloading: tokenizer.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    68  100    68    0     0    216      0 --:--:-- --:--:-- --:--:--   217
Start downloading: tokenizer_config.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    68  100    68    0     0    213      0 --:--:-- --:--:-- --:--:--   213
Model llama3.1 downloaded successfully!
```
7. The download finishes very quickly, in about 2 seconds.
8. Run the model. It doesn't work:

```sh
(base) PS C:\Users\n\cortexcpp-nightly\models> cortex-nightly.exe run llama3.1
Starting server ...
20240921 08:25:51.227000 UTC 19540 INFO  Host: 127.0.0.1 Port: 3928
 - main.cc:32
Server started
gguf_init_from_file: failed to open '': 'Invalid argument'
{"timestamp":1726907151,"level":"ERROR","function":"LoadModel","line":186,"message":"llama.cpp unable to load model","model":""}
Model is not loaded yet!
```
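
Steps 6 to 8 suggest that the final "downloaded successfully" message is printed even though the transfer of `model.onnx.data` failed. Below is a minimal sketch of the kind of check one might expect instead. `FileResult` and `summarize` are hypothetical names used only for illustration; they are not taken from the Cortex code base.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FileResult:
    """Outcome of one file transfer (hypothetical record, not a Cortex type)."""
    name: str
    error: Optional[str] = None  # None means the transfer succeeded


def summarize(model: str, results: List[FileResult]) -> str:
    """Report success only if every file of the model finished without error."""
    failed = [r for r in results if r.error is not None]
    if not failed:
        return f"Model {model} downloaded successfully!"
    details = "; ".join(f"{r.name}: {r.error}" for r in failed)
    return f"Model {model} is incomplete ({details})"


# Results modelled on the log above: one transfer failed, the others finished.
results = [
    FileResult("model.onnx"),
    FileResult("model.onnx.data", "Requested range was not delivered by the server"),
    FileResult("model.yml"),
]
print(summarize("llama3.1", results))
```

With a summary like this, a failed resume would surface at pull time rather than later, when the model is run.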


Setting aside the fact that Cortex tried to run this with llamacpp, the download itself is incomplete:

`ls \llama3.1` shows `model.yml` is 2.25 GB.
However, the actual [.data binary](https://huggingface.co/cortexso/llama3.1/tree/onnx) is 5.31 GB.
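
The resume log reports a start position of -1 and only 69.00 B remaining, although roughly 3 GB are still missing. One possible reading, which is an assumption and not confirmed here, is that the local size lookup returned -1 and the remote size was taken as 68 bytes, since 68 - (-1) = 69. The following sketch shows defensive resume planning under that assumption. `local_size`, `plan_resume`, and `range_header` are hypothetical helper names, not Cortex functions.

```python
import os
from typing import Optional, Tuple


def local_size(path: str) -> Optional[int]:
    """Return the size of a partial file in bytes, or None if it cannot be read."""
    try:
        return os.path.getsize(path)
    except OSError:
        return None


def plan_resume(local: Optional[int], remote: Optional[int]) -> Tuple[str, int]:
    """Decide how to continue a download.

    Returns (action, offset), where action is "restart", "resume", or "complete".
    Any unknown or implausible size falls back to a full restart instead of
    producing a negative or meaningless resume offset.
    """
    if remote is None or remote <= 0:
        return ("restart", 0)   # remote size unknown: cannot resume safely
    if local is None or local <= 0:
        return ("restart", 0)   # nothing usable on disk
    if local > remote:
        return ("restart", 0)   # local file is larger than the remote one
    if local == remote:
        return ("complete", remote)
    return ("resume", local)


def range_header(offset: int) -> str:
    """Build the HTTP Range header value for resuming at `offset`."""
    if offset < 0:
        raise ValueError("resume offset must not be negative")
    return f"bytes={offset}-"


# Approximate sizes from this report: 2.25 GB on disk, 5.31 GB remote.
action, offset = plan_resume(2_250_000_000, 5_310_000_000)
print(action, range_header(offset))

# The values suggested by the log (-1 local, 68 remote) lead to a restart.
print(plan_resume(-1, 68))
```

Matching sizes do not prove that the content is intact, so a checksum comparison would still be a reasonable follow-up where the server publishes one.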


### What is your OS?

- [ ] MacOS
- [X] Windows
- [ ] Linux

But this should be an OS-agnostic issue; it is expected to occur on macOS as well.

### What engine are you running?

- [ ] cortex.llamacpp (default)
- [ ] cortex.tensorrt-llm (Nvidia GPUs)
- [X] cortex.onnx (NPUs, DirectML)


bug: Continue downloading of partial file doesn't work #1288
