Inconsistent size output of chunks with gguf-split #6634

@RodriMora

Description

System:
Ubuntu Server 22.04 LTS
AMD Ryzen 9 5950X
64 GB RAM @ 3200 MHz
2× NVIDIA RTX 3090

Steps to reproduce:

First I converted the model to an FP16 GGUF with:

```shell
./convert.py --outfile Karasu-Mixtral-8x22B-v0.1-fp16.gguf --outtype f16 lightblue_Karasu-Mixtral-8x22B-v0.1
```

That worked just fine and I got:
(screenshot)

Then to quantize it to Q5_K_M:
```shell
./quantize Karasu-Mixtral-8x22B-v0.1-fp16.gguf Karasu-Mixtral-8x22B-v0.1-Q5_K_M.gguf Q5_K_M
```

That worked fine too:
(screenshot)

But when running gguf-split --split, the chunk sizes are inconsistent even though I'm passing --split-max-tensors 128:

```shell
./gguf-split --split --split-max-tensors 128 /nfs/models/Karasu-Mixtral-8x22B-v0.1-Q5_K_M.gguf /nfs/models/
```

(screenshots: directory listings showing the uneven chunk sizes)
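For what it's worth, this may be expected behavior rather than a bug: `--split-max-tensors` caps the *number of tensors* per shard, not the number of bytes, and the tensors in a Mixtral-style model vary enormously in size (tiny norm vectors alongside huge expert weight matrices), so shards holding the same tensor count can end up with very different file sizes. A minimal sketch of this splitting policy (hypothetical tensor sizes, not the actual gguf-split code):

```python
# Sketch (NOT the gguf-split implementation): group tensors into shards of
# at most `max_tensors` tensors each, then total the bytes per shard.
# The byte sizes below are made up; the point is that equal tensor *counts*
# do not imply equal shard *byte* sizes.

def split_by_tensor_count(tensor_sizes, max_tensors=128):
    """Group tensor byte sizes into shards of at most `max_tensors` tensors."""
    shards = [tensor_sizes[i:i + max_tensors]
              for i in range(0, len(tensor_sizes), max_tensors)]
    return [sum(shard) for shard in shards]

# Hypothetical mix: small norm vectors and large expert weight matrices.
sizes = [4096] * 64 + [50_000_000] * 64 + [4096] * 128
shard_bytes = split_by_tensor_count(sizes, max_tensors=128)
print(shard_bytes)  # → [3200262144, 524288]: both shards hold 128 tensors
```

If roughly equal file sizes are the goal, splitting by bytes rather than tensor count (gguf-split's `--split-max-size` option, if your build has it) should behave more like what you expected.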

Metadata

Assignees: no one assigned
Labels: bug (Something isn't working), split (GGUF split model sharding)
