This repository was archived by the owner on Jun 24, 2024. It is now read-only.
GGUF is the new file format specification we've been designing to solve the problem of not being able to identify a model from its file alone. The specification is here: ggml-org/ggml#302
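Per the spec draft, a GGUF file starts with a fixed header: the magic bytes `GGUF`, a version, a tensor count, and a metadata key/value count, all little-endian. A minimal sketch in Rust of sniffing that header to identify a file (the exact field layout here follows my reading of the draft and should be checked against the spec):

```rust
/// Fixed-size GGUF header fields (layout per the draft spec; assumed here).
#[derive(Debug, PartialEq)]
struct GgufHeader {
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
}

/// Parse the GGUF header from the start of a file's bytes.
/// Returns None if the magic does not match, i.e. this is not a GGUF file.
fn parse_gguf_header(bytes: &[u8]) -> Option<GgufHeader> {
    // Magic (4) + version (4) + tensor_count (8) + metadata_kv_count (8).
    if bytes.len() < 24 || &bytes[0..4] != b"GGUF" {
        return None;
    }
    Some(GgufHeader {
        version: u32::from_le_bytes(bytes[4..8].try_into().ok()?),
        tensor_count: u64::from_le_bytes(bytes[8..16].try_into().ok()?),
        metadata_kv_count: u64::from_le_bytes(bytes[16..24].try_into().ok()?),
    })
}

fn main() {
    // A fabricated header: magic, version 1, 2 tensors, 3 metadata pairs.
    let mut buf = Vec::new();
    buf.extend_from_slice(b"GGUF");
    buf.extend_from_slice(&1u32.to_le_bytes());
    buf.extend_from_slice(&2u64.to_le_bytes());
    buf.extend_from_slice(&3u64.to_le_bytes());
    println!("{:?}", parse_gguf_header(&buf));
    // Legacy GGML files fail the magic check, so they can keep the old path.
    assert!(parse_gguf_header(b"lmgg").is_none());
}
```

Because the magic check is cheap and happens before any model data is read, it is a natural place for llm to decide between the legacy loader and the GGUF loader.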
llm should be able to do the following:
continue supporting existing models (i.e. this change should be non-destructive)
load GGUF models and automatically dispatch to the correct model architecture.
load_dynamic already has an interface that should support this, but loading currently only begins after the model architecture is known
use the new information stored within the metadata to improve the UX, including automatically using the HF tokenizer if available
save GGUF models, especially during quantization
llm could do the following:
convert old models to GGUF models with prompting for missing data
implement the migration tool mentioned in the spec, which does autonomous conversion for users based on hashes
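The hash-based migration idea could be sketched as a registry mapping known legacy-file hashes to the metadata needed for conversion. Everything here (the hash value, field names, and registry shape) is fabricated for illustration; the real tool would follow whatever the spec defines:

```rust
use std::collections::HashMap;

/// Metadata the migration tool would need to fill in when converting a
/// legacy GGML file to GGUF (fields are illustrative, not from the spec).
#[derive(Debug, PartialEq)]
struct MigrationInfo {
    architecture: &'static str,
    tokenizer_source: &'static str,
}

/// Hypothetical registry keyed by the hash of a known legacy model file.
fn migration_registry() -> HashMap<&'static str, MigrationInfo> {
    let mut reg = HashMap::new();
    // Fabricated hash and entry, purely for illustration.
    reg.insert(
        "0123abcd",
        MigrationInfo { architecture: "llama", tokenizer_source: "embedded" },
    );
    reg
}

/// Autonomous conversion: if the hash is known, migrate without prompting;
/// otherwise the caller falls back to prompting the user for missing data.
fn lookup_migration(hash: &str) -> Option<MigrationInfo> {
    migration_registry().remove(hash)
}

fn main() {
    println!("{:?}", lookup_migration("0123abcd"));
    println!("{:?}", lookup_migration("ffffffff"));
}
```

The two bullets then share one code path: a registry hit means autonomous conversion, and a miss means falling back to prompting for the missing data.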