This repository was archived by the owner on Jun 24, 2024. It is now read-only.
GGUF is the new file format specification we've been designing to solve the problem of not being able to identify a model from its file alone. The specification is here: ggml-org/ggml#302
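Per the spec draft, a GGUF file starts with a fixed header: the magic bytes `GGUF`, a version, a tensor count, and a metadata key/value count, all little-endian. A minimal sketch in Rust of sniffing that header to identify a file (the exact field layout here follows my reading of the draft and should be checked against the spec):

```rust
/// Fixed-size GGUF header fields (layout per the draft spec; assumed here).
#[derive(Debug, PartialEq)]
struct GgufHeader {
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
}

/// Parse the GGUF header from the start of a file's bytes.
/// Returns None if the magic does not match, i.e. this is not a GGUF file.
fn parse_gguf_header(bytes: &[u8]) -> Option<GgufHeader> {
    // Magic (4) + version (4) + tensor_count (8) + metadata_kv_count (8).
    if bytes.len() < 24 || &bytes[0..4] != b"GGUF" {
        return None;
    }
    Some(GgufHeader {
        version: u32::from_le_bytes(bytes[4..8].try_into().ok()?),
        tensor_count: u64::from_le_bytes(bytes[8..16].try_into().ok()?),
        metadata_kv_count: u64::from_le_bytes(bytes[16..24].try_into().ok()?),
    })
}

fn main() {
    // A fabricated header: magic, version 1, 2 tensors, 3 metadata pairs.
    let mut buf = Vec::new();
    buf.extend_from_slice(b"GGUF");
    buf.extend_from_slice(&1u32.to_le_bytes());
    buf.extend_from_slice(&2u64.to_le_bytes());
    buf.extend_from_slice(&3u64.to_le_bytes());
    println!("{:?}", parse_gguf_header(&buf));
    // Legacy GGML files fail the magic check, so they can keep the old path.
    assert!(parse_gguf_header(b"lmgg").is_none());
}
```

Because the magic check is cheap and happens before any model data is read, it is a natural place for llm to decide between the legacy loader and the GGUF loader.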
llm should be able to do the following:
continue supporting existing models (i.e. this change should be non-destructive)
load GGUF models and automatically dispatch to the correct model architecture.
load_dynamic already has an interface that should support this, but loading currently only begins after the model architecture is known
use the new information stored within the metadata to improve the UX, including automatically using the HF tokenizer if available
save GGUF models, especially during quantization
llm could do the following:
convert old models to GGUF models with prompting for missing data
implement the migration tool mentioned in the spec, which does autonomous conversion for users based on hashes
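The hash-based migration idea could be sketched as a registry mapping known legacy-file hashes to the metadata needed for conversion. Everything here (the hash value, field names, and registry shape) is fabricated for illustration; the real tool would follow whatever the spec defines:

```rust
use std::collections::HashMap;

/// Metadata the migration tool would need to fill in when converting a
/// legacy GGML file to GGUF (fields are illustrative, not from the spec).
#[derive(Debug, PartialEq)]
struct MigrationInfo {
    architecture: &'static str,
    tokenizer_source: &'static str,
}

/// Hypothetical registry keyed by the hash of a known legacy model file.
fn migration_registry() -> HashMap<&'static str, MigrationInfo> {
    let mut reg = HashMap::new();
    // Fabricated hash and entry, purely for illustration.
    reg.insert(
        "0123abcd",
        MigrationInfo { architecture: "llama", tokenizer_source: "embedded" },
    );
    reg
}

/// Autonomous conversion: if the hash is known, migrate without prompting;
/// otherwise the caller falls back to prompting the user for missing data.
fn lookup_migration(hash: &str) -> Option<MigrationInfo> {
    migration_registry().remove(hash)
}

fn main() {
    println!("{:?}", lookup_migration("0123abcd"));
    println!("{:?}", lookup_migration("ffffffff"));
}
```

The two bullets then share one code path: a registry hit means autonomous conversion, and a miss means falling back to prompting for the missing data.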