This repository was archived by the owner on Jun 24, 2024. It is now read-only.

Support GGUF #365

@philpax

GGUF is the new file format specification we've been designing to solve the problem of not being able to identify a model. The specification is here: ggml-org/ggml#302

llm should be able to do the following:

  • continue supporting existing models (i.e. this change should be non-destructive)
  • load GGUF models and automatically dispatch to the correct model.
    • load_dynamic already has an interface that should support this, but loading currently only begins after the model arch is known
    • use the new information stored within the metadata to improve the UX, including automatically using the HF tokenizer if available
  • save GGUF models, especially when quantizing
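
As a rough illustration of the "automatically dispatch" point above, here is a minimal sketch of reading just enough of a GGUF header to learn the architecture before any model-specific loading starts. It assumes the little-endian layout with 64-bit counts and string lengths (GGUF v2 and later) and the `general.architecture` metadata key; the `Architecture` enum, `read_architecture`, and `sample_gguf` are hypothetical names for this example and not part of `llm`'s API.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical architecture set for illustration; not `llm`'s real type.
#[derive(Debug, PartialEq)]
enum Architecture {
    Llama,
    GptNeoX,
    Unknown(String),
}

fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

fn read_u32<R: Read>(r: &mut R) -> io::Result<u32> {
    let mut b = [0u8; 4];
    r.read_exact(&mut b)?;
    Ok(u32::from_le_bytes(b))
}

fn read_u64<R: Read>(r: &mut R) -> io::Result<u64> {
    let mut b = [0u8; 8];
    r.read_exact(&mut b)?;
    Ok(u64::from_le_bytes(b))
}

/// Reads a length-prefixed UTF-8 string (u64 length, then bytes).
fn read_string<R: Read>(r: &mut R) -> io::Result<String> {
    let len = read_u64(r)?;
    if len > 1 << 20 {
        return Err(invalid("string too long")); // arbitrary sanity cap
    }
    let mut buf = vec![0u8; len as usize];
    r.read_exact(&mut buf)?;
    String::from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

/// Skips one metadata value of the given GGUF value type.
fn skip_value<R: Read>(r: &mut R, ty: u32) -> io::Result<()> {
    let size = match ty {
        0 | 1 | 7 => 1,       // u8, i8, bool
        2 | 3 => 2,           // u16, i16
        4 | 5 | 6 => 4,       // u32, i32, f32
        10 | 11 | 12 => 8,    // u64, i64, f64
        8 => return read_string(r).map(|_| ()),
        9 => {
            // array: element type, element count, then the elements
            let elem_ty = read_u32(r)?;
            let count = read_u64(r)?;
            for _ in 0..count {
                skip_value(r, elem_ty)?;
            }
            return Ok(());
        }
        _ => return Err(invalid("unknown value type")),
    };
    let mut buf = [0u8; 8];
    r.read_exact(&mut buf[..size])
}

/// Scans the metadata for `general.architecture` and maps it to an enum.
fn read_architecture<R: Read>(r: &mut R) -> io::Result<Architecture> {
    let mut magic = [0u8; 4];
    r.read_exact(&mut magic)?;
    if &magic != b"GGUF" {
        return Err(invalid("not a GGUF file"));
    }
    let _version = read_u32(r)?;
    let _tensor_count = read_u64(r)?;
    let kv_count = read_u64(r)?;
    for _ in 0..kv_count {
        let key = read_string(r)?;
        let ty = read_u32(r)?;
        if key == "general.architecture" && ty == 8 {
            return Ok(match read_string(r)?.as_str() {
                "llama" => Architecture::Llama,
                "gptneox" => Architecture::GptNeoX,
                other => Architecture::Unknown(other.to_string()),
            });
        }
        skip_value(r, ty)?;
    }
    Err(invalid("general.architecture not found"))
}

/// Builds a tiny in-memory header (no tensors) for demonstration only.
fn sample_gguf(arch: &str) -> Vec<u8> {
    let put_str = |out: &mut Vec<u8>, s: &str| {
        out.extend_from_slice(&(s.len() as u64).to_le_bytes());
        out.extend_from_slice(s.as_bytes());
    };
    let mut out = b"GGUF".to_vec();
    out.extend_from_slice(&3u32.to_le_bytes()); // version
    out.extend_from_slice(&0u64.to_le_bytes()); // tensor count
    out.extend_from_slice(&2u64.to_le_bytes()); // metadata key-value count
    put_str(&mut out, "general.alignment");
    out.extend_from_slice(&4u32.to_le_bytes()); // value type: u32
    out.extend_from_slice(&32u32.to_le_bytes());
    put_str(&mut out, "general.architecture");
    out.extend_from_slice(&8u32.to_le_bytes()); // value type: string
    put_str(&mut out, arch);
    out
}

fn main() -> io::Result<()> {
    let arch = read_architecture(&mut Cursor::new(sample_gguf("llama")))?;
    println!("{arch:?}");
    Ok(())
}
```

A loader along these lines could peek at the header first and only then hand the reader to the architecture-specific code, which is roughly the ordering problem noted for load_dynamic above.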

llm could do the following:

  • convert old models to GGUF, prompting the user for any missing data
  • implement the migration tool mentioned in the spec, which converts models automatically for users based on hashes
