Feature Request: convert_hf_to_gguf.py should support conversion of DeepSeek-R1-0528-FP4 #15415

@luke-8-pro

Description

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

I'm trying to use convert_hf_to_gguf.py to convert the DeepSeek-R1-0528-FP4 safetensors files into GGUF format. I hope llama.cpp can be extended to fully support the conversion of DeepSeek-R1-0528-FP4 and other models quantized in the NVFP4 format.
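For reference, the conversion was attempted with the script's standard CLI, e.g. `python convert_hf_to_gguf.py /path/to/DeepSeek-R1-0528-FP4 --outfile deepseek-r1-0528-fp4.gguf --outtype f16` (the model path is illustrative; --outfile and --outtype are the script's existing flags).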

Motivation

Adding support for converting the DeepSeek-R1-0528-FP4 model into the GGUF format via convert_hf_to_gguf.py is a critical enhancement for the llama.cpp ecosystem. DeepSeek-R1-0528-FP4 is an NVFP4-quantized release of DeepSeek's recently released, high-performance DeepSeek-R1-0528 model, which features strong reasoning capabilities and is optimized for efficient inference. As it gains attention within the open-source community, more and more users want to run it locally on consumer-grade hardware, precisely the use case that llama.cpp is designed to enable.

However, llama.cpp only supports models in the GGUF format, which is specifically designed for memory efficiency, flexible quantization, and fast CPU/GPU inference. Without a reliable conversion pipeline from Hugging Face (safetensors) to GGUF, users cannot fully leverage the potential of this model within the llama.cpp framework.

Currently, attempts to convert DeepSeek-R1-0528-FP4 using the latest version of convert_hf_to_gguf.py result in runtime errors. These issues likely stem from architectural differences or metadata handling that the script does not yet fully support, such as custom layer configurations, tensor naming conventions, or FP4-specific quantization logic.
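To make the FP4-specific part concrete, here is a minimal dequantization sketch, assuming NVFP4's published layout: E2M1 (4-bit float) elements packed two per byte in blocks of 16, one FP8 (E4M3) scale per block, and one per-tensor scale. The nibble order and the helper itself are my assumptions, not a confirmed description of the checkpoint:

```python
import numpy as np

# Magnitudes of the 8 non-negative E2M1 (FP4) codes; bit 3 of each code is the sign.
FP4_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)

def dequantize_nvfp4(packed: np.ndarray, block_scales: np.ndarray,
                     tensor_scale: float, block_size: int = 16) -> np.ndarray:
    """Dequantize NVFP4-packed data to float32.

    packed       -- flat uint8 array, two FP4 codes per byte (low nibble first
                    is assumed here)
    block_scales -- one scale per block of `block_size` elements (stored as FP8
                    E4M3 in NVFP4; assumed already upcast to float32)
    tensor_scale -- per-tensor float32 scale
    """
    lo = packed & 0x0F
    hi = packed >> 4
    codes = np.empty(packed.size * 2, dtype=np.uint8)
    codes[0::2] = lo  # interleave the two nibbles back into element order
    codes[1::2] = hi
    signs = np.where(codes & 0x8, -1.0, 1.0).astype(np.float32)
    vals = signs * FP4_VALUES[codes & 0x7]
    vals = vals.reshape(-1, block_size) * block_scales.reshape(-1, 1)
    return (vals * np.float32(tensor_scale)).reshape(-1)
```

With something along these lines, the converter could dequantize NVFP4 tensors to F16/F32 on load and then requantize them to whatever GGUF type the user requests, rather than needing a native FP4 type in GGUF itself.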

Enabling this conversion would provide several key benefits:

  • Local Inference Accessibility: Users could run DeepSeek-R1-0528-FP4 directly on personal devices without relying on cloud APIs, ensuring data privacy, low latency, and offline usability.
  • Efficient Hardware Utilization: The advanced quantization options in GGUF allow users to run large models on systems with limited VRAM or even on CPUs, significantly lowering the hardware barrier to entry.
  • Enhanced Compatibility with Existing Tools: Once converted, the model can be seamlessly integrated into a wide range of llama.cpp frontends (e.g., LM Studio, Ollama, text-generation-webui), improving user experience and ecosystem interoperability.
  • Community Empowerment: Providing official or community-supported conversion enables broader adoption, fine-tuning, and the development of downstream applications built upon this powerful model.

In summary, supporting the conversion of DeepSeek-R1-0528-FP4 in convert_hf_to_gguf.py is not merely a technical improvement; it is a strategic step toward maintaining inclusivity, performance, and future-readiness within the local LLM community. Resolving the current conversion issues will empower users to harness one of the most promising open-source models of 2025 within the efficient and versatile llama.cpp runtime.

Possible Implementation

No response
