
Conversation

hongkongkiwi

Summary

  • Adds load_from_splits() method to LlamaModel for loading models split across multiple files
  • Adds split_path() utility function to generate standardized split file paths
  • Adds split_prefix() utility function to extract prefix from split file paths
  • Includes comprehensive example demonstrating split model usage

Features Added

  • Split Model Loading: Load large models split across multiple GGUF files
  • Path Utilities: Helper functions for working with split file naming conventions
  • Complete Example: Working example with command-line interface and auto-detection

API Changes

  • LlamaModel::load_from_splits(&backend, &[impl AsRef<Path>], &params) -> Result<Self, LlamaModelLoadError>
  • LlamaModel::split_path(prefix: &str, split_no: i32, split_count: i32) -> String
  • LlamaModel::split_prefix(split_path: &str, split_no: i32, split_count: i32) -> Option<String>
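
For orientation, a minimal usage sketch built only from the signatures above (not the PR's actual example code). The module paths follow the published llama-cpp-2 crate, the binary name mirrors this PR's `split_model` example, and `split_no` is assumed 0-indexed as in upstream llama.cpp; all of these may differ slightly on this branch.

```rust
// Minimal sketch: load one logical model from its split GGUF files.
use llama_cpp_2::llama_backend::LlamaBackend;
use llama_cpp_2::model::{params::LlamaModelParams, LlamaModel};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let backend = LlamaBackend::init()?;
    let params = LlamaModelParams::default();

    // Take the split files from the command line, e.g.
    //   split_model model-00001-of-00002.gguf model-00002-of-00002.gguf
    let paths: Vec<String> = std::env::args().skip(1).collect();

    // Load a single model from all of its splits.
    let _model = LlamaModel::load_from_splits(&backend, &paths, &params)?;
    println!("loaded model from {} split file(s)", paths.len());

    // The path helpers generate and recover the standardized names
    // (split_no assumed 0-indexed here, matching llama.cpp).
    let first = LlamaModel::split_path("model", 0, 2);
    assert_eq!(
        LlamaModel::split_prefix(&first, 0, 2).as_deref(),
        Some("model")
    );
    Ok(())
}
```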

Test Plan

  • Builds successfully with no compilation errors
  • API matches upstream llama.cpp functions
  • Example compiles and runs without warnings
  • All documentation passes linting checks

Technical Details

The implementation uses the underlying llama_model_load_from_splits, llama_split_path, and llama_split_prefix functions from llama.cpp, providing safe Rust wrappers with proper error handling and memory management.
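
As a rough illustration of that wrapping pattern (not the PR's actual code), a safe `split_path` wrapper over the C API might look like the sketch below. It assumes llama-cpp-sys-2 exposes `llama_split_path` with the usual bindgen mapping (`size_t` → `usize`, `int` → `i32`) and that the function returns the number of bytes written; the buffer size is arbitrary.

```rust
// Illustrative only: a buffer-based safe wrapper over llama_split_path.
use std::ffi::CString;
use std::os::raw::c_char;

pub fn split_path(prefix: &str, split_no: i32, split_count: i32) -> String {
    let prefix_c = CString::new(prefix).expect("prefix must not contain NUL");
    // llama_split_path writes the formatted path into a caller-provided
    // buffer and returns its length.
    let mut buf = vec![0u8; 1024];
    let len = unsafe {
        llama_cpp_sys_2::llama_split_path(
            buf.as_mut_ptr() as *mut c_char,
            buf.len(),
            prefix_c.as_ptr(),
            split_no,
            split_count,
        )
    };
    buf.truncate(len as usize);
    String::from_utf8(buf).expect("llama.cpp returned invalid UTF-8")
}
```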

🤖 Generated with Claude Code

hongkongkiwi and others added 3 commits August 23, 2025 16:26
This commit introduces comprehensive support for loading models from multiple split files:

- Added `load_from_splits()` method to LlamaModel for loading models split across multiple files
- Added utility functions `split_path()` and `split_prefix()` for working with split file naming conventions
- Added split_model example demonstrating usage of the split loading functionality
- Updated workspace Cargo.toml to include the new split_model example

This feature enables loading very large models that have been split due to filesystem
limitations or distribution requirements.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Remove unused Path import from split_model example
- Remove RPC example from workspace members on split-model-loading branch

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Added documentation comments for RopeType enum variants
- Ensures all public APIs are properly documented

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@hongkongkiwi
Author

This basically adds missing functionality to the Rust library that's present in llama.cpp's tools dir.

I think it's worth keeping feature parity so all llama.cpp features can be used.

…r split-model-loading

This commit adds the scalable dynamic tools building system to the split-model-loading branch:

- Adds generate_tools_cmake() function to dynamically create tools/CMakeLists.txt
- Only builds tools for enabled features (solving the issue raised in PR utilityai#806)
- Split model loading doesn't require tools but maintains architecture consistency
- Includes tools/CMakeLists.txt in Cargo.toml for build system compatibility
- Uses feature-based conditional compilation for future extensibility

This creates a merge-friendly architecture where each feature branch can extend
tool building without conflicts.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
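
For context on the mechanism this commit describes, a build.rs sketch of feature-gated generation of `tools/CMakeLists.txt` might look like the following. Only the `generate_tools_cmake()` name and the output file come from the commit; the feature name and CMake contents are illustrative.

```rust
// Hypothetical build.rs sketch of feature-gated tools generation.
use std::fs;
use std::path::Path;

fn generate_tools_cmake(tools_dir: &Path) -> std::io::Result<()> {
    let mut cmake = String::from("# Auto-generated by build.rs; do not edit.\n");
    // Cargo exports CARGO_FEATURE_<NAME> for every enabled feature, so each
    // feature branch can append its own tool without touching the others.
    if std::env::var("CARGO_FEATURE_RPC").is_ok() {
        cmake.push_str("add_subdirectory(rpc)\n");
    }
    fs::write(tools_dir.join("CMakeLists.txt"), cmake)
}
```
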
@MarcusDunn
Contributor

This seems to include #810; please separate these PRs properly.
