* add wan vae support
* add wan model support
* add umt5 support
* add wan2.1 t2i support
* make flash attn work with wan
* make wan a little faster
* add wan2.1 t2v support
* add wan gguf support
* add offload params to cpu support
* add wan2.1 i2v support
* crop image before resize
* set default fps to 16
* add diff lora support
* fix wan2.1 i2v
* introduce sd_sample_params_t
* add wan2.2 t2v support
* add wan2.2 14B i2v support
* add wan2.2 ti2v support
* add high noise lora support
* sync: update ggml submodule url
* avoid build failure on linux
* avoid build failure
* update ggml
* update ggml
* fix sd_version_is_wan
* update ggml, fix cpu im2col_3d
* fix ggml_nn_attention_ext mask
* add cache support to ggml runner
* fix the issue of illegal memory access
* unify image loading processing
* add wan2.1/2.2 FLF2V support
* fix end_image mask
* update to latest ggml
* add GGUFReader
* update docs
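One commit above, "crop image before resize", names a standard preprocessing step: center-crop the input to the target aspect ratio first, so the subsequent resize never distorts the image. A minimal illustrative sketch of that idea (not the project's actual code; the function name and signature are hypothetical):

```python
def center_crop_to_aspect(width, height, target_w, target_h):
    """Return (x, y, crop_w, crop_h): the largest centered region of a
    width x height image that has the target aspect ratio."""
    target_ratio = target_w / target_h
    if width / height > target_ratio:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = height
        crop_w = round(height * target_ratio)
    else:
        # Source is taller than the target: keep full width, trim top/bottom.
        crop_w = width
        crop_h = round(width / target_ratio)
    x = (width - crop_w) // 2
    y = (height - crop_h) // 2
    return x, y, crop_w, crop_h

# A 1920x1080 input cropped for a square 512x512 target:
print(center_crop_to_aspect(1920, 1080, 512, 512))  # (420, 0, 1080, 1080)
```

After this crop, scaling the region to the target size preserves proportions.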
`README.md` (48 additions, 20 deletions)
```diff
@@ -4,19 +4,33 @@

 # stable-diffusion.cpp

-Inference of Stable Diffusion and Flux in pure C/C++
+Diffusion model(SD,Flux,Wan,...) inference in pure C/C++
+
+***Note that this project is under active development. \
+API and command-line parameters may change frequently.***

 ## Features

 - Plain C/C++ implementation based on [ggml](https://github.com/ggerganov/ggml), working in the same way as [llama.cpp](https://github.com/ggerganov/llama.cpp)
 - Super lightweight and without external dependencies
-- SD1.x, SD2.x, SDXL and [SD3/SD3.5](./docs/sd3.md) support
-    - !!!The VAE in SDXL encounters NaN issues under FP16, but unfortunately, the ggml_conv_2d only operates under FP16. Hence, a parameter is needed to specify the VAE that has fixed the FP16 NaN issue. You can find it here: [SDXL VAE FP16 Fix](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors).
-- [Flux-dev/Flux-schnell Support](./docs/flux.md)
-- [FLUX.1-Kontext-dev](./docs/kontext.md)
-- [Chroma](./docs/chroma.md)
-- [SD-Turbo](https://huggingface.co/stabilityai/sd-turbo) and [SDXL-Turbo](https://huggingface.co/stabilityai/sdxl-turbo) support
…
+    - !!!The VAE in SDXL encounters NaN issues under FP16, but unfortunately, the ggml_conv_2d only operates under FP16. Hence, a parameter is needed to specify the VAE that has fixed the FP16 NaN issue. You can find it here: [SDXL VAE FP16 Fix](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors).
+- LoRA support, same as [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#lora)
+- Latent Consistency Models support (LCM/LCM-LoRA)
+- Faster and memory efficient latent decoding with [TAESD](https://github.com/madebyollin/taesd)
+- Upscale images generated with [ESRGAN](https://github.com/xinntao/Real-ESRGAN)
 - 16-bit, 32-bit float support
 - 2-bit, 3-bit, 4-bit, 5-bit and 8-bit integer quantization support
 - Accelerated memory-efficient CPU inference
@@ -26,15 +40,9 @@ Inference of Stable Diffusion and Flux in pure C/C++
 - Can load ckpt, safetensors and diffusers models/checkpoints. Standalone VAEs models
     - No need to convert to `.ggml` or `.gguf` anymore!
 - Flash Attention for memory usage optimization
-- Original `txt2img` and `img2img` mode
 - Negative prompt
 - [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) style tokenizer (not all the features, only token weighting for now)
-- LoRA support, same as [stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#lora)
-- Latent Consistency Models support (LCM/LCM-LoRA)
-- Faster and memory efficient latent decoding with [TAESD](https://github.com/madebyollin/taesd)
-- Upscale images generated with [ESRGAN](https://github.com/xinntao/Real-ESRGAN)
 - VAE tiling processing for reduce memory usage
-- Control Net support with SD 1.5
 - Sampling method
     - `Euler A`
     - `Euler`
@@ -287,8 +295,10 @@ arguments:
                                    If threads <= 0, then threads will be set to the number of CPU physical cores
   -m, --model [MODEL]              path to full model
   --diffusion-model                path to the standalone diffusion model
+  --high-noise-diffusion-model     path to the standalone high noise diffusion model
   --clip_l                         path to the clip-l text encoder
   --clip_g                         path to the clip-g text encoder
+  --clip_vision                    path to the clip-vision encoder
   --t5xxl                          path to the t5xxl text encoder
   --vae [VAE]                      path to vae
   --taesd [TAESD_PATH]             path to taesd. Using Tiny AutoEncoder for fast decoding (low quality)
@@ -303,8 +313,9 @@ arguments:
                                    If not specified, the default is the type of the weight file
   --tensor-type-rules [EXPRESSION] weight type per tensor pattern (example: "^vae\.=f16,model\.=q8_0")
   --lora-model-dir [DIR]           lora model directory
-  -i, --init-img [IMAGE]           path to the input image, required by img2img
+  -i, --init-img [IMAGE]           path to the init image, required by img2img
   --mask [MASK]                    path to the mask image, required by img2img with mask
+  -i, --end-img [IMAGE]            path to the end image, required by flf2v
   --control-image [IMAGE]          path to image condition, control net
   -r, --ref-image [PATH]           reference image for Flux Kontext models (can be used multiple times)
   -o, --output OUTPUT              path to write result image to (default: ./output.png)
@@ -319,21 +330,34 @@ arguments:
   --skip-layers LAYERS             Layers to skip for SLG steps: (default: [7,8,9])
…
```
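The `--tensor-type-rules` option documented above maps tensor-name patterns to weight types via comma-separated `pattern=type` pairs. A minimal Python sketch of how such rules could be resolved; first-match-wins and unanchored regex search are assumptions for illustration, not the project's actual implementation:

```python
import re

def resolve_tensor_type(rules, tensor_name):
    """Return the weight type of the first rule whose regex matches
    tensor_name, or None to fall back to the weight file's own type.

    `rules` follows the --tensor-type-rules syntax from the help text,
    e.g. r"^vae\.=f16,model\.=q8_0".
    """
    for rule in rules.split(","):
        pattern, _, weight_type = rule.partition("=")
        if re.search(pattern, tensor_name):
            return weight_type
    return None

rules = r"^vae\.=f16,model\.=q8_0"
print(resolve_tensor_type(rules, "vae.decoder.conv_in.weight"))        # f16
print(resolve_tensor_type(rules, "model.diffusion_model.out.weight"))  # q8_0
print(resolve_tensor_type(rules, "text_encoders.clip_l.logit_scale"))  # None
```

With the example expression from the help text, VAE tensors stay at f16 (avoiding quantization artifacts in decoding) while the diffusion model is quantized to q8_0.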