Commit 4df7a22

Merge pull request #2 from AbdullahAlfaraj/euler_a_redesign_merge
Euler A redesign merge
2 parents 700c85d + a659e02 commit 4df7a22

105 files changed: +5337 additions, -1461 deletions

.github/workflows/pr_tests.yml

Lines changed: 1 addition & 1 deletion

@@ -21,7 +21,7 @@ jobs:
     runs-on: [ self-hosted, docker-gpu ]
     container:
       image: python:3.7
-      options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
+      options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/

     steps:
     - name: Checkout diffusers

.github/workflows/push_tests.yml

Lines changed: 4 additions & 12 deletions

@@ -15,14 +15,10 @@ env:
 jobs:
   run_tests_single_gpu:
     name: Diffusers tests
-    strategy:
-      fail-fast: false
-      matrix:
-        machine_type: [ single-gpu ]
-    runs-on: [ self-hosted, docker-gpu, '${{ matrix.machine_type }}' ]
+    runs-on: [ self-hosted, docker-gpu, single-gpu ]
     container:
       image: nvcr.io/nvidia/pytorch:22.07-py3
-      options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
+      options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache

     steps:
     - name: Checkout diffusers
@@ -66,14 +62,10 @@ jobs:

   run_examples_single_gpu:
     name: Examples tests
-    strategy:
-      fail-fast: false
-      matrix:
-        machine_type: [ single-gpu ]
-    runs-on: [ self-hosted, docker-gpu, '${{ matrix.machine_type }}' ]
+    runs-on: [ self-hosted, docker-gpu, single-gpu ]
     container:
       image: nvcr.io/nvidia/pytorch:22.07-py3
-      options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
+      options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache

     steps:
     - name: Checkout diffusers

README.md

Lines changed: 48 additions & 31 deletions
@@ -74,17 +74,18 @@ You need to accept the model license before downloading or using the Stable Diff

 ### Text-to-Image generation with Stable Diffusion

+We recommend using the model in [half-precision (`fp16`)](https://pytorch.org/blog/accelerating-training-on-nvidia-gpus-with-pytorch-automatic-mixed-precision/) as it almost always gives the same results as full
+precision while being roughly twice as fast and requiring half the amount of GPU RAM.
+
 ```python
 # make sure you're logged in with `huggingface-cli login`
-from torch import autocast
 from diffusers import StableDiffusionPipeline

-pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
+pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16, revision="fp16")
 pipe = pipe.to("cuda")

 prompt = "a photo of an astronaut riding a horse on mars"
-with autocast("cuda"):
-    image = pipe(prompt).images[0]
+image = pipe(prompt).images[0]
 ```

 **Note**: If you don't want to use the token, you can also simply download the model weights
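Note on the hunk above: loading the `fp16` revision with `torch_dtype=torch.float16` keeps the weights in half precision end to end, which is why the `with autocast("cuda"):` wrapper is dropped throughout this commit. A minimal self-contained sketch of the new path (it additionally needs `import torch`, which sits outside the hunk context, and assumes a CUDA GPU plus access to the gated CompVis weights):

```python
# Sketch of the fp16 path introduced above; assumes you are logged in
# via `huggingface-cli login` for the gated weights.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",            # checkpoint branch stored in half precision
    torch_dtype=torch.float16,  # keep weights in fp16 after loading
)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]  # no autocast context needed
image.save("astronaut_rides_horse.png")
```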
@@ -104,30 +105,27 @@ pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion-v1-4")
 pipe = pipe.to("cuda")

 prompt = "a photo of an astronaut riding a horse on mars"
-with autocast("cuda"):
-    image = pipe(prompt).images[0]
+image = pipe(prompt).images[0]
 ```

-If you are limited by GPU memory, you might want to consider using the model in `fp16` as
-well as chunking the attention computation.
+If you are limited by GPU memory, you might want to consider chunking the attention computation in addition
+to using `fp16`.
 The following snippet should result in less than 4GB VRAM.

 ```python
 pipe = StableDiffusionPipeline.from_pretrained(
     "CompVis/stable-diffusion-v1-4",
     revision="fp16",
     torch_dtype=torch.float16,
-    use_auth_token=True
 )
 pipe = pipe.to("cuda")

 prompt = "a photo of an astronaut riding a horse on mars"
 pipe.enable_attention_slicing()
-with autocast("cuda"):
-    image = pipe(prompt).images[0]
+image = pipe(prompt).images[0]
 ```

-Finally, if you wish to use a different scheduler, you can simply instantiate
+If you wish to use a different scheduler, you can simply instantiate
 it before the pipeline and pass it to `from_pretrained`.

 ```python
@@ -144,13 +142,29 @@ pipe = StableDiffusionPipeline.from_pretrained(
     revision="fp16",
     torch_dtype=torch.float16,
     scheduler=lms,
-    use_auth_token=True
 )
 pipe = pipe.to("cuda")

 prompt = "a photo of an astronaut riding a horse on mars"
-with autocast("cuda"):
-    image = pipe(prompt).images[0]
+image = pipe(prompt).images[0]
+
+image.save("astronaut_rides_horse.png")
+```
+
+If you want to run Stable Diffusion on CPU or you want to have maximum precision on GPU,
+please run the model in the default *full-precision* setting:
+
+```python
+# make sure you're logged in with `huggingface-cli login`
+from diffusers import StableDiffusionPipeline
+
+pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
+
+# disable the following line if you run on CPU
+pipe = pipe.to("cuda")
+
+prompt = "a photo of an astronaut riding a horse on mars"
+image = pipe(prompt).images[0]

 image.save("astronaut_rides_horse.png")
 ```
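The `lms` object passed as `scheduler=lms` above is created before the pipeline; its construction falls outside this hunk's context. A sketch of what that setup plausibly looks like (the beta arguments shown are the commonly documented Stable Diffusion values, an assumption rather than part of this diff):

```python
import torch
from diffusers import LMSDiscreteScheduler, StableDiffusionPipeline

# build the scheduler first, then hand it to from_pretrained
lms = LMSDiscreteScheduler(
    beta_start=0.00085,            # assumed: standard SD noise schedule
    beta_end=0.012,
    beta_schedule="scaled_linear",
)

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
    scheduler=lms,  # overrides the checkpoint's default scheduler
)
pipe = pipe.to("cuda")
```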
@@ -160,7 +174,6 @@ image.save("astronaut_rides_horse.png")
 The `StableDiffusionImg2ImgPipeline` lets you pass a text prompt and an initial image to condition the generation of new images.

 ```python
-from torch import autocast
 import requests
 import torch
 from PIL import Image
@@ -175,10 +188,9 @@ pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
     model_id_or_path,
     revision="fp16",
     torch_dtype=torch.float16,
-    use_auth_token=True
 )
 # or download via git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
-# and pass `model_id_or_path="./stable-diffusion-v1-4"` without having to use `use_auth_token=True`.
+# and pass `model_id_or_path="./stable-diffusion-v1-4"`.
 pipe = pipe.to(device)

 # let's download an initial image
@@ -190,8 +202,7 @@ init_image = init_image.resize((768, 512))

 prompt = "A fantasy landscape, trending on artstation"

-with autocast("cuda"):
-    images = pipe(prompt=prompt, init_image=init_image, strength=0.75, guidance_scale=7.5).images
+images = pipe(prompt=prompt, init_image=init_image, strength=0.75, guidance_scale=7.5).images

 images[0].save("fantasy_landscape.png")
 ```
@@ -204,7 +215,6 @@ The `StableDiffusionInpaintPipeline` lets you edit specific parts of an image by
 ```python
 from io import BytesIO

-from torch import autocast
 import torch
 import requests
 import PIL
@@ -227,15 +237,13 @@ pipe = StableDiffusionInpaintPipeline.from_pretrained(
     model_id_or_path,
     revision="fp16",
     torch_dtype=torch.float16,
-    use_auth_token=True
 )
 # or download via git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
-# and pass `model_id_or_path="./stable-diffusion-v1-4"` without having to use `use_auth_token=True`.
+# and pass `model_id_or_path="./stable-diffusion-v1-4"`.
 pipe = pipe.to(device)

 prompt = "a cat sitting on a bench"
-with autocast("cuda"):
-    images = pipe(prompt=prompt, init_image=init_image, mask_image=mask_image, strength=0.75).images
+images = pipe(prompt=prompt, init_image=init_image, mask_image=mask_image, strength=0.75).images

 images[0].save("cat_on_bench.png")
 ```
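`init_image` and `mask_image` in this hunk come from a `download_image` helper defined above the hunk context (it appears as `download_image(mask_url).resize((512, 512))` in the overview.mdx diff further down). A sketch consistent with the imports shown, using placeholder URLs:

```python
import requests
import PIL
from io import BytesIO

def download_image(url):
    # fetch an image over HTTP and normalize it to an RGB PIL image
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")

# placeholder URLs; substitute any reachable image/mask pair
img_url = "https://example.com/overture-creations.png"
mask_url = "https://example.com/overture-creations_mask.png"

init_image = download_image(img_url).resize((512, 512))
mask_image = download_image(mask_url).resize((512, 512))
```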
@@ -258,7 +266,6 @@ If you want to run the code yourself 💻, you can try out:
 - [Text-to-Image Latent Diffusion](https://huggingface.co/CompVis/ldm-text2im-large-256)
 ```python
 # !pip install diffusers transformers
-from torch import autocast
 from diffusers import DiffusionPipeline

 device = "cuda"
@@ -270,16 +277,14 @@ ldm = ldm.to(device)

 # run pipeline in inference (sample random noise and denoise)
 prompt = "A painting of a squirrel eating a burger"
-with autocast(device):
-    image = ldm([prompt], num_inference_steps=50, eta=0.3, guidance_scale=6).images[0]
+image = ldm([prompt], num_inference_steps=50, eta=0.3, guidance_scale=6).images[0]

 # save image
 image.save("squirrel.png")
 ```
 - [Unconditional Diffusion with discrete scheduler](https://huggingface.co/google/ddpm-celebahq-256)
 ```python
 # !pip install diffusers
-from torch import autocast
 from diffusers import DDPMPipeline, DDIMPipeline, PNDMPipeline

 model_id = "google/ddpm-celebahq-256"
@@ -290,8 +295,7 @@ ddpm = DDPMPipeline.from_pretrained(model_id) # you can replace DDPMPipeline wi
 ddpm.to(device)

 # run pipeline in inference (sample random noise and denoise)
-with autocast("cuda"):
-    image = ddpm().images[0]
+image = ddpm().images[0]

 # save image
 image.save("ddpm_generated_image.png")
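The truncated comment in the hunk header ("you can replace DDPMPipeline wi…") refers to the alternative samplers imported above. A sketch of the DDIM variant on the same checkpoint (the step count is an arbitrary illustrative choice, not part of this diff):

```python
from diffusers import DDIMPipeline

model_id = "google/ddpm-celebahq-256"

# same checkpoint, different sampler: DDIM typically needs far fewer
# denoising steps than the full DDPM schedule
ddim = DDIMPipeline.from_pretrained(model_id)
ddim.to("cuda")

image = ddim(num_inference_steps=50).images[0]
image.save("ddim_generated_image.png")
```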
@@ -377,3 +381,16 @@ This library concretizes previous work by many different authors and would not h
 - @yang-song's Score-VE and Score-VP implementations, available [here](https://github.com/yang-song/score_sde_pytorch)

 We also want to thank @heejkoo for the very helpful overview of papers, code and resources on diffusion models, available [here](https://github.com/heejkoo/Awesome-Diffusion-Models) as well as @crowsonkb and @rromb for useful discussions and insights.
+
+## Citation
+
+```bibtex
+@misc{von-platen-etal-2022-diffusers,
+  author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
+  title = {Diffusers: State-of-the-art diffusion models},
+  year = {2022},
+  publisher = {GitHub},
+  journal = {GitHub repository},
+  howpublished = {\url{https://github.com/huggingface/diffusers}}
+}
+```

docs/source/_toctree.yml

Lines changed: 2 additions & 0 deletions

@@ -12,6 +12,8 @@
     title: "Loading Pipelines, Models, and Schedulers"
   - local: using-diffusers/configuration
     title: "Configuring Pipelines, Models, and Schedulers"
+  - local: using-diffusers/custom_pipelines
+    title: "Loading and Creating Custom Pipelines"
   title: "Loading"
 - sections:
   - local: using-diffusers/unconditional_image_generation

docs/source/api/pipelines/overview.mdx

Lines changed: 6 additions & 12 deletions

@@ -98,15 +98,13 @@ logic including pre-processing, an unrolled diffusion loop, and post-processing

 ```python
 # make sure you're logged in with `huggingface-cli login`
-from torch import autocast
 from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler

-pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
+pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
 pipe = pipe.to("cuda")

 prompt = "a photo of an astronaut riding a horse on mars"
-with autocast("cuda"):
-    image = pipe(prompt).images[0]
+image = pipe(prompt).images[0]

 image.save("astronaut_rides_horse.png")
 ```
@@ -116,7 +114,6 @@ image.save("astronaut_rides_horse.png")
 The `StableDiffusionImg2ImgPipeline` lets you pass a text prompt and an initial image to condition the generation of new images.

 ```python
-from torch import autocast
 import requests
 from PIL import Image
 from io import BytesIO
@@ -126,7 +123,7 @@ from diffusers import StableDiffusionImg2ImgPipeline
 # load the pipeline
 device = "cuda"
 pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16, use_auth_token=True
+    "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16
 ).to(device)

 # let's download an initial image
@@ -138,8 +135,7 @@ init_image = init_image.resize((768, 512))

 prompt = "A fantasy landscape, trending on artstation"

-with autocast("cuda"):
-    images = pipe(prompt=prompt, init_image=init_image, strength=0.75, guidance_scale=7.5).images
+images = pipe(prompt=prompt, init_image=init_image, strength=0.75, guidance_scale=7.5).images

 images[0].save("fantasy_landscape.png")
 ```
@@ -157,7 +153,6 @@ The `StableDiffusionInpaintPipeline` lets you edit specific parts of an image by
 ```python
 from io import BytesIO

-from torch import autocast
 import requests
 import PIL

@@ -177,12 +172,11 @@ mask_image = download_image(mask_url).resize((512, 512))

 device = "cuda"
 pipe = StableDiffusionInpaintPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16, use_auth_token=True
+    "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16
 ).to(device)

 prompt = "a cat sitting on a bench"
-with autocast("cuda"):
-    images = pipe(prompt=prompt, init_image=init_image, mask_image=mask_image, strength=0.75).images
+images = pipe(prompt=prompt, init_image=init_image, mask_image=mask_image, strength=0.75).images

 images[0].save("cat_on_bench.png")
 ```

docs/source/api/schedulers.mdx

Lines changed: 2 additions & 3 deletions

@@ -36,16 +36,15 @@ This allows for rapid experimentation and cleaner abstractions in the code, wher
 To this end, the design of schedulers is such that:

 - Schedulers can be used interchangeably between diffusion models in inference to find the preferred trade-off between speed and generation quality.
-- Schedulers are currently by default in PyTorch, but are designed to be framework independent (partial Numpy support currently exists).
+- Schedulers are currently by default in PyTorch, but are designed to be framework independent (partial Jax support currently exists).


 ## API

 The core API for any new scheduler must follow a limited structure.
 - Schedulers should provide one or more `def step(...)` functions that should be called to update the generated sample iteratively.
 - Schedulers should provide a `set_timesteps(...)` method that configures the parameters of a schedule function for a specific inference task.
-- Schedulers should be framework-agnostic, but provide a simple functionality to convert the scheduler into a specific framework, such as PyTorch
-with a `set_format(...)` method.
+- Schedulers should be framework-specific.

 The base class [`SchedulerMixin`] implements low level utilities used by multiple schedulers.
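To make the `step(...)` / `set_timesteps(...)` contract described in this file concrete, here is a schematic denoising loop; the randomly initialized model and the shapes are placeholders, so the output is noise rather than an image:

```python
import torch
from diffusers import DDPMScheduler, UNet2DModel

# placeholder model: randomly initialized, so this only illustrates the API
model = UNet2DModel(sample_size=64, in_channels=3, out_channels=3)
scheduler = DDPMScheduler(num_train_timesteps=1000)

# set_timesteps(...) configures the schedule for this inference run
scheduler.set_timesteps(num_inference_steps=50)

sample = torch.randn(1, 3, 64, 64)  # start from pure noise
for t in scheduler.timesteps:
    with torch.no_grad():
        noise_pred = model(sample, t).sample
    # step(...) consumes the model output and returns the sample for the
    # previous (less noisy) timestep
    sample = scheduler.step(noise_pred, t, sample).prev_sample
```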
docs/source/index.mdx

Lines changed: 1 addition & 1 deletion

@@ -35,7 +35,7 @@ available a colab notebook to directly try them out.
 | Pipeline | Paper | Tasks | Colab
 |---|---|:---:|:---:|
 | [ddpm](./api/pipelines/ddpm) | [**Denoising Diffusion Probabilistic Models**](https://arxiv.org/abs/2006.11239) | Unconditional Image Generation |
-| [ddim](./api/pipelines/ddim) | [**Denoising Diffusion Implicit Models**](https://arxiv.org/abs/2010.02502) | Unconditional Image Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb)
+| [ddim](./api/pipelines/ddim) | [**Denoising Diffusion Implicit Models**](https://arxiv.org/abs/2010.02502) | Unconditional Image Generation |
 | [latent_diffusion](./api/pipelines/latent_diffusion) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752)| Text-to-Image Generation |
 | [latent_diffusion_uncond](./api/pipelines/latent_diffusion_uncond) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752) | Unconditional Image Generation |
 | [pndm](./api/pipelines/pndm) | [**Pseudo Numerical Methods for Diffusion Models on Manifolds**](https://arxiv.org/abs/2202.09778) | Unconditional Image Generation |
