
Commit d38c804

Revist, fja, anton-l, and patrickvonplaten authored
feat: add repaint (#974)
* feat: add repaint
* fix: fix quality check with `make fix-copies`
* fix: remove old unnecessary arg
* chore: change default to DDPM (looks better in experiments)
* ".to(device)" changed to "device="
* make generator device-specific
* make generator device-specific and change shape
* fix: add preprocessing for image and mask
* fix: update test
* Update src/diffusers/pipelines/repaint/pipeline_repaint.py
* Add docs and examples
* Fix toctree

Co-authored-by: fja <[email protected]>
Co-authored-by: Anton Lozhkov <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Anton Lozhkov <[email protected]>
1 parent 4a38166 commit d38c804

File tree

13 files changed (+667 / -14 lines)


docs/source/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -96,5 +96,7 @@
       title: "Stochastic Karras VE"
     - local: api/pipelines/dance_diffusion
       title: "Dance Diffusion"
+    - local: api/pipelines/repaint
+      title: "RePaint"
     title: "Pipelines"
   title: "API"

docs/source/api/pipelines/overview.mdx

Lines changed: 15 additions & 13 deletions
@@ -41,19 +41,21 @@ If you are looking for *official* training examples, please have a look at [exam
 The following table summarizes all officially supported pipelines, their corresponding paper, and if
 available a colab notebook to directly try them out.

-| Pipeline | Paper | Tasks | Colab
-|---|---|:---:|:---:|
-| [ddpm](./ddpm) | [**Denoising Diffusion Probabilistic Models**](https://arxiv.org/abs/2006.11239) | Unconditional Image Generation |
-| [ddim](./ddim) | [**Denoising Diffusion Implicit Models**](https://arxiv.org/abs/2010.02502) | Unconditional Image Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb)
-| [latent_diffusion](./latent_diffusion) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752)| Text-to-Image Generation |
-| [latent_diffusion_uncond](./latent_diffusion_uncond) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752) | Unconditional Image Generation |
-| [pndm](./pndm) | [**Pseudo Numerical Methods for Diffusion Models on Manifolds**](https://arxiv.org/abs/2202.09778) | Unconditional Image Generation |
-| [score_sde_ve](./score_sde_ve) | [**Score-Based Generative Modeling through Stochastic Differential Equations**](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
-| [score_sde_vp](./score_sde_vp) | [**Score-Based Generative Modeling through Stochastic Differential Equations**](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
-| [stable_diffusion](./stable_diffusion) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Text-to-Image Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb)
-| [stable_diffusion](./stable_diffusion) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Image-to-Image Text-Guided Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/image_2_image_using_diffusers.ipynb)
-| [stable_diffusion](./stable_diffusion) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Text-Guided Image Inpainting | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/in_painting_with_stable_diffusion_using_diffusers.ipynb)
-| [stochastic_karras_ve](./stochastic_karras_ve) | [**Elucidating the Design Space of Diffusion-Based Generative Models**](https://arxiv.org/abs/2206.00364) | Unconditional Image Generation |
+| Pipeline | Paper | Tasks | Colab
+|------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------:|:---:|
+| [ddpm](./ddpm) | [**Denoising Diffusion Probabilistic Models**](https://arxiv.org/abs/2006.11239) | Unconditional Image Generation |
+| [ddim](./ddim) | [**Denoising Diffusion Implicit Models**](https://arxiv.org/abs/2010.02502) | Unconditional Image Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb)
+| [latent_diffusion](./latent_diffusion) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752) | Text-to-Image Generation |
+| [latent_diffusion_uncond](./latent_diffusion_uncond) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752) | Unconditional Image Generation |
+| [pndm](./pndm) | [**Pseudo Numerical Methods for Diffusion Models on Manifolds**](https://arxiv.org/abs/2202.09778) | Unconditional Image Generation |
+| [score_sde_ve](./score_sde_ve) | [**Score-Based Generative Modeling through Stochastic Differential Equations**](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
+| [score_sde_vp](./score_sde_vp) | [**Score-Based Generative Modeling through Stochastic Differential Equations**](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
+| [stable_diffusion](./stable_diffusion) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Text-to-Image Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb)
+| [stable_diffusion](./stable_diffusion) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Image-to-Image Text-Guided Generation | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/image_2_image_using_diffusers.ipynb)
+| [stable_diffusion](./stable_diffusion) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Text-Guided Image Inpainting | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/in_painting_with_stable_diffusion_using_diffusers.ipynb)
+| [stochastic_karras_ve](./stochastic_karras_ve) | [**Elucidating the Design Space of Diffusion-Based Generative Models**](https://arxiv.org/abs/2206.00364) | Unconditional Image Generation |
+| [repaint](./repaint) | [**RePaint: Inpainting using Denoising Diffusion Probabilistic Models**](https://arxiv.org/abs/2201.09865) | Image Inpainting |

 **Note**: Pipelines are simple examples of how to play around with the diffusion systems as described in the corresponding papers.

docs/source/api/pipelines/repaint.mdx

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# RePaint

## Overview

[RePaint: Inpainting using Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2201.09865) by Andreas Lugmayr, Martin Danelljan, Andres Romero, Fisher Yu, Radu Timofte, Luc Van Gool.

The abstract of the paper is the following:

Free-form inpainting is the task of adding new content to an image in the regions specified by an arbitrary binary mask. Most existing approaches train for a certain distribution of masks, which limits their generalization capabilities to unseen mask types. Furthermore, training with pixel-wise and perceptual losses often leads to simple textural extensions towards the missing areas instead of semantically meaningful generation. In this work, we propose RePaint: A Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks. We employ a pretrained unconditional DDPM as the generative prior. To condition the generation process, we only alter the reverse diffusion iterations by sampling the unmasked regions using the given image information. Since this technique does not modify or condition the original DDPM network itself, the model produces high-quality and diverse output images for any inpainting form. We validate our method for both faces and general-purpose image inpainting using standard and extreme masks.
RePaint outperforms state-of-the-art autoregressive and GAN approaches for at least five out of six mask distributions.

The original codebase can be found [here](https://github.com/andreas128/RePaint).
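The conditioning idea described in the abstract, altering only the reverse diffusion iterations by re-sampling the known pixels from the original image, can be sketched in a few lines. This is only an illustrative sketch of the paper's idea, not the library's implementation; `denoise_one_step` and `noise_to_timestep` are hypothetical helpers standing in for the reverse-diffusion update and the forward (noising) process of a pretrained DDPM.

```python
# Illustrative sketch of RePaint's conditioning step (not the diffusers implementation).
# `denoise_one_step` and `noise_to_timestep` are hypothetical stand-ins for the
# reverse-diffusion update and the forward (noising) process of a pretrained DDPM.
def repaint_step(x_t, t, original_image, mask, denoise_one_step, noise_to_timestep):
    x_unknown = denoise_one_step(x_t, t)             # regular reverse step for the unknown region
    x_known = noise_to_timestep(original_image, t)   # known pixels noised to the same timestep t
    # Keep known pixels where mask == 1, generated pixels where mask == 0.
    return mask * x_known + (1.0 - mask) * x_unknown
```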
## Available Pipelines:

| Pipeline | Tasks | Colab
|-------------------------------------------------------------------------------------------------------------------------------|--------------------|:---:|
| [pipeline_repaint.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/repaint/pipeline_repaint.py) | *Image Inpainting* | - |

## Usage example

```python
from io import BytesIO

import torch

import PIL
import requests
from diffusers import RePaintPipeline, RePaintScheduler


def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")


img_url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/celeba_hq_256.png"
mask_url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/mask_256.png"

# Load the original image and the mask as PIL images
original_image = download_image(img_url).resize((256, 256))
mask_image = download_image(mask_url).resize((256, 256))

# Load the RePaint scheduler and pipeline based on a pretrained DDPM model
scheduler = RePaintScheduler.from_config("google/ddpm-ema-celebahq-256")
pipe = RePaintPipeline.from_pretrained("google/ddpm-ema-celebahq-256", scheduler=scheduler)
pipe = pipe.to("cuda")

generator = torch.Generator(device="cuda").manual_seed(0)
output = pipe(
    original_image=original_image,
    mask_image=mask_image,
    num_inference_steps=250,
    eta=0.0,
    jump_length=10,
    jump_n_sample=10,
    generator=generator,
)
inpainted_image = output.images[0]
```
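As a possible follow-up (not part of the committed example), the inputs and the result can be written out for a quick visual comparison; per the pipeline's docstring, mask values of 0.0 mark the regions to be inpainted and 1.0 the regions to keep. The filenames below are arbitrary.

```python
# Not part of the committed example; filenames are arbitrary.
original_image.save("repaint_original.png")
mask_image.save("repaint_mask.png")
inpainted_image.save("repaint_result.png")
```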
## RePaintPipeline
[[autodoc]] pipelines.repaint.pipeline_repaint.RePaintPipeline
    - __call__

docs/source/api/schedulers.mdx

Lines changed: 11 additions & 1 deletion
@@ -127,4 +127,14 @@ Fast scheduler which often times generates good outputs with 20-30 steps.
 Ancestral sampling with Euler method steps. Based on the original (k-diffusion)[https://github.com/crowsonkb/k-diffusion/blob/481677d114f6ea445aa009cf5bd7a9cdee909e47/k_diffusion/sampling.py#L72] implementation by Katherine Crowson.
 Fast scheduler which often times generates good outputs with 20-30 steps.

-[[autodoc]] EulerAncestralDiscreteScheduler
+[[autodoc]] EulerAncestralDiscreteScheduler
+
+
+#### RePaint scheduler
+
+DDPM-based inpainting scheduler for unsupervised inpainting with extreme masks.
+Intended for use with [`RePaintPipeline`].
+Based on the paper [RePaint: Inpainting using Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2201.09865)
+and the original implementation by Andreas Lugmayr et al.: https://github.com/andreas128/RePaint
+
+[[autodoc]] RePaintScheduler
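The resampling ("jump") schedule controlled by `jump_length` and `jump_n_sample` can be sketched roughly as below. This is an illustrative approximation of the schedule described in the RePaint paper (Figures 9 and 10), not necessarily what `RePaintScheduler.set_timesteps` computes internally.

```python
def jump_schedule(num_inference_steps, jump_length=10, jump_n_sample=10):
    """Rough sketch of a RePaint-style timestep schedule with resampling jumps."""
    # At every `jump_length`-th timestep, schedule `jump_n_sample - 1` extra forward jumps.
    jumps = {t: jump_n_sample - 1 for t in range(0, num_inference_steps - jump_length, jump_length)}

    timesteps = []
    t = num_inference_steps
    while t >= 1:
        t -= 1
        timesteps.append(t)  # denoise: x_t -> x_{t-1}
        if jumps.get(t, 0) > 0:
            jumps[t] -= 1
            # Jump forward in time by `jump_length` steps (re-noise), then denoise back down.
            for _ in range(jump_length):
                t += 1
                timesteps.append(t)
    return timesteps
```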

src/diffusers/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -36,6 +36,7 @@
     KarrasVePipeline,
     LDMPipeline,
     PNDMPipeline,
+    RePaintPipeline,
     ScoreSdeVePipeline,
 )
 from .schedulers import (
@@ -46,6 +47,7 @@
     IPNDMScheduler,
     KarrasVeScheduler,
     PNDMScheduler,
+    RePaintScheduler,
     SchedulerMixin,
     ScoreSdeVeScheduler,
 )

src/diffusers/pipelines/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@
     from .ddpm import DDPMPipeline
     from .latent_diffusion_uncond import LDMPipeline
     from .pndm import PNDMPipeline
+    from .repaint import RePaintPipeline
     from .score_sde_ve import ScoreSdeVePipeline
     from .stochastic_karras_ve import KarrasVePipeline
 else:
src/diffusers/pipelines/repaint/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+from .pipeline_repaint import RePaintPipeline
src/diffusers/pipelines/repaint/pipeline_repaint.py

Lines changed: 140 additions & 0 deletions
@@ -0,0 +1,140 @@
# Copyright 2022 ETH Zurich Computer Vision Lab and The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


from typing import Optional, Tuple, Union

import numpy as np
import torch

import PIL
from tqdm.auto import tqdm

from ...models import UNet2DModel
from ...pipeline_utils import DiffusionPipeline, ImagePipelineOutput
from ...schedulers import RePaintScheduler


def _preprocess_image(image: PIL.Image.Image):
    # Convert a PIL image to a (1, 3, H, W) float tensor scaled to [-1, 1].
    image = np.array(image.convert("RGB"))
    image = image[None].transpose(0, 3, 1, 2)
    image = torch.from_numpy(image).to(dtype=torch.float32) / 127.5 - 1.0
    return image


def _preprocess_mask(mask: PIL.Image.Image):
    # Convert a PIL mask to a binary (1, 1, H, W) float tensor (1 = keep, 0 = inpaint).
    mask = np.array(mask.convert("L"))
    mask = mask.astype(np.float32) / 255.0
    mask = mask[None, None]
    mask[mask < 0.5] = 0
    mask[mask >= 0.5] = 1
    mask = torch.from_numpy(mask)
    return mask


class RePaintPipeline(DiffusionPipeline):
    unet: UNet2DModel
    scheduler: RePaintScheduler

    def __init__(self, unet, scheduler):
        super().__init__()
        self.register_modules(unet=unet, scheduler=scheduler)

    @torch.no_grad()
    def __call__(
        self,
        original_image: Union[torch.FloatTensor, PIL.Image.Image],
        mask_image: Union[torch.FloatTensor, PIL.Image.Image],
        num_inference_steps: int = 250,
        eta: float = 0.0,
        jump_length: int = 10,
        jump_n_sample: int = 10,
        generator: Optional[torch.Generator] = None,
        output_type: Optional[str] = "pil",
        return_dict: bool = True,
    ) -> Union[ImagePipelineOutput, Tuple]:
        r"""
        Args:
            original_image (`torch.FloatTensor` or `PIL.Image.Image`):
                The original image to inpaint on.
            mask_image (`torch.FloatTensor` or `PIL.Image.Image`):
                The mask_image where 0.0 values define which part of the original image to inpaint (change).
            num_inference_steps (`int`, *optional*, defaults to 250):
                The number of denoising steps. More denoising steps usually lead to a higher quality image at the
                expense of slower inference.
            eta (`float`):
                The weight of the added noise in a diffusion step. Its value is between 0.0 and 1.0; 0.0 corresponds
                to DDIM and 1.0 to the DDPM scheduler, respectively.
            jump_length (`int`, *optional*, defaults to 10):
                The number of steps taken forward in time before going backward in time for a single jump ("j" in
                the RePaint paper). Take a look at Figures 9 and 10 in https://arxiv.org/pdf/2201.09865.pdf.
            jump_n_sample (`int`, *optional*, defaults to 10):
                The number of times we will make a forward time jump for a given chosen time sample. Take a look at
                Figures 9 and 10 in https://arxiv.org/pdf/2201.09865.pdf.
            generator (`torch.Generator`, *optional*):
                A [torch generator](https://pytorch.org/docs/stable/generated/torch.Generator.html) to make generation
                deterministic.
            output_type (`str`, *optional*, defaults to `"pil"`):
                The output format of the generated image. Choose between
                [PIL](https://pillow.readthedocs.io/en/stable/): `PIL.Image.Image` or `np.array`.
            return_dict (`bool`, *optional*, defaults to `True`):
                Whether or not to return a [`~pipeline_utils.ImagePipelineOutput`] instead of a plain tuple.

        Returns:
            [`~pipeline_utils.ImagePipelineOutput`] or `tuple`: [`~pipelines.utils.ImagePipelineOutput`] if
            `return_dict` is True, otherwise a `tuple`. When returning a tuple, the first element is a list with the
            generated images.
        """

        if not isinstance(original_image, torch.FloatTensor):
            original_image = _preprocess_image(original_image)
        original_image = original_image.to(self.device)
        if not isinstance(mask_image, torch.FloatTensor):
            mask_image = _preprocess_mask(mask_image)
        mask_image = mask_image.to(self.device)

        # sample gaussian noise to begin the loop
        image = torch.randn(
            original_image.shape,
            generator=generator,
            device=self.device,
        )
        image = image.to(self.device)

        # set step values
        self.scheduler.set_timesteps(num_inference_steps, jump_length, jump_n_sample, self.device)
        self.scheduler.eta = eta

        t_last = self.scheduler.timesteps[0] + 1
        for i, t in enumerate(tqdm(self.scheduler.timesteps)):
            if t < t_last:
                # predict the noise residual
                model_output = self.unet(image, t).sample
                # compute previous image: x_t -> x_t-1
                image = self.scheduler.step(model_output, t, image, original_image, mask_image, generator).prev_sample

            else:
                # compute the reverse: x_t-1 -> x_t
                image = self.scheduler.undo_step(image, t_last, generator)
            t_last = t

        image = (image / 2 + 0.5).clamp(0, 1)
        image = image.cpu().permute(0, 2, 3, 1).numpy()
        if output_type == "pil":
            image = self.numpy_to_pil(image)

        if not return_dict:
            return (image,)

        return ImagePipelineOutput(images=image)
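Because `__call__` also accepts `torch.FloatTensor` inputs (the `isinstance` checks above skip preprocessing for tensors), already-normalized tensors can be passed directly. A minimal sketch, assuming `pipe` and `generator` were built as in the usage example and that the tensors match what `_preprocess_image` / `_preprocess_mask` would produce:

```python
import torch

# Image tensor: (1, 3, 256, 256), values in [-1, 1] -- the range _preprocess_image produces.
original_tensor = torch.zeros(1, 3, 256, 256)
# Mask tensor: (1, 1, 256, 256), 1.0 = keep, 0.0 = inpaint -- as produced by _preprocess_mask.
mask_tensor = torch.ones(1, 1, 256, 256)
mask_tensor[:, :, 96:160, 96:160] = 0.0  # inpaint a square in the center

output = pipe(
    original_image=original_tensor,
    mask_image=mask_tensor,
    num_inference_steps=250,
    generator=generator,
)
inpainted = output.images[0]
```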

src/diffusers/schedulers/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@
 from .scheduling_ipndm import IPNDMScheduler
 from .scheduling_karras_ve import KarrasVeScheduler
 from .scheduling_pndm import PNDMScheduler
+from .scheduling_repaint import RePaintScheduler
 from .scheduling_sde_ve import ScoreSdeVeScheduler
 from .scheduling_sde_vp import ScoreSdeVpScheduler
 from .scheduling_utils import SchedulerMixin
