66 commits
aac7961
Controlnet training code initial commit
Ttl Feb 27, 2023
6985732
Script for adding a controlnet to existing model
Ttl Feb 28, 2023
da73a3e
Fix control image transform
Ttl Mar 1, 2023
6501ca2
Add license header and remove more unused configs
Ttl Mar 2, 2023
62fdaf8
controlnet training readme
Ttl Mar 3, 2023
7a92d6f
Allow nonlocal model in add_controlnet.py
Ttl Mar 3, 2023
fcbe52e
Formatting
Ttl Mar 3, 2023
87e67bc
Remove unused code
Ttl Mar 3, 2023
391112e
Code quality
Ttl Mar 3, 2023
6f34077
Initialize controlnet in training script
Ttl Mar 3, 2023
c375d7b
Formatting
Ttl Mar 3, 2023
c3a9abb
Address review comments
Ttl Mar 7, 2023
388a28d
doc style
Ttl Mar 7, 2023
089d3a8
Merge branch 'main' into controlnet_train
williamberman Mar 7, 2023
1c45006
explicit constructor args and submodule names
williamberman Mar 7, 2023
9e5760b
hub dataset
williamberman Mar 7, 2023
d932d61
empty prompts
williamberman Mar 7, 2023
0e30636
add conditioning image
williamberman Mar 7, 2023
173b0a4
rename
williamberman Mar 7, 2023
3ac81ea
remove instance data dir
williamberman Mar 8, 2023
43b56bb
Merge branch 'main' into controlnet_train
williamberman Mar 8, 2023
3a672ea
image_transforms -> -1,1 . conditioning_image_transformers -> 0, 1
williamberman Mar 8, 2023
e989e9b
nits
williamberman Mar 8, 2023
5452d79
remove local rank config
williamberman Mar 8, 2023
0d0c3a5
validation images
williamberman Mar 9, 2023
40dad23
Merge branch 'main' into controlnet_train
williamberman Mar 9, 2023
2ac68da
proportion_empty_prompts typo
williamberman Mar 10, 2023
eaf8c3a
weight copying to controlnet bug
williamberman Mar 10, 2023
b7d1b15
call log validation fix
williamberman Mar 10, 2023
7b4baea
fix
williamberman Mar 10, 2023
035f664
gitignore wandb
williamberman Mar 10, 2023
b0049a3
fix progress bar and resume from checkpoint iteration
williamberman Mar 10, 2023
f3caf3e
initial step fix
williamberman Mar 11, 2023
e90b04e
log multiple images
williamberman Mar 11, 2023
26e12e2
fix
williamberman Mar 11, 2023
cec5749
fixes
williamberman Mar 11, 2023
124dac4
tracker project name configurable
williamberman Mar 12, 2023
0ce5b4c
misc
williamberman Mar 12, 2023
545cd38
Merge branch 'main' into controlnet_train
williamberman Mar 13, 2023
4c51da9
add controlnet requirements.txt
williamberman Mar 13, 2023
c540085
update docs
williamberman Mar 13, 2023
768df18
image labels
williamberman Mar 13, 2023
3590e4b
small fixes
williamberman Mar 14, 2023
3f405ac
log validation using existing models for pipeline
williamberman Mar 14, 2023
c7c4857
fix for deepspeed saving
williamberman Mar 14, 2023
fc4f089
memory usage docs
williamberman Mar 14, 2023
0f7b132
Update examples/controlnet/train_controlnet.py
williamberman Mar 14, 2023
c8514ba
Update examples/controlnet/train_controlnet.py
williamberman Mar 14, 2023
3b06caf
Update examples/controlnet/README.md
williamberman Mar 14, 2023
7e72a13
Update examples/controlnet/README.md
williamberman Mar 14, 2023
a4c78ff
Update examples/controlnet/README.md
williamberman Mar 14, 2023
49988a2
Update examples/controlnet/README.md
williamberman Mar 14, 2023
e15d274
Update examples/controlnet/README.md
williamberman Mar 14, 2023
7bd26d7
Update examples/controlnet/README.md
williamberman Mar 14, 2023
8f90e36
Update examples/controlnet/README.md
williamberman Mar 14, 2023
ed140aa
Update examples/controlnet/README.md
williamberman Mar 14, 2023
e10289f
remove extra is main process check
williamberman Mar 14, 2023
602cc02
link to dataset in intro paragraph
williamberman Mar 14, 2023
71a7936
remove unnecessary paragraph
williamberman Mar 14, 2023
820aa23
note on deepspeed
williamberman Mar 14, 2023
011ca31
Update examples/controlnet/README.md
williamberman Mar 14, 2023
2dce652
assert -> value error
williamberman Mar 14, 2023
9e87526
weights and biases note
williamberman Mar 14, 2023
6fd13ea
move images out of git
williamberman Mar 14, 2023
cb49716
remove .gitignore
williamberman Mar 14, 2023
a729cb8
Merge branch 'main' into controlnet_train
williamberman Mar 14, 2023
2 changes: 2 additions & 0 deletions .gitignore
@@ -172,3 +172,5 @@ tags

# ruff
.ruff_cache

wandb
269 changes: 269 additions & 0 deletions examples/controlnet/README.md
@@ -0,0 +1,269 @@
# ControlNet training example

[Adding Conditional Control to Text-to-Image Diffusion Models](https://arxiv.org/abs/2302.05543) by Lvmin Zhang and Maneesh Agrawala.

This example is based on the [training example in the original ControlNet repository](https://github.com/lllyasviel/ControlNet/blob/main/docs/train.md). It trains a ControlNet to fill circles using a [small synthetic dataset](https://huggingface.co/datasets/fusing/fill50k).

## Installing the dependencies

Before running the scripts, make sure to install the library's training dependencies:

**Important**

To make sure you can successfully run the latest versions of the example scripts, we highly recommend **installing from source** and keeping the installation up to date, since we update the example scripts frequently and install some example-specific requirements. To do this, execute the following steps in a new virtual environment:
```bash
git clone https://github.com/huggingface/diffusers
cd diffusers
pip install -e .
```

Then cd into the example folder and run:
```bash
pip install -r requirements.txt
```

And initialize an [🤗Accelerate](https://github.com/huggingface/accelerate/) environment with:

```bash
accelerate config
```

Or, for a default accelerate configuration without answering questions about your environment:

```bash
accelerate config default
```

Or, if your environment doesn't support an interactive shell (e.g., a notebook):

```python
from accelerate.utils import write_basic_config
write_basic_config()
```

## Circle filling dataset

The original dataset is hosted in the [ControlNet repo](https://huggingface.co/lllyasviel/ControlNet/blob/main/training/fill50k.zip). We re-uploaded it to be compatible with `datasets` [here](https://huggingface.co/datasets/fusing/fill50k). Note that `datasets` handles dataloading within the training script.
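
If you want to inspect the data before training, you can load it directly with `datasets`. This is a minimal sketch; the column names below (`image`, `conditioning_image`, `text`) are assumed to match the training script's default `--image_column`, `--conditioning_image_column`, and `--caption_column` values:

```py
from datasets import load_dataset

dataset = load_dataset("fusing/fill50k", split="train")
print(dataset)  # number of rows and column names

example = dataset[0]
print(example["text"])  # caption, e.g. "... circle with ... background"
example["image"].save("example_image.png")
example["conditioning_image"].save("example_conditioning.png")
```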

Our training examples use [Stable Diffusion 1.5](https://huggingface.co/runwayml/stable-diffusion-v1-5) because the original set of ControlNet models was trained on it. However, ControlNet can be trained to augment any Stable Diffusion compatible model (such as [CompVis/stable-diffusion-v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4) or [stabilityai/stable-diffusion-2-1](https://huggingface.co/stabilityai/stable-diffusion-2-1)).

## Training

Our training examples use two test conditioning images. They can be downloaded by running:

```sh
wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png

wget https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png
```


```bash
export MODEL_DIR="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path to save model"

accelerate launch train_controlnet.py \
 --pretrained_model_name_or_path=$MODEL_DIR \
 --output_dir=$OUTPUT_DIR \
 --dataset_name=fusing/fill50k \
 --resolution=512 \
 --learning_rate=1e-5 \
 --validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
 --validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
 --train_batch_size=4
```

This default configuration requires ~38 GB of VRAM.

By default, the training script logs outputs to TensorBoard. Pass `--report_to wandb` to use Weights & Biases instead.
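
Reporting to Weights & Biases requires the `wandb` package and an authenticated session; a minimal setup, assuming you already have a W&B account:

```bash
pip install wandb
wandb login  # paste your API key when prompted
```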

Gradient accumulation with a smaller batch size can be used to reduce training requirements to ~20 GB of VRAM; the effective batch size stays at 4 (`--train_batch_size=1` with `--gradient_accumulation_steps=4`).

```bash
export MODEL_DIR="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path to save model"

accelerate launch train_controlnet.py \
 --pretrained_model_name_or_path=$MODEL_DIR \
 --output_dir=$OUTPUT_DIR \
 --dataset_name=fusing/fill50k \
 --resolution=512 \
 --learning_rate=1e-5 \
 --validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
 --validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
 --train_batch_size=1 \
 --gradient_accumulation_steps=4
```

## Example results

#### After 300 steps with batch size 8

| | |
|-------------------|:-------------------------:|
| | red circle with blue background |
| ![conditioning image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png) | ![red circle with blue background](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/red_circle_with_blue_background_300_steps.png) |
| | cyan circle with brown floral background |
| ![conditioning image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png) | ![cyan circle with brown floral background](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/cyan_circle_with_brown_floral_background_300_steps.png) |


#### After 6000 steps with batch size 8

| | |
|-------------------|:-------------------------:|
| | red circle with blue background |
| ![conditioning image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_1.png) | ![red circle with blue background](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/red_circle_with_blue_background_6000_steps.png) |
| | cyan circle with brown floral background |
| ![conditioning image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/conditioning_image_2.png) | ![cyan circle with brown floral background](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/controlnet_training/cyan_circle_with_brown_floral_background_6000_steps.png) |

## Training on a 16 GB GPU

Optimizations:
- Gradient checkpointing
- bitsandbytes' 8-bit optimizer

[bitsandbytes install instructions](https://github.com/TimDettmers/bitsandbytes#requirements--installation).

```bash
export MODEL_DIR="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path to save model"

accelerate launch train_controlnet.py \
 --pretrained_model_name_or_path=$MODEL_DIR \
 --output_dir=$OUTPUT_DIR \
 --dataset_name=fusing/fill50k \
 --resolution=512 \
 --learning_rate=1e-5 \
 --validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
 --validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
 --train_batch_size=1 \
 --gradient_accumulation_steps=4 \
 --gradient_checkpointing \
 --use_8bit_adam
```

## Training on a 12 GB GPU

Optimizations:
- Gradient checkpointing
- bitsandbytes' 8-bit optimizer
- xformers
- set grads to none

```bash
export MODEL_DIR="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path to save model"

accelerate launch train_controlnet.py \
 --pretrained_model_name_or_path=$MODEL_DIR \
 --output_dir=$OUTPUT_DIR \
 --dataset_name=fusing/fill50k \
 --resolution=512 \
 --learning_rate=1e-5 \
 --validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
 --validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
 --train_batch_size=1 \
 --gradient_accumulation_steps=4 \
 --gradient_checkpointing \
 --use_8bit_adam \
 --enable_xformers_memory_efficient_attention \
 --set_grads_to_none
```

When using `enable_xformers_memory_efficient_attention`, please make sure `xformers` is installed by running `pip install xformers`.
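
`--set_grads_to_none` corresponds to PyTorch's option of clearing gradients by setting them to `None` instead of zeroing the tensors in place, which frees the gradient memory between steps. A minimal sketch of the underlying PyTorch call, not the training script's exact code:

```py
import torch

model = torch.nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

loss = model(torch.randn(1, 8)).sum()
loss.backward()
optimizer.step()

# Drop the gradient tensors entirely instead of zeroing them in place;
# they are reallocated on the next backward pass, saving memory in between.
optimizer.zero_grad(set_to_none=True)
```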

## Training on an 8 GB GPU

We have not exhaustively tested DeepSpeed support for ControlNet. While the configuration below does
save memory, we have not confirmed that it trains successfully. You will very likely
have to make changes to the config to get a successful training run.

Optimizations:
- Gradient checkpointing
- xformers
- set grads to none
- DeepSpeed stage 2 with parameter and optimizer offloading
- fp16 mixed precision

[DeepSpeed](https://www.deepspeed.ai/) can offload tensors from VRAM to either
CPU or NVMe. This requires significantly more system RAM (about 25 GB).

Use `accelerate config` to enable DeepSpeed stage 2.

The relevant parts of the resulting accelerate config file are:

```yaml
compute_environment: LOCAL_MACHINE
deepspeed_config:
  gradient_accumulation_steps: 4
  offload_optimizer_device: cpu
  offload_param_device: cpu
  zero3_init_flag: false
  zero_stage: 2
distributed_type: DEEPSPEED
```

See [documentation](https://huggingface.co/docs/accelerate/usage_guides/deepspeed) for more DeepSpeed configuration options.

Changing the default Adam optimizer to DeepSpeed's Adam,
`deepspeed.ops.adam.DeepSpeedCPUAdam`, gives a substantial speedup, but it
requires a CUDA toolchain with the same version as PyTorch. The 8-bit optimizer
does not seem to be compatible with DeepSpeed at the moment.
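
A hypothetical sketch of what that optimizer swap could look like; the linear layer below is a stand-in for the ControlNet being trained, and the hyperparameter values mirror common AdamW defaults rather than anything taken from the training script:

```py
import torch
from deepspeed.ops.adam import DeepSpeedCPUAdam

# Stand-in model; in train_controlnet.py you would pass the actual
# ControlNet's parameters instead (an assumption, not the script's code).
model = torch.nn.Linear(16, 16)

optimizer = DeepSpeedCPUAdam(
    model.parameters(),
    lr=1e-5,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=1e-2,
)
```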

```bash
export MODEL_DIR="runwayml/stable-diffusion-v1-5"
export OUTPUT_DIR="path to save model"

accelerate launch train_controlnet.py \
 --pretrained_model_name_or_path=$MODEL_DIR \
 --output_dir=$OUTPUT_DIR \
 --dataset_name=fusing/fill50k \
 --resolution=512 \
 --validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
 --validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
 --train_batch_size=1 \
 --gradient_accumulation_steps=4 \
 --gradient_checkpointing \
 --enable_xformers_memory_efficient_attention \
 --set_grads_to_none \
 --mixed_precision fp16
```

## Performing inference with the trained ControlNet

The newly trained ControlNet can be used with the standard `StableDiffusionControlNetPipeline`, just like the pretrained ControlNet checkpoints.
Set `base_model_path` and `controlnet_path` to the values `--pretrained_model_name_or_path` and
`--output_dir` were respectively set to in the training script.

```py
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, UniPCMultistepScheduler
from diffusers.utils import load_image
import torch

base_model_path = "path to model"
controlnet_path = "path to controlnet"

controlnet = ControlNetModel.from_pretrained(controlnet_path, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    base_model_path, controlnet=controlnet, torch_dtype=torch.float16
)

# speed up diffusion process with faster scheduler and memory optimization
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
# remove following line if xformers is not installed
pipe.enable_xformers_memory_efficient_attention()

pipe.enable_model_cpu_offload()

control_image = load_image("./conditioning_image_1.png")
prompt = "pale golden rod circle with old lace background"

# generate image
generator = torch.manual_seed(0)
image = pipe(
    prompt, num_inference_steps=20, generator=generator, image=control_image
).images[0]

image.save("./output.png")
```
6 changes: 6 additions & 0 deletions examples/controlnet/requirements.txt
@@ -0,0 +1,6 @@
accelerate
torchvision
transformers>=4.25.1
ftfy
tensorboard
datasets