[LoRA] Add LoRA training script #1884
Merged
29 commits, all by patrickvonplaten:

- 4eb297e [Lora] first upload
- 67f4e5a add first lora version
- 24993c4 upload
- 943e7f4 more
- e7293d0 first training
- 0baadb1 Merge branch 'main' of https://github.com/huggingface/diffusers into …
- b8e9ce4 up
- f7719e0 correct
- b69f276 improve
- 5d6ee56 finish loaders and inference
- bc15289 up
- aa8ad74 fix
- d334d5a up
- d8f1a6b fix more
- c5cf0a0 up
- 060697e finish more
- 5d5fd77 finish more
- b7478ef up
- 1530b76 up
- 17850de change year
- fb8ce5f revert year change
- cbe6ef7 Change lines
- b4bcc26 Add cloneofsimo as co-author.
- 3d693c0 finish
- f53d962 fix docs
- 5def85c Apply suggestions from code review
- 6f8f610 upload
- d137d62 Merge branch 'add_lora_fine_tuning' of https://github.com/huggingface…
- dd60ad8 finish
@@ -0,0 +1,30 @@

<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Loaders

There are many methods for training adapter neural networks for diffusion models, such as
- [Textual Inversion](./training/text_inversion.mdx)
- [LoRA](https://github.com/cloneofsimo/lora)
- [Hypernetworks](https://arxiv.org/abs/1609.09106)

Such adapter neural networks often consist of only a fraction of the number of weights of the pretrained model and as such are very portable. The Diffusers library offers an easy-to-use API to load such adapter neural networks via the [`loaders.py` module](https://github.com/huggingface/diffusers/blob/main/src/diffusers/loaders.py).

**Note**: This module is still highly experimental and prone to future changes.

## LoaderMixins

### UNet2DConditionLoadersMixin

[[autodoc]] loaders.UNet2DConditionLoadersMixin
@@ -5,6 +5,7 @@ The `train_dreambooth.py` script shows how to implement the training procedure a

## Running locally with PyTorch

### Installing the dependencies

Before running the scripts, make sure to install the library's training dependencies:
@@ -235,6 +236,102 @@ image.save("dog-bucket.png")

You can also perform inference from one of the checkpoints saved during the training process if you used the `--checkpointing_steps` argument. Please refer to [the documentation](https://huggingface.co/docs/diffusers/main/en/training/dreambooth#performing-inference-using-a-saved-checkpoint) to see how to do it.

## Training with Low-Rank Adaptation of Large Language Models (LoRA)

Low-Rank Adaptation of Large Language Models was first introduced by Microsoft in [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685) by *Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen*.

In a nutshell, LoRA adapts pretrained models by adding pairs of rank-decomposition matrices to existing weights and training **only** those newly added weights. This has a couple of advantages:
- The previous pretrained weights are kept frozen, so the model is not prone to [catastrophic forgetting](https://www.pnas.org/doi/10.1073/pnas.1611835114).
- Rank-decomposition matrices have significantly fewer parameters than the original model, which means that trained LoRA weights are easily portable.
- LoRA attention layers allow controlling the extent to which the model is adapted towards new training images via a `scale` parameter.

[cloneofsimo](https://github.com/cloneofsimo) was the first to try out LoRA training for Stable Diffusion in the popular [lora](https://github.com/cloneofsimo/lora) GitHub repository.
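The rank-decomposition update described above can be sketched numerically. This is a toy illustration with made-up dimensions, not the attention-processor implementation used by the training script:

```python
import numpy as np

# Toy sizes for illustration only; real attention projections are larger.
dim, rank = 768, 4

rng = np.random.default_rng(0)
W = rng.standard_normal((dim, dim))          # frozen pretrained weight
A = rng.standard_normal((rank, dim)) * 0.01  # trainable down-projection
B = np.zeros((dim, rank))                    # trainable up-projection, zero-initialized

def effective_weight(scale):
    # `scale` controls how strongly the adaptation is applied:
    # 0.0 recovers the frozen model, 1.0 applies the full LoRA update.
    return W + scale * (B @ A)

# Because B starts at zero, training begins exactly at the pretrained model.
W0 = effective_weight(1.0)
```

Only `A` and `B` (2 × rank × dim values) would be trained; `W` (dim × dim values) stays frozen, which is where both the portability and the robustness to forgetting come from.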
### Training

Let's get started with a simple example. We will re-use the dog example of the [previous section](#dog-toy-example).

First, you need to set up your DreamBooth training example as explained in the [installation section](#Installing-the-dependencies).
Next, let's download the dog dataset. Download images from [here](https://drive.google.com/drive/folders/1BO_dyz-p65qhBRRMRA4TbZ8qW4rB99JZ) and save them in a directory. Make sure to set `INSTANCE_DIR` to the name of that directory further below. This will be our training data.

Now, you can launch the training. Here we will use [Stable Diffusion 1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5).

**___Note: Change the `resolution` to 768 if you are using the [stable-diffusion-2](https://huggingface.co/stabilityai/stable-diffusion-2) 768x768 model.___**

**___Note: It is quite useful to monitor the training progress by regularly generating sample images during training. [wandb](https://docs.wandb.ai/quickstart) is a nice solution to easily see generated images during training. All you need to do is run `pip install wandb` before training and pass `--report_to="wandb"` to automatically log images.___**
```bash
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="path-to-instance-images"
export OUTPUT_DIR="path-to-save-model"
```
For this example we want to directly store the trained LoRA embeddings on the Hub, so
we need to be logged in and add the `--push_to_hub` flag.

```bash
huggingface-cli login
```

Now we can start training!
```bash
accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=50 \
  --seed="0" \
  --push_to_hub
```
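Assuming the script saves an intermediate checkpoint every `--checkpointing_steps` optimizer steps, as the other DreamBooth examples do (an assumption here; check the script), the flags above imply:

```python
max_train_steps = 500
checkpointing_steps = 100

# Number of intermediate checkpoints written during the run above.
num_checkpoints = max_train_steps // checkpointing_steps
print(num_checkpoints)  # 5
```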
**___Note: When using LoRA we can use a much higher learning rate than for vanilla DreamBooth. Here we use *1e-4* instead of the usual *2e-6*.___**

The final LoRA embedding weights have been uploaded to [patrickvonplaten/lora_dreambooth_dog_example](https://huggingface.co/patrickvonplaten/lora_dreambooth_dog_example). **___Note: [The final weights](https://huggingface.co/patrickvonplaten/lora/blob/main/pytorch_attn_procs.bin) are only 3 MB in size, which is orders of magnitude smaller than the original model.___**

The training results are summarized [here](https://api.wandb.ai/report/patrickvonplaten/xm6cd5q5).
You can use the `Step` slider to see how the model learned the features of our subject as training progressed.
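The size difference claimed above is easy to sanity-check with back-of-the-envelope arithmetic; the layer dimensions below are made up for illustration:

```python
# A single hypothetical 768x768 attention projection vs. its rank-4 LoRA pair.
dim, rank = 768, 4

full_params = dim * dim        # frozen projection: 589,824 parameters
lora_params = 2 * dim * rank   # matrices A and B together: 6,144 parameters

ratio = full_params // lora_params
print(ratio)  # the low-rank pair is ~96x smaller for this layer
```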
### Inference

After training, the LoRA weights can be loaded very easily into the original pipeline. First, you need to load the original pipeline:

```python
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler
import torch

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
pipe.to("cuda")
```
Next, we can load the adapter layers into the UNet with the [`load_attn_procs` function](https://huggingface.co/docs/diffusers/api/loaders#diffusers.loaders.UNet2DConditionLoadersMixin.load_attn_procs).

```python
pipe.load_attn_procs("patrickvonplaten/lora")
```

Finally, we can run the model in inference.

```python
image = pipe("A picture of a sks dog in a bucket", num_inference_steps=25).images[0]
```
|
|
||
| ## Training with Flax/JAX | ||
|
|
||
| For faster training on TPUs and GPUs you can leverage the flax training example. Follow the instructions above to get the model and dataset before running the script. | ||
|
|
||