In order to get started, we recommend taking a look at two notebooks:
- The [Training a diffusers model](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/training_example.ipynb) notebook summarizes diffusion model training methods. This notebook takes a step-by-step approach to training your diffusion models on an image dataset, with explanatory graphics.
## Stable Diffusion is fully compatible with `diffusers`!

Stable Diffusion is a text-to-image latent diffusion model created by the researchers and engineers from [CompVis](https://github.com/CompVis), [Stability AI](https://stability.ai/), [LAION](https://laion.ai/) and [RunwayML](https://runwayml.com/). It's trained on 512x512 images from a subset of the [LAION-5B](https://laion.ai/blog/laion-5b/) database. This model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 4GB VRAM.
See the [model card](https://huggingface.co/CompVis/stable-diffusion) for more information.
You need to accept the model license before downloading or using the Stable Diffusion weights. Please visit the [model card](https://huggingface.co/runwayml/stable-diffusion-v1-5), read the license carefully, and tick the checkbox if you agree. You have to be a registered user on the 🤗 Hugging Face Hub, and you'll also need an access token for the code to work. For more information on access tokens, please refer to [this section](https://huggingface.co/docs/hub/security-tokens) of the documentation.
### Text-to-Image generation with Stable Diffusion
Run this command to log in with your HF Hub token if you haven't done so already (you can skip this step if you prefer to run the model locally; follow [this section](#running-the-model-locally) instead):
```bash
huggingface-cli login
```
We recommend using the model in [half-precision (`fp16`)](https://pytorch.org/blog/accelerating-training-on-nvidia-gpus-with-pytorch-automatic-mixed-precision/), as it almost always gives the same results as full precision while being roughly twice as fast and requiring half the GPU RAM. For example:
```python
# make sure you're logged in with `huggingface-cli login`
# (a minimal sketch; the exact pipeline API may differ between diffusers versions)
import torch
from diffusers import StableDiffusionPipeline

# load the pipeline in half precision and move it to the GPU
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
image.save("astronaut_rides_horse.png")
```
If you are limited by TPU memory, please make sure to load the `FlaxStableDiffusionPipeline` in `bfloat16` precision instead of the default `float32` precision as done above. You can do so by telling diffusers to load the weights from the `"bf16"` branch.
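A minimal sketch of what that looks like, assuming the checkpoint publishes Flax weights on a `bf16` branch:

```python
# a sketch: load the Flax pipeline in bfloat16 from the "bf16" branch
import jax.numpy as jnp
from diffusers import FlaxStableDiffusionPipeline

# in Flax, from_pretrained returns both the pipeline and its parameters
pipeline, params = FlaxStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="bf16",
    dtype=jnp.bfloat16,
)
```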
…loaded into the pipelines. More specifically, for each model/component one needs to define the format `<name>: ["<library>", "<class name>"]`. `<name>` is the attribute name given to the loaded instance of `<class name>` which can be found in the library or pipeline folder called `"<library>"`.
[`save_pretrained`](../diffusion_pipeline) accepts a local path, *e.g.* `./stable-diffusion`, under which all models/components of the pipeline will be saved. For each component/model, a folder named after the given attribute name is created inside the local path, *e.g.* `./stable_diffusion/unet`.
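As a sketch (assuming `pipe` is the `StableDiffusionPipeline` loaded above):

```python
# save every component of the pipeline under ./stable-diffusion
pipe.save_pretrained("./stable-diffusion")

# the pipeline can then be re-instantiated from the local folder
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("./stable-diffusion")
```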
In addition, a `model_index.json` file is created at the root of the local path, *e.g.* `./stable_diffusion/model_index.json`, so that the complete pipeline can again be instantiated from the local path.
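An abridged sketch of what such a `model_index.json` might contain for Stable Diffusion (the exact entries depend on the pipeline and the diffusers version):

```json
{
  "_class_name": "StableDiffusionPipeline",
  "scheduler": ["diffusers", "PNDMScheduler"],
  "text_encoder": ["transformers", "CLIPTextModel"],
  "tokenizer": ["transformers", "CLIPTokenizer"],
  "unet": ["diffusers", "UNet2DConditionModel"],
  "vae": ["diffusers", "AutoencoderKL"]
}
```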
…logic including pre-processing, an unrolled diffusion loop, and post-processing:
# make sure you're logged in with `huggingface-cli login`
from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler
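A minimal sketch of how the imported `LMSDiscreteScheduler` might be swapped into the pipeline (the hyperparameters below are assumptions, matching values commonly used with Stable Diffusion):

```python
import torch
from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler

# configure an LMS scheduler (beta values are assumed Stable Diffusion defaults)
lms = LMSDiscreteScheduler(
    beta_start=0.00085,
    beta_end=0.012,
    beta_schedule="scaled_linear",
)

# pass the custom scheduler when loading the pipeline
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    scheduler=lms,
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("a photo of an astronaut riding a horse on mars").images[0]
```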