Reduce Stable Diffusion memory usage by keeping unet only on GPU. #540

**Is your feature request related to a problem? Please describe.**
Stable Diffusion is not compute heavy in all of its steps. If we keep the diffusion unet in fp16 on the GPU and everything else on the CPU, we could reduce GPU memory usage to 2.2GB with a not-so-big impact on performance. It should democratize Stable Diffusion even further.

The only other thing that would need to be done is to move the tensors between devices accordingly, but we can use the models' `device` and `dtype` attributes to make everything work.
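To make the idea concrete, here is a minimal sketch of the tensor-moving pattern. It is not the code from the PR: the three `nn.Linear` layers are hypothetical stand-ins for the text encoder, unet, and VAE decoder, and `module_device_dtype` / `to_module` are helper names made up for this example. Plain `nn.Module` objects do not expose `device` and `dtype` directly, so the helper reads them from the first parameter; with the real models the attributes mentioned above would play that role.

```python
import torch
from torch import nn


def module_device_dtype(module: nn.Module):
    """Return the device and dtype of a module's first parameter."""
    param = next(module.parameters())
    return param.device, param.dtype


def to_module(tensor: torch.Tensor, module: nn.Module) -> torch.Tensor:
    """Move a tensor to the device and dtype that the given module uses."""
    device, dtype = module_device_dtype(module)
    return tensor.to(device=device, dtype=dtype)


# Hypothetical stand-ins for the real pipeline components.
text_encoder = nn.Linear(8, 16)  # stays on the CPU in fp32
unet = nn.Linear(16, 16)         # the only component placed on the GPU
vae_decoder = nn.Linear(16, 4)   # stays on the CPU in fp32

# Only the unet goes to the GPU in fp16; without CUDA it stays where it is.
if torch.cuda.is_available():
    unet = unet.to("cuda", dtype=torch.float16)

with torch.no_grad():
    prompt = torch.randn(1, 8)
    cond = text_encoder(prompt)   # computed on the CPU
    latents = torch.randn(1, 16)  # created on the CPU

    # Denoising loop: inputs are cast to whatever the unet uses.
    for _ in range(3):
        latents = unet(to_module(latents, unet) + to_module(cond, unet))

    # Decoding happens back on the CPU, in the decoder's own dtype.
    image = vae_decoder(to_module(latents, vae_decoder))
```

Because every hand-off goes through `to_module`, the same loop runs unchanged whether the unet sits on the GPU in fp16 or on the CPU in fp32.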

**Describe the solution you'd like**
I think what I'm proposing in https://github.com/huggingface/diffusers/pull/537 should be enough.

**Describe alternatives you've considered**
The alternative is to use GPUs for the whole process and pay more for it.


