Closed
Labels: bug (Something isn't working), stale (Issues that haven't received updates)
Description
Describe the bug
The AutoencoderKL encoder loaded from runwayml/stable-diffusion-v1-5 outputs NaN for large images. I observe this for image sizes starting at around 1500x1500 with VAE tiling disabled. I tried float32 and float16, with and without xFormers. Is this expected behavior?
I would have liked to use VAE tiling, but it produces tile artifacts, as reported in #1441.
Reproduction
import urllib.request

import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_img2img import preprocess

# Download the test image and upsample it to a size that triggers the issue.
image_file = urllib.request.urlopen("https://upload.wikimedia.org/wikipedia/commons/3/32/A_photograph_of_an_astronaut_riding_a_horse_2022-08-28.png")
init_image = Image.open(image_file)
up_size = (2048, 2048)
upsampled_image = init_image.resize(up_size, Image.Resampling.BILINEAR)

pipe = StableDiffusionImg2ImgPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float32)
pipe.enable_xformers_memory_efficient_attention()
device = "cuda"

with torch.no_grad():
    vae = pipe.vae.to(device)
    vae.disable_tiling()
    # Encode the full image in a single pass (no tiling) and sample the latents.
    preprocessed = preprocess(upsampled_image).to(torch.float32).to(device)
    latents = vae.encode(preprocessed).latent_dist.sample()
    print(latents)
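To pin down the "around 1500x1500" threshold more precisely, the reproduction above could be wrapped in a bisection over the image side length. The sketch below is only an illustration: `first_failing_size` and its `fails` callback are hypothetical names, and it assumes that once NaNs appear at some size they also appear at every larger size, which I have not verified. The step of 8 is chosen on the assumption that image sides should stay multiples of the VAE downsampling factor.

```python
from typing import Callable


def first_failing_size(
    fails: Callable[[int], bool], lo: int, hi: int, step: int = 8
) -> int:
    """Return the smallest multiple of `step` in (lo, hi] for which fails(size) is True.

    Assumes fails(lo) is False, fails(hi) is True, and that failure is
    monotonic in size (an assumption, not something the issue confirms).
    """
    if lo % step or hi % step:
        raise ValueError("lo and hi must be multiples of step")
    if fails(lo) or not fails(hi):
        raise ValueError("need fails(lo) == False and fails(hi) == True")
    while hi - lo > step:
        # Midpoint rounded down to a multiple of `step`; always strictly
        # between lo and hi because the gap is at least 2 * step here.
        mid = (lo + hi) // 2 // step * step
        if fails(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical callback for the reproduction above (not executed here):
#   def fails(size):
#       img = init_image.resize((size, size), Image.Resampling.BILINEAR)
#       x = preprocess(img).to(torch.float32).to(device)
#       with torch.no_grad():
#           z = vae.encode(x).latent_dist.sample()
#       return bool(torch.isnan(z).any())
```

Calling `first_failing_size(fails, 512, 2048)` with such a callback would need roughly eight encodes instead of a linear sweep over every size.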
Logs
tensor([[[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]],
[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]]]], device='cuda:0')
System Info
- diffusers version: 0.14.0
- Platform: Linux-5.15.0-1028-aws-x86_64-with-glibc2.35
- Python version: 3.10.6
- PyTorch version (GPU?): 2.0.0+cu117 (True)
- Huggingface_hub version: 0.13.3
- Transformers version: 4.27.4
- Accelerate version: 0.15.0
- xFormers version: 0.0.18
- Using GPU in script?:
- Using distributed or parallel set-up in script?: