[enhancement]: OOM error during VAE decode #2672

@psychedelicious

Description

Is there an existing issue for this?

  • I have searched the existing issues

OS

Linux

GPU

cuda

VRAM

24GB

What happened?

OOM error during VAE decode. VRAM usage skyrockets from ~15.5GB to ~40GB during decode for a 3072x3072 image.

Can we mitigate this somehow?

I tried calling self.enable_vae_slicing() in the constructor for StableDiffusionGeneratorPipeline, but the numbers stayed the same.
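(One possible explanation: in diffusers, VAE slicing splits the work along the batch dimension, so for a single large image it would not be expected to change peak memory. Spatial tiling, where available, bounds the intermediate activation size instead. Below is a minimal standalone sketch of the idea; the runwayml/stable-diffusion-v1-5 weights are an assumption, and AutoencoderKL.enable_tiling() only exists in some diffusers releases, hence the hasattr guard.)

    # Standalone sketch: decode a large latent with slicing/tiling enabled.
    # Model id, dtype, and the availability of enable_tiling() are assumptions.
    import torch
    from diffusers import AutoencoderKL

    vae = AutoencoderKL.from_pretrained(
        "runwayml/stable-diffusion-v1-5", subfolder="vae", torch_dtype=torch.float16
    ).to("cuda")

    # Slicing splits the *batch* dimension, so it only helps when decoding
    # several latents at once -- not a single 3072x3072 image.
    vae.enable_slicing()

    # Tiling (if the installed diffusers release supports it) decodes in
    # overlapping spatial tiles, bounding the size of intermediate activations.
    if hasattr(vae, "enable_tiling"):
        vae.enable_tiling()

    # 3072x3072 pixels -> 384x384 latent (SD downsamples by 8), 4 channels.
    # Random latents are enough to exercise the decoder's memory behaviour.
    latents = torch.randn(1, 4, 384, 384, device="cuda", dtype=torch.float16)
    with torch.no_grad():
        image = vae.decode(latents).sample  # shape (1, 3, 3072, 3072)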

>> Image Generation Parameters:

{'prompt': 'pizza', 'iterations': 3, 'steps': 3, 'cfg_scale': 7.5, 'threshold': 0, 'perlin': 0, 'height': 3072, 'width': 3072, 'sampler_name': 'k_lms', 'seed': 3471489041, 'progress_images': False, 'progress_latents': True, 'save_intermediates': 5, 'generation_mode': 'txt2img', 'init_mask': '...', 'hires_fix': False, 'seamless': False, 'variation_amount': 0}

>> ESRGAN Parameters: False
>> Facetool Parameters: False
100%|█████████████████████████████████████████████████████████| 3/3 [00:16<00:00,  5.58s/it]
Generating:   0%|                                                     | 0/3 [00:19<?, ?it/s]
Traceback (most recent call last):
  File "/home/bat/Documents/Code/InvokeAI/ldm/generate.py", line 517, in prompt2image
    results = generator.generate(
  File "/home/bat/Documents/Code/InvokeAI/ldm/invoke/generator/base.py", line 112, in generate
    image = make_image(x_T)
  File "/home/bat/Documents/Code/InvokeAI/ldm/invoke/generator/txt2img.py", line 40, in make_image
    pipeline_output = pipeline.image_from_embeddings(
  File "/home/bat/Documents/Code/InvokeAI/ldm/invoke/generator/diffusers_pipeline.py", line 365, in image_from_embeddings
    image = self.decode_latents(result_latents)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 370, in decode_latents
    image = self.vae.decode(latents).sample
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/autoencoder_kl.py", line 144, in decode
    decoded = self._decode(z).sample
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/autoencoder_kl.py", line 116, in _decode
    dec = self.decoder(z)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/vae.py", line 188, in forward
    sample = up_block(sample)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py", line 1718, in forward
    hidden_states = upsampler(hidden_states)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/resnet.py", line 139, in forward
    hidden_states = self.conv(hidden_states)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/bat/invokeai/.venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 40.50 GiB (GPU 0; 23.65 GiB total capacity; 14.41 GiB already allocated; 5.59 GiB free; 15.71 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

>> Could not generate image.

>> Usage stats:
>>   0 image(s) generated in 19.64s
>>   Max VRAM used for this generation: 15.47G. Current VRAM utilization: 10.64G
>>   Max VRAM used since script start:  15.47G
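(Regarding the allocator hint at the end of the traceback: since the failing request is a single 40.50 GiB allocation on a 23.65 GiB card, fragmentation is unlikely to be the root cause, but the documented PYTORCH_CUDA_ALLOC_CONF knob can still be tried. A sketch, with the 512 MiB value chosen arbitrarily:)

    # Allocator tuning per the traceback's hint; the value is an assumption,
    # and the variable must be set before torch initializes CUDA.
    import os
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

    import torch  # imported after setting the variable so it takes effect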

Screenshots

No response

Additional context

No response

Contact Details

No response
