-
Notifications
You must be signed in to change notification settings - Fork 6.5k
[SD Img2Img] resize source images to multiple of 8 instead of 32 #1571
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SD Img2Img] resize source images to multiple of 8 instead of 32 #1571
Conversation
…f 8 instead of 32
|
The documentation is not available anymore as the PR was closed or merged. |
|
|
patrickvonplaten
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this makes a lot of sense! Could you maybe add one test here? :-)
|
Sure! Should I make an fp16 version of the test as well, or fp32 only would be enough? |
|
ONNX img2img pipeline is actually failing when I try to use image that is divisible by 8 but not 16 or 32: I'm not really familiar with ONNX but I'll try to investigate. |
|
Actually, it looks like ONNX pipeline can't even work with resolutions that are multiples of 32, only 64 are supported. This code uses the init image that is a multiple of 32 but still throws an error that is similar to the one that I've shared in the previous message: import numpy as np
import onnxruntime as ort
from diffusers import OnnxStableDiffusionImg2ImgPipeline
from diffusers.utils import load_image
gpu_provider = (
"CUDAExecutionProvider",
{
"gpu_mem_limit": "15000000000", # 15GB
"arena_extend_strategy": "kSameAsRequested",
},
)
gpu_options = ort.SessionOptions()
gpu_options.enable_mem_pattern = False
init_image = load_image(
"https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main"
"/img2img/sketch-mountains-input.jpg"
)
init_image = init_image.resize((512 - 32, 512)) # multiple of 32 but not 64
pipe = OnnxStableDiffusionImg2ImgPipeline.from_pretrained(
"CompVis/stable-diffusion-v1-4",
revision="onnx",
provider=gpu_provider,
sess_options=gpu_options,
)
pipe.set_progress_bar_config(disable=None)
prompt = "A fantasy landscape, trending on artstation"
generator = np.random.RandomState(0)
output = pipe(
prompt=prompt,
image=init_image,
strength=0.75,
guidance_scale=7.5,
num_inference_steps=10,
generator=generator,
output_type="np",
)This applies to text2image too. Could it be related to the way that the model is being exported to ONNX format? torch.onnx.export() docs are saying that it doesn't preserve dynamic control flow when being exported from |
|
Could we for now apply this fix only to |
|
Totally fine to not add the changes to ONNX! Just could we please add one test that shows how to do img2img wit ha multiple of 8? |
|
Sure, will do later this week :) |
|
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
|
Gently ping @vvsotnikov , happy to assign the PR to myself if you're busy :-) |
|
@patrickvonplaten sorry for the delay, and thanks for reminding :) I'd be glad to finish this PR today or tomorrow, although it seems like I don't have permissions to reassign this back to myself 🤔 |
# Conflicts: # src/diffusers/pipelines/alt_diffusion/pipeline_alt_diffusion_img2img.py # src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_img2img.py # src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py
|
@patrickvonplaten I've added the tests, however, Also unsure why paint-by-example tests are failing - I haven't changed anything related to this pipeline, and these tests are green when I run them locally. |
|
cc @anton-l for ONNX. Hmm quite surprised that the PaintByExample tests are failing here as those pipelines aren't touched. Fixed ONNX for now by adding " with 8->64", think that's fine |
|
Ok tests are now all passing, not sure what was going on there. Also couldn't reproduce test failures locally -> merging! Thanks a lot for the PR @vvsotnikov ❤️ This should be very useful for the community! |
|
Glad to help! |
…gingface#1571) * [Stable Diffusion Img2Img] resize source images to integer multiple of 8 instead of 32 * [Alt Diffusion Img2Img] resize source images to multiple of 8 instead of 32 * [Img2Img] fix AltDiffusion Img2Img resolution test * [Img2Img] add Stable Diffusion Img2Img resolution test * [Cycle Diffusion] round resolution to multiplies of 8 instead of 32 * [ONNX SD Img2Img] round resolution to multiplies of 64 instead of 32 * [SD Depth2Img] round resolution to multiplies of 8 instead of 32 * [Repaint] round resolution to multiplies of 8 instead of 32 * fix make style Co-authored-by: Patrick von Platen <[email protected]>
Since #505 is merged, the resolution requirements for img2img are relaxed and could be a multiple of 8.
Sample code:
The result before the fix is resized down to 768*480:

The result after the fix preserves the original 768*504 resolution:

This change doesn't break the tests but could hurt some reproducibility as the latents' shape is different now.