
Conversation

Contributor

@patil-suraj patil-suraj commented Oct 5, 2022

Currently, to generate multiple images for a prompt, we need to repeat the prompt before calling the pipeline.

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
images = pipe([prompt] * 2).images

Because of this, the text embeddings and unconditional embeddings are computed multiple times for the same prompt.

This PR adds a num_images_per_prompt argument to the Stable Diffusion pipelines to allow returning multiple images per prompt without repeating it. With this, the text embeddings and unconditional embeddings for each prompt are computed once and repeated according to the value of num_images_per_prompt.

prompt = "a photo of an astronaut riding a horse on mars"
images = pipe(prompt, num_images_per_prompt=2).images
assert len(images) == 2

Thanks @NouamaneTazi !


HuggingFaceDocBuilderDev commented Oct 5, 2022

The documentation is not available anymore as the PR was closed or merged.

Member

@pcuenca pcuenca left a comment

Looks good!

We should also do it in the Flax version for faster inference, but we can do that later.

Member

@NouamaneTazi NouamaneTazi left a comment


Amazing PR, thank you for taking care of this 🙏🏼

uncond_embeddings = self.text_encoder(uncond_input.input_ids.to(self.device))[0]

# duplicate unconditional embeddings for each generation per prompt
uncond_embeddings = uncond_embeddings.repeat_interleave(batch_size * num_images_per_prompt, dim=0)
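For intuition, `repeat_interleave(n, dim=0)` duplicates each row `n` times while keeping the original order, so each prompt's embedding serves `n` generated images. A pure-Python sketch of the semantics (the rows here are a toy stand-in for embedding tensors, not the actual pipeline code):

```python
def repeat_interleave_rows(rows, n):
    """Mimic torch.Tensor.repeat_interleave(n, dim=0) on a list of rows:
    each row is duplicated n times, preserving row order."""
    return [row for row in rows for _ in range(n)]

# Two "prompt embeddings", 3 images per prompt -> 6 rows total,
# grouped per prompt: [p0, p0, p0, p1, p1, p1].
embeds = [[1.0, 1.0], [2.0, 2.0]]
repeated = repeat_interleave_rows(embeds, 3)
```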
Member


Last time I checked, I had issues with repeat_interleave and ONNX. We probably just need to require a specific opset version, though.
Can you please double-check, @anton-l?

Contributor Author


Thanks, I will check that. But this PR doesn't modify the ONNX pipeline, so it should be fine.

Contributor

@patrickvonplaten patrickvonplaten left a comment


Nice!


rvorias commented Oct 13, 2022

In the current implementation, num_images_per_prompt is just a batch_size multiplier.

e.g. if you want 9 images, the current implementation multiplies the batch size by 9. Good luck fitting that on a consumer GPU 😁

When I saw this arg, I expected it to run the pipeline sequentially (with the same prompt). A lot of other repos implement it that way.

Extra kudos: when num_images_per_prompt > 1, also allow passing a list of seeds.

@patrickvonplaten
Contributor

We indeed need a good warning/error system here

@rvorias

rvorias commented Oct 14, 2022

We indeed need a good warning/error system here

Hmm, I'm not sure this has anything to do with warnings or errors. I'm just pointing out that, from a user's perspective, batch_size and num_images_per_prompt are the same.

@patrickvonplaten
Contributor

Yeah, but batch_size cannot be passed to the Stable Diffusion pipeline as an input argument, no?

@patil-suraj
Contributor Author

Thanks for the comment @rvorias! It would be a bit complicated to run the pipeline sequentially inside the pipeline itself, and users can do it very easily by calling the pipeline in a loop; that way you, as a user, have full control over it. We'll document this better in the examples and docs :)
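The loop approach with per-image seeds can be sketched roughly as follows. This is illustrative only: `run_one` stands in for a real pipeline call (with diffusers you would pass a `torch.Generator` seeded per image rather than a raw seed argument), and all names here are made up for the example:

```python
def generate_sequentially(run_one, prompt, seeds):
    """Call the pipeline once per seed, keeping peak memory at
    batch size 1 and making each image reproducible from its seed."""
    images = []
    for seed in seeds:
        images.append(run_one(prompt, seed))
    return images

# Stand-in "pipeline" that tags each image with its prompt and seed.
def run_one(prompt, seed):
    return f"{prompt}@{seed}"

images = generate_sequentially(run_one, "an astronaut riding a horse", [0, 1, 2])
```

Each iteration is independent, so a failed or unsatisfying image can be regenerated by rerunning just its seed.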

prathikr pushed a commit to prathikr/diffusers that referenced this pull request Oct 26, 2022
* compute text embeds per prompt

* don't repeat uncond prompts

* repeat separatly

* update image2image

* fix repeat uncond embeds

* adapt inpaint pipeline

* ifx uncond tokens in img2img

* add tests and fix ucond embeds in im2img and inpaint pipe
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023

7 participants