Add better compatibility with diffusers-interpret (and possibly other use cases!) #506

Conversation
keturn left a comment:
Hi João! I was just introduced to diffusers-interpret yesterday via the Discord! I have all the same questions, so I love seeing this sort of thing.
I have no authority to merge anything here, but I've taken the liberty of leaving a few notes.
```diff
 if not return_dict:
-    return (image, has_nsfw_concept)
+    return (image, has_nsfw_concept, all_latents)
```
I think if `PipelineOutput` classes are the way forward and the tuple return format here is mainly for backwards compatibility, we should leave it the same size it was (a pair) and not worry about adding new features to it.
Since this method had a `@torch.no_grad` decorator, I don't think this tuple is there for backwards compatibility 🤔
But looking at the `transformers` package, it seems that when `return_dict_in_generate=False`, options like `output_scores`/`output_attentions` don't matter, so it makes sense to remove `latents` from the tuple as you mention :)
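For context, here is a minimal sketch of the `return_dict` convention under discussion. The output field names follow the diffs in this PR; the `finish` helper is purely illustrative and not the actual diffusers implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

import numpy as np
import PIL.Image
import torch


@dataclass
class StableDiffusionPipelineOutput:
    images: Union[List[PIL.Image.Image], np.ndarray]
    nsfw_content_detected: List[bool]
    latents: Optional[List[torch.Tensor]] = None  # proposed optional field


def finish(image, has_nsfw_concept, all_latents, return_dict=True):
    if not return_dict:
        # legacy tuple stays a pair, as suggested above
        return (image, has_nsfw_concept)
    return StableDiffusionPipelineOutput(
        images=image, nsfw_content_detected=has_nsfw_concept, latents=all_latents
    )
```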
```diff
 images: Union[List[PIL.Image.Image], np.ndarray]
 nsfw_content_detected: List[bool]
+latents: Optional[List[torch.Tensor]] = None
```
Fine with me! What do you think @pcuenca @anton-l @patil-suraj?
fine with me as well.
```diff
 Args:
-    prompt (`str` or `List[str]`):
+    prompt (`str`, `List[str]` or `torch.Tensor`):
         The prompt or prompts to guide the image generation.
```
Ufff, I don't think the input prompt should ever be a tensor; that's confusing and opens the door to hacky code. Can't we just work with the `latents` inputs?
What if we use `inputs_embeds`, as some methods in `transformers` do?
Agree with Patrick.
The problem is that we can pass an image to `StableDiffusionImg2ImgPipeline.__call__` as a tensor, but we can't pass a text...
Would an extra `inputs_embeds` or `prompt_embeds` argument work as an alternative?
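To illustrate that alternative, here is a sketch of what a separate embeddings argument could look like. The `prompt_embeds` name and the helper below are hypothetical, not an existing diffusers API in this PR.

```python
import torch


def get_text_embeddings(pipe, prompt: str = None, prompt_embeds: torch.Tensor = None):
    # Exactly one of prompt / prompt_embeds should be given; the pipeline
    # would skip the text encoder when embeddings are passed in directly.
    if (prompt is None) == (prompt_embeds is None):
        raise ValueError("Pass exactly one of `prompt` or `prompt_embeds`.")
    if prompt_embeds is None:
        tokens = pipe.tokenizer(prompt, return_tensors="pt")
        prompt_embeds = pipe.text_encoder(tokens.input_ids)[0]
    return prompt_embeds
```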
```diff
     batch_size = 1
 elif isinstance(prompt, list):
     batch_size = len(prompt)
+elif torch.is_tensor(prompt):
```
I don't think `prompt` should ever be a tensor.
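For reference, here is the dispatch from the hunk above written out as a self-contained sketch; the tensor branch assumes embeddings shaped `[batch, seq_len, hidden_dim]`, which is an assumption on my part, not something stated in the PR.

```python
import torch


def infer_batch_size(prompt) -> int:
    if isinstance(prompt, str):
        return 1
    elif isinstance(prompt, list):
        return len(prompt)
    elif torch.is_tensor(prompt):
        # assumed layout: [batch, seq_len, hidden_dim]
        return prompt.shape[0]
    raise ValueError(f"`prompt` has unsupported type {type(prompt)}")
```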
```diff
+if output_latents:
+    # save latents from all diffusion steps
+    all_latents.append(latents)
```
Fine with this! I think this makes a lot of sense.
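A minimal sketch of how an `output_latents` flag could collect intermediate latents inside the denoising loop. The loop body is heavily simplified (no guidance, no latent scaling) and assumes `scheduler.set_timesteps(...)` was already called; `.sample` and `.prev_sample` follow the diffusers UNet/scheduler output conventions.

```python
def denoise(unet, scheduler, latents, text_embeddings, output_latents=False):
    all_latents = [latents] if output_latents else None
    for t in scheduler.timesteps:
        noise_pred = unet(latents, t, encoder_hidden_states=text_embeddings).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample
        if output_latents:
            # save latents from all diffusion steps
            all_latents.append(latents)
    return latents, all_latents
```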
```diff
 self.to(device)

+# enable/disable grad
+was_grad_enabled = torch.is_grad_enabled()
```
This will also be a bit difficult for me to accept. It's a) a bit hacky to me, and b) pipelines by definition should only be used for inference. I assume the gradients are needed for analysis and the idea is not to do training?
I'm not 100% sure whether there are enough use cases to warrant allowing gradient flow here; on the other hand, it also shouldn't really hurt if we leave the default at False. IMO, working with function decorators and `enable_grad` + `disable_grad` functions is the way to go here instead, though.
What do you think @patil-suraj @anton-l @pcuenca?
Same as Patrick, not really in favor of this.

> IMO, working with function decorators and `enable_grad` + `disable_grad` functions is the way to go here instead, though.

+1
> IMO, working with function decorators and `enable_grad` + `disable_grad` functions is the way to go here instead, though.

By that, do you mean keeping the `@torch.no_grad` decorator on `__call__` and putting the whole method body under `with torch.enable_grad() if enable_grad else nullcontext():`?
Not sure what you meant.
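For what it's worth, a small sketch of how that decorator/context-manager combination would behave; the `enable_grad` argument is this PR's proposal, not an existing pipeline parameter. `torch.enable_grad()` does re-enable gradients inside an outer `no_grad`.

```python
from contextlib import nullcontext

import torch


class SketchPipeline:
    @torch.no_grad()
    def __call__(self, enable_grad: bool = False) -> bool:
        # nullcontext() keeps the no_grad default from the decorator;
        # torch.enable_grad() overrides it for the whole block.
        context = torch.enable_grad() if enable_grad else nullcontext()
        with context:
            x = torch.zeros(1, requires_grad=True)
            y = (x * 2).sum()
            return y.requires_grad  # True only when enable_grad=True


assert SketchPipeline()(enable_grad=False) is False
assert SketchPipeline()(enable_grad=True) is True
```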
patrickvonplaten left a comment:
Hey @JoaoLages,
Super cool interpret library, btw! I like the idea of returning the latents and could also be convinced to allow gradient computation (even though I think the use case is too niche for now; I'd only be pro if the community really wants this feature. cc @hysts @pcuenca @patil-suraj @anton-l, wdyt?). I don't think we should allow the prompt to be a `torch.Tensor`.
patil-suraj left a comment:
Thanks a lot for the PR @JoaoLages!
I have pretty much the same comments as Patrick.
- We could definitely return intermediate latents.
- Gradient checkpointing will be supported soon; it should not be included in a pipeline like this. Pipelines are intended for inference only, so it's best to avoid training-related logic here.
- Enabling gradients: we could add this if the community is really interested in it. Could you please open an issue for this?
🚀

There you go: #529

Closing this PR as it makes too many changes at once -> happy to continue the discussion in the individual PRs that were opened :-)
Hi there!
I love this package ❤️
I'm the author of diffusers-interpret, and along the way I found these features would be very useful to add to this main package:
- Calling `DiffusionPipeline.__call__` while calculating gradients;
- An `output_latents` flag (similar to `output_scores`/`output_attentions`/etc. from `transformers`) that adds a `latents` attribute to the output;
- ~~Deactivating the safety checker~~ (removed this option in 12ca969);
- Passing `text_embeddings` directly instead of the text string (possible in `transformers` too).

That's about it 😄
To start this PR, I made the changes only for the `StableDiffusionPipeline` class, but I can port them to the other pipelines if you agree with them.
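Taken together, a call that exercises the options proposed here might look like the sketch below. This is a hypothetical usage example: the `output_latents` flag and the `latents` output field follow this PR's proposal, not a guaranteed diffusers API.

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

output = pipe(
    "a photograph of an astronaut riding a horse",
    output_latents=True,  # proposed flag: collect latents from every step
)
image = output.images[0]
all_latents = output.latents  # proposed optional field, one tensor per step
```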