[refactor] Making the xformers mem-efficient attention activation recursive #1493
Conversation
        self.attn2._slice_size = slice_size

-   def _set_use_memory_efficient_attention_xformers(self, use_memory_efficient_attention_xformers: bool):
+   def set_use_memory_efficient_attention_xformers(self, use_memory_efficient_attention_xformers: bool):
Called from the outside, so it can be public? Plus it conveys the idea that it's a capability being exposed.
The documentation is not available anymore as the PR was closed or merged.
            feature_extractor=feature_extractor,
        )

    def enable_xformers_memory_efficient_attention(self):
Here and below: this inherits from DiffusionPipeline, so I figured the method could be defined there (with the recursive approach) to remove a lot of code duplication.
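[Editor's note] A minimal sketch of that idea, assuming the pipeline keeps its sub-models (unet, vae, text_encoder, ...) as plain attributes and that those models expose `set_use_memory_efficient_attention_xformers` themselves. The class body below is illustrative, not the exact code added by this PR:

```python
import torch


class DiffusionPipeline:
    # Sketch: defined once on the base class, so every derived pipeline
    # inherits the toggle instead of re-implementing it.
    def set_use_memory_efficient_attention_xformers(self, valid: bool) -> None:
        # Forward the flag to every sub-model registered on the pipeline
        # (unet, vae, text_encoder, ...) that supports it.
        for component in vars(self).values():
            if isinstance(component, torch.nn.Module) and hasattr(
                component, "set_use_memory_efficient_attention_xformers"
            ):
                component.set_use_memory_efficient_attention_xformers(valid)

    def enable_xformers_memory_efficient_attention(self):
        self.set_use_memory_efficient_attention_xformers(True)

    def disable_xformers_memory_efficient_attention(self):
        self.set_use_memory_efficient_attention_xformers(False)
```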
        if hasattr(block, "attentions") and block.attentions is not None:
            block.set_attention_slice(slice_size)

    def set_use_memory_efficient_attention_xformers(self, use_memory_efficient_attention_xformers: bool):
All of these are just trampolines, not needed with the recursive call from the top. An issue with these trampolines is that they're bound to miss some cases (they already do), since they would have to be updated any time a new capability is exposed somewhere in the pipeline.
| """ | ||
| self.set_use_memory_efficient_attention_xformers(False) | ||
|
|
||
| def set_use_memory_efficient_attention_xformers(self, valid: bool) -> None: |
This is the actual single implementation of how to enable mem-efficient attention across the whole model, for all pipelines (it covers superres, outpainting, or text2img, which at times mobilize attention in different places).
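[Editor's note] Roughly, the recursive walk can look like the sketch below: starting from the top-level torch module, descend through `children()` and call the setter on any sub-module that defines it. This is a simplified reconstruction, not the verbatim diff:

```python
import torch


class ModelMixin(torch.nn.Module):
    # Sketch of the single, recursive toggle: reach every sub-module and flip
    # the flag wherever the setter is defined, whatever the model layout is.
    def set_use_memory_efficient_attention_xformers(self, valid: bool) -> None:
        def fn_recursive_set_mem_eff(module: torch.nn.Module) -> None:
            if hasattr(module, "set_use_memory_efficient_attention_xformers"):
                module.set_use_memory_efficient_attention_xformers(valid)
            for child in module.children():
                fn_recursive_set_mem_eff(child)

        # Start from the children so this method does not call itself on `self`.
        for child in self.children():
            fn_recursive_set_mem_eff(child)
```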
    def set_progress_bar_config(self, **kwargs):
        self._progress_bar_config = kwargs

    def enable_xformers_memory_efficient_attention(self):
These enable and disable shorthands are just there because many derived pipelines were using them, so I figured it was cheaper to expose the call here :)
Makes sense to me to make it a method of DiffusionPipeline!
Open for feedback, this is a suggestion of course @patrickvonplaten @kashif
I tested it, and it works fine with this PR.
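[Editor's note] For reference, exercising the top-level toggle looks roughly like this (illustrative checkpoint id and prompt, assuming xformers is installed and a CUDA device is available; not a test from this PR):

```python
from diffusers import StableDiffusionPipeline

# Illustrative checkpoint id
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to("cuda")

# One call flips the flag recursively across the unet (and any other
# sub-model exposing the setter); no per-pipeline trampoline involved.
pipe.enable_xformers_memory_efficient_attention()
image = pipe("an astronaut riding a horse").images[0]

# Turn it back off the same way.
pipe.disable_xformers_memory_efficient_attention()
```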
patrickvonplaten left a comment:
@anton-l @pcuenca @patil-suraj @williamberman what do you think?
PR looks very nice to me! Given that xformers can essentially be used with every attention layer, every unet pretty much has an attention layer, and every pipeline has at least one unet, I think it's a good idea to make it a "global" method by adding it to DiffusionPipeline.
If you check a PR like this one, the changes here make it a lot easier and would remove two thirds of the lines of code.
Really cool PR, makes it so much cleaner now! And I agree with you, keeping it in DiffusionPipeline makes sense!
Thanks a lot for working on this!
[refactor] Making the xformers mem-efficient attention activation recursive (huggingface#1493)

* Moving the mem efficient attention activation to the top + recursive

* black, too bad there's no pre-commit?

Co-authored-by: Benjamin Lefaudeux <[email protected]>
Defining set_use_memory_efficient_attention_xformers in a leaf module is all it takes for it to be picked up (important for some pipelines, like superres, which are not properly covered right now; see for instance simplyfy AttentionBlock #1492).

cc @patrickvonplaten, discussed a couple of days ago. Note that there do not seem to be unit tests covering this part, unless I missed them.
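[Editor's note] To illustrate the point about leaf modules, here is a hypothetical minimal attention block (not code from this PR, and the internal flag name is an assumption): defining the setter is enough for the recursive walk above to reach it, with no per-pipeline plumbing.

```python
import torch


class MyAttentionBlock(torch.nn.Module):
    """Hypothetical leaf module: the setter below is all that is needed for it
    to be picked up by the recursive activation."""

    def __init__(self):
        super().__init__()
        self._use_memory_efficient_attention_xformers = False  # assumed flag name

    def set_use_memory_efficient_attention_xformers(self, use_memory_efficient_attention_xformers: bool):
        # The recursive walk calls this directly, wherever the block sits in
        # the model (text2img, superres, outpainting, ...).
        self._use_memory_efficient_attention_xformers = use_memory_efficient_attention_xformers
```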