-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Description
From keturn in #1583 (comment)
@damian0815 I saw you pushed a fix for the cross-attention crash. Confirmed that it works here, thanks!
Unfortunately it only works if both xformers_memory_efficient_attention and attention_slicing are disabled. If either is enabled, it doesn't crash, but it silently fails to do its thing and just results in rendering the replacement prompt.
How would you like to handle this? Should setup_cross_attention_control disable those things? Or should we move on to using the 0.12.dev diffusers API first, and figure out something based on that? I'm looking forward to that in any case so we can get rid of the monkeypatch.
From damian in #1583 (comment)
@keturn so the monkey-patched
InvokeAIDiffusersCrossAttentiondoes its own slicing internally, using a technique with better performance than the diffusersattention_slicingfunctionality. if attention_slicing gets turned on and the user wants to do cross-attention control, maybe they should get a warning?as for the
xformers_memory_efficient_attention, yeah it's unsupported for now. this comment links to an xformers-compatible implementation that the author claims would be easy to wrap into aCrossAttentionProcessor- for reference the diffusers update to allow customCrossAttentionProcessors (huggingface/diffusers@4125756) was merged to diffusersmainafter the v0.11 tag, so it won't appear until v0.12. my thinking here is, it would be prudent to get the dev/diffusers branch into main before doing extra work enabling xformers - since xformers is currently unsupported in main, it's an extra concern for this PR that ought to be shuffled into the future.fwiw patrick actually posted what looks to be a better (albeit pseudocode) implementation of cross-attention control than mine in the PR description.
from keturn in #1583 (comment)
Hmm, okay, I can respect a "current functionality before optimizations and new stuff" argument against scope creep. But with some concerns:
- for several versions of diffusers, it had xformers enabled by default when available. Having swap silently not-work in those cases is bad. I'd say at a minimum we need to:
- replace the current "if available: enable" code in our initialization code to always explicitly disable the incompatible features.
- some kind of assertion that can detect whether the right thing was called. i.e. if it goes to apply a cross-attention control but no attention was ever wrangled, there's a problem we need to detect at some point. Whether that's an integration test or a runtime assertion.
- "monkeypatch third-party classes" and "use a not-yet-released API" are both bad options for something we want to merge to main. Between the two, I think going with the not-yet-released API is better in this case. It won't be unreleased for long, and since it is being added specifically for our needs here, it'd be good to try out and see it needs any revisions.
Are you up for giving that 0.12.dev API a try? Or would you like me to take a crack at trying to translate the pseudo-code you mentioned to something that'll work with what we've got here?