Is your feature request related to a problem? Please describe.
Textual inversion requires a lot of (V)RAM and could likely benefit from attention slicing.
Describe the solution you'd like
Implement attention slicing for the textual inversion training script.
Describe alternatives you've considered
I tried calling unet.set_attention_slice(unet.config.attention_head_dim // 2), but it did not seem to reduce memory usage much. I am not sure whether I need to enable it in more places, or whether slicing is simply not effective for the training pipeline.
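For reference, a minimal sketch of what I tried, assuming the standard diffusers UNet2DConditionModel API; the checkpoint name and slice size below are just example values, not a tested fix:

```python
from diffusers import UNet2DConditionModel

# Load the UNet used for training (checkpoint name is an example).
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

# Attention slicing splits each attention computation into chunks so that
# only one slice of the attention matrix is materialized at a time,
# trading some speed for lower peak VRAM. Halving the head dim is one
# common choice of slice size. Note: attention_head_dim is an int for
# SD 1.x configs but can be a per-block list in newer configs, in which
# case this line would need adjusting.
slice_size = unet.config.attention_head_dim // 2
unet.set_attention_slice(slice_size)

# ... the textual inversion training loop would run as usual from here,
# optimizing only the new token embedding rather than the UNet weights.
```

For inference pipelines there is also StableDiffusionPipeline.enable_attention_slicing(), but the training script operates on the raw UNet, so set_attention_slice seems to be the relevant entry point.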