Description
The diffusers implementation used by #1583 will automatically use xformers by default, if it is installed.
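Since diffusers opts in to xformers based on whether the package is importable, a minimal availability check might look like the following sketch (the helper name `xformers_available` is illustrative, not part of either library):

```python
import importlib.util

def xformers_available() -> bool:
    """Return True if the xformers package can be imported."""
    return importlib.util.find_spec("xformers") is not None

# diffusers picks up xformers automatically when it is present; pipelines also
# expose pipe.enable_xformers_memory_efficient_attention() to opt in explicitly.
print(xformers_available())
```

This is essentially the kind of probe an installer could run to decide whether the optional dependency is already satisfied.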
However, xformers is not pip-installable, so it will be a non-trivial task for our installer to provide it.
Update: xformers 0.0.16 has been released and now publishes installable wheels to PyPI for Linux and Windows!
https://github.com/facebookresearch/xformers#installing-xformers
xformers provides a significant speedup for NVIDIA RTX cards with tensor cores, and probably some memory-efficiency gains on older CUDA cards.
Not sure if it provides anything at all on non-CUDA platforms.
It's possible to build xformers for some specific subset of CUDA architectures. Making one fat build is probably the easiest thing to manage, but it might also make sense to split "7.5 and newer" vs. "older models." I think compute capability 7.5 is the minimum requirement for their flash attention implementation. Hardware support for that began with the NVIDIA RTX 20xx line and the Tesla T4.
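If we did split the builds, the installer would need to decide which one a given GPU should get. A rough sketch of that check, assuming the 7.5 cutoff above is right (the helper name `supports_flash_attention` is made up for illustration):

```python
def supports_flash_attention(capability: tuple[int, int]) -> bool:
    """Assumption: flash attention in xformers needs compute capability 7.5+ (Turing)."""
    return capability >= (7, 5)

# With PyTorch installed, the capability tuple can be queried directly, e.g.:
#   import torch
#   supports_flash_attention(torch.cuda.get_device_capability())
print(supports_flash_attention((7, 5)))  # RTX 20xx / Tesla T4
print(supports_flash_attention((6, 1)))  # GTX 10xx (Pascal)
```

Tuple comparison does the right thing here: (7, 0) < (7, 5) < (8, 6), so a single threshold covers minor versions too.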