
Conversation

@Warvito (Collaborator) commented Dec 16, 2022

Signed-off-by: Walter Hugo Lopez Pinaya <[email protected]>
@Warvito linked an issue Dec 16, 2022 that may be closed by this pull request
@Warvito (Collaborator, Author) commented Dec 16, 2022

Waiting for xformers version 0.0.16: facebookresearch/xformers#533 (comment)

@Warvito (Collaborator, Author) commented Dec 17, 2022

Signed-off-by: Walter Hugo Lopez Pinaya <[email protected]>
@Warvito (Collaborator, Author) commented Dec 23, 2022

On my TITAN RTX, the new attention layers make the 2D DDPM tutorial consume 20 GB of memory, with 33 s per training epoch and 15 s to sample one image. With xformers, it consumes slightly less memory (18 GB), takes 38 s per training epoch, and 10 s to sample one image. For the AutoencoderKL, there was no significant difference with or without xformers. I will still try it on an A100.
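For context, the xformers path being compared here is roughly the following: a minimal sketch (not the exact code in diffusion_model_unet.py), assuming xformers >= 0.0.16 and query/key/value already projected to shape (batch, sequence, channels).

```python
# Illustrative sketch only, not the implementation added in this PR.
import torch
import xformers.ops as xops

# Hypothetical sizes roughly matching a 64x64 feature map with 64 channels.
batch, seq_len, channels = 2, 4096, 64
query = torch.randn(batch, seq_len, channels, device="cuda", dtype=torch.float16)
key = torch.randn_like(query)
value = torch.randn_like(query)

# The fused kernel never materialises the full (seq_len x seq_len) attention
# matrix, which is where the memory saving over standard attention comes from.
out = xops.memory_efficient_attention(query, key, value)
print(out.shape)  # torch.Size([2, 4096, 64])
```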

Signed-off-by: Walter Hugo Lopez Pinaya <[email protected]>
@Warvito (Collaborator, Author) commented Dec 26, 2022

On an A100, the 2D DDPM tutorial takes 15-16 s per training epoch, 19 GB of memory, and 8 s to sample. With xformers, it takes 18-19 s per training epoch, 16.4 GB of memory, and 8 s to generate one sample.
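For anyone wanting to reproduce these measurements, peak memory and epoch time can be read with standard torch.cuda utilities; the model, data, and loop below are only stand-ins for the tutorial's DDPM setup.

```python
# Measurement sketch: the model, data, and training loop are placeholders
# standing in for the 2D DDPM tutorial; only the measuring calls matter here.
import time
import torch

model = torch.nn.Linear(64, 64).cuda()                         # stand-in network
data = [torch.randn(8, 64, device="cuda") for _ in range(10)]  # stand-in batches
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

torch.cuda.reset_peak_memory_stats()
start = time.time()

for batch in data:  # one "epoch"
    optimizer.zero_grad()
    loss = model(batch).pow(2).mean()
    loss.backward()
    optimizer.step()

torch.cuda.synchronize()
print(f"epoch time:  {time.time() - start:.1f} s")
print(f"peak memory: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GB")
```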

…attention-mechanisms

# Conflicts:
#	generative/networks/nets/diffusion_model_unet.py
Signed-off-by: Walter Hugo Lopez Pinaya <[email protected]>
@Warvito marked this pull request as ready for review January 6, 2023 13:59
@Warvito changed the title from "[WIP] Optimise Attention Mechanisms" to "Optimise Attention Mechanisms" Jan 6, 2023
@danieltudosiu (Contributor) left a comment

The pull request is good, but we should try not to create so much duplicate code as in this PR. If time allows, please try to create an attention utils module or something similar and aggregate the reusable methods there.
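One possible shape for such a helper, sketched under the assumption that the duplicated pieces are the xformers/standard branch plus the flatten-attend-reshape around it (file, function names, and signatures below are hypothetical, not part of this PR):

```python
# Hypothetical attention_utils sketch aggregating the logic currently
# duplicated across the diffusion UNet and AutoencoderKL attention blocks.
import torch

try:
    import xformers.ops as xops
    HAS_XFORMERS = True
except ImportError:
    HAS_XFORMERS = False


def attend(query, key, value, use_xformers=False):
    """Attention core for (batch, sequence, channels) tensors."""
    if use_xformers and HAS_XFORMERS:
        return xops.memory_efficient_attention(query, key, value)
    # Standard softmax(QK^T / sqrt(d)) V fallback.
    scale = query.shape[-1] ** -0.5
    weights = torch.softmax(torch.bmm(query, key.transpose(1, 2)) * scale, dim=-1)
    return torch.bmm(weights, value)


def spatial_self_attention(x, to_qkv, to_out, use_xformers=False):
    """Self-attention over a (B, C, H, W) feature map, reusable by both networks."""
    b, c, h, w = x.shape
    seq = x.flatten(2).transpose(1, 2)                # (B, H*W, C)
    query, key, value = to_qkv(seq).chunk(3, dim=-1)  # to_qkv: Linear(C, 3 * C)
    out = to_out(attend(query, key, value, use_xformers))
    return out.transpose(1, 2).reshape(b, c, h, w)
```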

@Warvito merged commit 1b34291 into main Jan 14, 2023
@Warvito deleted the 135-should-we-add-xformers-efficient-memory-attention-mechanisms branch January 14, 2023 14:39
@Warvito mentioned this pull request Jan 23, 2023

Development

Successfully merging this pull request may close these issues.

Should we add xformers efficient memory attention mechanisms?
