Fix EMA for multi-gpu training in the unconditional example #1930
        action="store_true",
        default=True,
        help="Whether to use Exponential Moving Average for the final model weights.",
    )
Also removing the wrong default, as it was done for the text2img example (issue #1654)
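To illustrate why the default was wrong, here is a minimal sketch. The flag name `--use_ema` is an assumption (the diff above does not show it); the behaviour shown is standard `argparse`. With `action="store_true"` and `default=True`, the value is `True` whether or not the flag is passed, so EMA could never be switched off from the command line.

```python
import argparse

# Flag as in the diff above: store_true combined with default=True.
# NOTE: the name "--use_ema" is assumed for illustration.
old = argparse.ArgumentParser()
old.add_argument("--use_ema", action="store_true", default=True)

# Flag with the explicit default removed: store_true then defaults to False.
new = argparse.ArgumentParser()
new.add_argument("--use_ema", action="store_true")

print(old.parse_args([]).use_ema)             # True, even though the flag is omitted
print(old.parse_args(["--use_ema"]).use_ema)  # True
print(new.parse_args([]).use_ema)             # False
print(new.parse_args(["--use_ema"]).use_ema)  # True
```

This is also why the change is slightly breaking: runs that relied on EMA being on by default now have to pass the flag explicitly.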
small breaking change but think that's fine!
…fix-unconditional-ema
Co-authored-by: Pedro Cuenca <[email protected]>
patrickvonplaten left a comment:
The PR looks very nice to me! I'd also just try to make it more or less 100% backwards compatible (let's try to be role-models when it comes to that in OSS).
Think it's not too hard:
- detect if a model is passed
- we probably have to keep a mapping of parameters to names to be able to use the new logic but also return the model as done previously
- we can relatively quickly deprecate here then I think
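The three points above could be sketched roughly as follows. This is an illustration only, not the class added in this PR: `TinyModel` and `SketchEMA` are hypothetical names, plain floats stand in for tensors, and a real implementation would check for `torch.nn.Module` rather than duck-typing on `named_parameters`.

```python
import warnings


class TinyModel:
    """Hypothetical stand-in for a model; real code would use torch.nn.Module."""

    def __init__(self, weights):
        self._weights = dict(weights)

    def named_parameters(self):
        return list(self._weights.items())

    def parameters(self):
        return list(self._weights.values())


class SketchEMA:
    """Illustrative wrapper that accepts either parameters or a whole model."""

    def __init__(self, params_or_model):
        self.names = None
        if hasattr(params_or_model, "named_parameters"):
            # Old calling convention: a whole model was passed.
            warnings.warn(
                "Passing a model is deprecated; pass its parameters instead.",
                FutureWarning,
            )
            named = params_or_model.named_parameters()
            # Keep the parameter names so averaged values can be mapped back.
            self.names = [name for name, _ in named]
            params = [value for _, value in named]
        else:
            # New calling convention: an iterable of parameters.
            params = list(params_or_model)
        # Shadow copy of the parameter values that the EMA would update.
        self.shadow = list(params)

    def averaged_model(self):
        """Return a model built from the shadow values (old behaviour)."""
        if self.names is None:
            raise ValueError("No model was passed, so there are no names to map back to.")
        return TinyModel(zip(self.names, self.shadow))
```

Callers using the old convention keep working and see a `FutureWarning`, while new callers pass parameters directly and never touch the name mapping.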
The failing tests are unrelated.
pcuenca left a comment:
Looks great! Just pointed out a couple nits.
Co-authored-by: Pedro Cuenca <[email protected]>
Good to merge for me - thanks @patil-suraj!
…ace#1930)
* improve EMA
* style
* one EMA model
* quality
* fix tests
* fix test
* Apply suggestions from code review
  Co-authored-by: Pedro Cuenca <[email protected]>
* re organise the unconditional script
* backwards compatibility
* default to init values for some args
* fix ort script
* issubclass => isinstance
* update state_dict
* docstr
* doc
* Apply suggestions from code review
  Co-authored-by: Pedro Cuenca <[email protected]>
* use .to if device is passed
* deprecate device
* make flake happy
* fix typo
Co-authored-by: patil-suraj <[email protected]>
Co-authored-by: Pedro Cuenca <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Now the unconditional EMA wrapper mimics the `EMAModel` from `train_text_to_image.py`. This isn't a full copy because the unconditional example uses a different decay schedule.

Fixes broken training on multiple GPUs: #1772 #1895
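For readers unfamiliar with what a decay schedule changes, here is a minimal sketch of an EMA update driven by a warmup-style decay. The formula and the constants (`inv_gamma`, `power`, `max_decay`) are assumptions chosen for illustration and are not taken from this PR; plain floats stand in for tensors.

```python
def warmup_decay(step, inv_gamma=1.0, power=2 / 3, max_decay=0.9999):
    """Decay that starts at 0 and rises towards max_decay as training proceeds."""
    if step <= 0:
        return 0.0
    value = 1.0 - (1.0 + step / inv_gamma) ** -power
    return min(value, max_decay)


def ema_step(shadow, params, decay):
    """Return shadow values moved towards params by a factor of (1 - decay)."""
    return [decay * s + (1.0 - decay) * p for s, p in zip(shadow, params)]


shadow = [0.0, 0.0]
for step in range(1, 4):
    params = [1.0, 2.0]  # pretend the optimizer left the weights here
    shadow = ema_step(shadow, params, warmup_decay(step))
```

With a schedule like this the averaged weights track the live weights closely early in training and become progressively smoother later, which is the kind of difference that keeps two EMA helpers from being drop-in copies of each other.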