Decoupled weight decay #164
Conversation
@PhilJd Welcome, thanks for the PR, and your patience. I'll take a look later this week :-)

You probably need to add this into …

Thanks! I've added the respective classes and functions to …
facaiy left a comment
Thanks for the PR, I'll take another look at the weekend :-)
On `if __name__ == "__main__":`
Do you need it?
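(Context for this review question: TensorFlow test files conventionally end with a `__main__` guard that invokes the test runner, so the file can be executed directly as well as collected by a larger suite. A generic stand-alone sketch of the pattern using the standard library's `unittest`; the class and test names here are hypothetical, and a real TensorFlow test file would call `tf.test.main()` instead:)

```python
import unittest


class OptimizerSmokeTest(unittest.TestCase):
    # Hypothetical placeholder; a real optimizer test file would
    # exercise optimizer behaviour here.
    def test_placeholder(self):
        self.assertEqual(1 + 1, 2)


if __name__ == "__main__":
    # Run the tests only when the file is executed directly, not when it
    # is imported. A plain test file would just call `unittest.main()`;
    # exit=False and an explicit argv merely keep this snippet embeddable.
    unittest.main(argv=["optimizer_test"], exit=False)
```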
Thanks for the comments @facaiy, I've updated the PR!
facaiy left a comment
Nice work! I've left some questions regarding TF 2.0.
@PhilJd Phil, don't forget to run …

@PhilJd Hi, Phil. Could you address all comments? Thank you for the high quality PR, can't wait to merge it :-)
…izer tests, optimizer params are now keywords instead of a dict. Fix code in comments to support tf-2.0, naming errors, line length.
@facaiy Thanks a lot for the comments! :)
By the way, I ran …
facaiy left a comment
very close :-)
factory function.
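(Aside, for readers following along: the factory function discussed here, in the eventual tensorflow_addons API, takes a base optimizer class and returns a variant that additionally applies decoupled weight decay. A stripped-down pure-Python sketch of that class-factory pattern; the names `DecoupledWeightDecay`, `apply_update`, and the `PlainSGD` base are illustrative inventions, not the PR's actual code:)

```python
def extend_with_decoupled_weight_decay(base_cls):
    """Return a subclass of `base_cls` that also applies decoupled weight decay.

    Each step the weight is additionally shrunk by lr * weight_decay * w,
    independently of the base optimizer's gradient-based update.
    """
    class DecoupledWeightDecay(base_cls):
        def __init__(self, weight_decay, **kwargs):
            super().__init__(**kwargs)
            self.weight_decay = weight_decay

        def apply_update(self, w, grad):
            w = super().apply_update(w, grad)  # base optimizer's update
            return w - self.lr * self.weight_decay * w  # decoupled decay

    return DecoupledWeightDecay


class PlainSGD:
    """Minimal stand-in for a base optimizer (illustrative only)."""

    def __init__(self, lr=0.1):
        self.lr = lr

    def apply_update(self, w, grad):
        return w - self.lr * grad


# The factory turns any base optimizer into its "W" variant:
SGDW = extend_with_decoupled_weight_decay(PlainSGD)
```

Applied to an Adam-like base class, the same factory would yield an AdamW-style optimizer, which is the point of exposing it alongside the concrete SGDW/AdamW classes.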
Oops, I forgot to commit …
Looks great, thank you, PhilJ! Could you resolve the merge conflict with the master branch? By the way, you can put your name in the contact info list and the code owners file if you'd like to maintain the module you contributed.

@seanpmorgan @WindQAQ Sean, Tzu-Wei, do you have any concerns about this change?
No, it looks like a very nice PR. It just needs the conflicts resolved and tests passing, IMO.
Conflicts with the master are resolved and I've put my name as maintainer into the README ;)

Interesting, running …
Please ping me when you get it done, and I'll merge it. Thanks for your patience :-)

I've applied the patch from the build server. I wasn't able to find out why the code-format check doesn't complain locally. I've tried the clang-format version the build task uses, …
facaiy left a comment
Thanks!
This PR ports the decoupled weight decay optimizers (SGDW, AdamW) to TensorFlow 2.0, minus all v1 tests, as tensorflow_addons depends on TF 2.0 anyway.
Note that I factored out the testing code that is duplicated in most optimizer tests (also in base TensorFlow) into optimizer_test_base. After this PR has been merged, I'd adapt the LazyAdam optimizer to inherit from OptimizerTestBase. Sorry for the long delay!
Cheers,
Phil :)
Closes #24.
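(Background for readers new to the technique: decoupled weight decay, from Loshchilov & Hutter's "Decoupled Weight Decay Regularization", applies the decay directly to the weights instead of folding an L2 penalty into the gradient. For plain SGD the two are mathematically equivalent, but for adaptive optimizers like Adam they differ, because the L2 term gets rescaled by the adaptive denominator while the decoupled term does not. A minimal scalar sketch of one step; this is my own illustrative code, not the PR's implementation:)

```python
def adam_step(w, grad, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, wd=0.0,
              decoupled=False):
    """One bias-corrected Adam step starting from zero moments.

    With decoupled=False, wd acts as classic L2 regularization (added to
    the gradient). With decoupled=True, wd decays the weight directly,
    as in AdamW.
    """
    if not decoupled:
        grad = grad + wd * w            # L2: decay folded into the gradient
    m = (1 - b1) * grad                 # first moment (m0 = 0)
    v = (1 - b2) * grad * grad          # second moment (v0 = 0)
    m_hat = m / (1 - b1 ** t)           # bias correction
    v_hat = v / (1 - b2 ** t)
    update = lr * m_hat / (v_hat ** 0.5 + eps)
    if decoupled:
        update += lr * wd * w           # AdamW: decay term on the weight itself
    return w - update
```

With these numbers the L2 variant moves the weight by roughly lr·sign(grad) regardless of wd (the decay is swallowed by the adaptive denominator), while the decoupled variant subtracts the full lr·wd·w on top; that is the behavioural difference the paper motivates.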