Closed
Labels
- feature: Is an improvement or enhancement
- help wanted: Open to be worked on
- won't fix: This will not be worked on
Description
🚀 Feature
Implement a class (possibly a ModuleMixin) that makes it easy for users to define parameter groups (PGs) for their module.
Once we have the PGs, we can start handling differential learning rates semi-automatically, leaving it to the user to define their values.
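As a rough illustration of the idea, here is a minimal sketch of what such a mixin could look like. The names `ParameterGroupsMixin`, `param_group_modules`, `parameter_groups` and `TransferModel` are hypothetical and not part of any existing API; the sketch only assumes plain PyTorch, where an optimizer accepts a list of dicts with per-group options.

```python
import torch
from torch import nn


class ParameterGroupsMixin:
    """Hypothetical mixin: a module declares its parameter groups once."""

    def param_group_modules(self):
        # Subclasses return one submodule per parameter group,
        # ordered from the earliest layers to the newest ones.
        raise NotImplementedError

    def parameter_groups(self, lrs):
        # Pair each group with a user-supplied learning rate, in the
        # list-of-dicts format that torch optimizers accept.
        modules = self.param_group_modules()
        if len(lrs) != len(modules):
            raise ValueError(
                f"expected {len(modules)} learning rates, got {len(lrs)}"
            )
        return [
            {"params": list(module.parameters()), "lr": lr}
            for module, lr in zip(modules, lrs)
        ]


class TransferModel(ParameterGroupsMixin, nn.Module):
    """Toy model: a 'pretrained' backbone plus a freshly added head."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.head = nn.Linear(16, 2)

    def forward(self, x):
        return self.head(self.backbone(x))

    def param_group_modules(self):
        return [self.backbone, self.head]


model = TransferModel()
# Lower learning rate for the backbone, higher for the new head.
optimizer = torch.optim.SGD(
    model.parameter_groups(lrs=[1e-4, 1e-2]), lr=1e-3, momentum=0.9
)
```

With this shape, the user only chooses the learning rate values; how the module splits into groups is defined once, next to the model code.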
Motivation
Improve the transfer learning workflow
Pitch
The usage of differential learning rates is essential when working with transfer learning
Additional context
Once implemented, differential learning rates should look something like this:

In the above image, each parameter group follows the OneCycleLR schedule, but with a different max_lr. Earlier layers train with a lower learning rate than the newly added ones.
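The behaviour described here can already be approximated by hand in plain PyTorch, since `torch.optim.lr_scheduler.OneCycleLR` accepts a list of `max_lr` values, one per parameter group. The sketch below is only an illustration under assumed values: the two toy layers, the learning rates and the step count are made up for the example and do not come from this issue.

```python
import torch
from torch import nn

# Two toy "layers": an earlier (pretrained) one and a newly added one.
backbone = nn.Linear(8, 16)
head = nn.Linear(16, 2)

# One parameter group per layer; the scheduler overrides the initial lr.
optimizer = torch.optim.SGD(
    [{"params": backbone.parameters()}, {"params": head.parameters()}],
    lr=1e-3,
    momentum=0.9,
)

max_lrs = [1e-4, 1e-2]  # assumed values: lower peak for the earlier layer
total_steps = 100
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=max_lrs, total_steps=total_steps
)

history = []  # learning rate of each group at every step
for step in range(total_steps):
    history.append([group["lr"] for group in optimizer.param_groups])
    optimizer.step()  # no gradients here, so this is a no-op update
    if step < total_steps - 1:
        scheduler.step()

peak_backbone = max(lrs[0] for lrs in history)
peak_head = max(lrs[1] for lrs in history)
```

Plotting the two columns of `history` against the step index should give curves with the same one-cycle shape but different peaks, which is the picture described above.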