
Differential learning rates and parameter groups #2005


Description

@lgvaz

🚀 Feature

Implement a class (possibly a ModuleMixin) that makes it easy for the user to define parameter groups (PGs) for their module.
Once we have the PGs, we can start handling differential learning rates semi-automatically (leaving it to the user to define their values).
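A minimal sketch of what such a mixin could look like. `ParamGroupMixin`, `param_groups`, and the backbone/head split are assumptions for illustration, not an existing API:

```python
import torch.nn as nn


class ParamGroupMixin:
    """Hypothetical mixin: lets a module expose named parameter groups."""

    def param_groups(self):
        # Default: a single group containing all parameters.
        return [{"params": list(self.parameters())}]


class TransferNet(ParamGroupMixin, nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(10, 10)  # pretrained layers
        self.head = nn.Linear(10, 2)       # newly added layers

    def param_groups(self):
        # Override: pretrained layers and the new head as separate groups,
        # so each can later receive its own learning rate.
        return [
            {"params": list(self.backbone.parameters())},
            {"params": list(self.head.parameters())},
        ]
```

The returned list is already in the shape `torch.optim` optimizers accept, so it can be passed straight to an optimizer constructor.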

Motivation

Improve the transfer learning workflow

Pitch

The use of differential learning rates is essential when working with transfer learning.

Additional context

Once implemented, differential learning rates should look something like this:

[Image: per-parameter-group learning-rate schedules over training steps]

In the plot above, each parameter group follows the OneCycleLR schedule but with a different max_lr: earlier layers train with a lower learning rate than the newly added ones.

Labels

feature (Is an improvement or enhancement), help wanted (Open to be worked on), won't fix (This will not be worked on)
