
Transfer learning phases #2006

@lgvaz

Description

🚀 Feature

When doing transfer learning, we need to switch between phases.

Normally, the first phase is to freeze all but the head of the model and train only that.

After a predefined number of epochs, we unfreeze the rest of the model (or a part of it) and start training again (possibly with the help of differential learning rates, described in #2005). We can repeat this phase as many times as we like.
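
As a concrete example, here is a minimal sketch of the two phases in plain PyTorch (the backbone/head split, layer sizes, and learning rates are illustrative assumptions, not an existing API):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64)),  # "backbone"
    nn.Linear(64, 10),                                                # "head"
)
backbone, head = model[0], model[1]

# Phase 1: freeze everything except the head and train only the head
for p in backbone.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

# ... train for a predefined number of epochs ...

# Phase 2: unfreeze the backbone and keep training, with lower lrs for the
# earlier layers (differential learning rates, see #2005)
for p in backbone.parameters():
    p.requires_grad = True
optimizer = torch.optim.Adam([
    {"params": backbone.parameters(), "lr": 5e-6},
    {"params": head.parameters(), "lr": 5e-4},
])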

We should implement a class that handles all of that for us. This includes:

  • Unfreeze part of our model
  • Reset and change the lr_scheduler parameters between phases
  • If LearningRateLogger is being used, register the new lr_scheduler

#2005 will take care of the parameter groups.
This issue will take care of what I call "phase switches".

Proposals

There are a few ways of achieving this:

Logic inside on_epoch_start

def on_epoch_start(self):
    if self.current_epoch == 0:
        self.freeze()
        self.trainer.lr_schedulers = ... # Define new scheduler
        
    if self.current_epoch == N_FREEZE_EPOCHS:
        self.unfreeze() # Or partially unfreeze
        self.trainer.lr_schedulers = ... # Define new scheduler

We can keep adding as many milestones as we want this way, but it's important to note that they all have to be defined beforehand.
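
A slightly fuller sketch of the same idea, with all milestones collected up front in one place. The PHASES dict, the self.classifier attribute, and the scheduler registration format are assumptions for illustration; only freeze(), unfreeze(), and current_epoch come from Lightning:

import pytorch_lightning as pl
from torch.optim.lr_scheduler import OneCycleLR

# All phase boundaries have to be known beforehand
PHASES = {
    0: dict(freeze=True,  max_lr=1e-3, epochs=2),  # train the head only
    2: dict(freeze=False, max_lr=5e-4, epochs=5),  # train the whole model
}

class FineTuneModule(pl.LightningModule):
    def on_epoch_start(self):
        cfg = PHASES.get(self.current_epoch)
        if cfg is None:
            return  # not a phase boundary
        if cfg["freeze"]:
            self.freeze()
            for p in self.classifier.parameters():  # keep the head trainable
                p.requires_grad = True
        else:
            self.unfreeze()
        # Build a fresh scheduler for this phase; the exact registration
        # format depends on the Lightning version
        optimizer = self.trainer.optimizers[0]
        scheduler = OneCycleLR(
            optimizer,
            max_lr=cfg["max_lr"],
            epochs=cfg["epochs"],
            steps_per_epoch=len(self.train_dataloader()),
        )
        self.trainer.lr_schedulers = [
            {"scheduler": scheduler, "interval": "step", "frequency": 1}
        ]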

Multiple calls to Trainer.fit

model.freeze()
trainer.fit_one_cycle(model, n_epochs=2, lr=1e-3, pct_start=0.9)
model.unfreeze()
trainer.fit_one_cycle(model, n_epochs=5, lr=slice(5e-6, 5e-4), pct_start=0.2)

This is exactly the flow in fastai. This way of training models is excellent for iterative training, e.g. in a notebook or a REPL.

fit_one_cycle assumes that we are using the OneCycleLR scheduler, that each call is a continuation of the last, and that we want to reset our schedule at each call.

When we pass a slice to lr, we are asking for an interpolation of values across the trainable layer groups.
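
As an illustration of what the slice could expand to (fastai spaces the values geometrically from the earliest to the latest trainable layer group; the helper below is hypothetical):

import numpy as np

def lrs_from_slice(lr, n_groups):
    # Expand a single lr or a slice(lower, upper) into one lr per layer group
    if isinstance(lr, slice):
        return np.geomspace(lr.start, lr.stop, num=n_groups).tolist()
    return [lr] * n_groups

lrs_from_slice(slice(5e-6, 5e-4), 3)  # ~[5e-06, 5e-05, 5e-04]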

Implement a new scheduler (suggested by @williamFalcon)

The scheduler receives a list of dicts; each dict specifies the duration of the phase and its configuration (which layers to freeze, which lrs to use, ...).

scheduler = FineTuneScheduler([
   {'params': [nn.Sequential(self.c_d1, self.c_d1_bn), self.c_d2], 'action': 'freeze', 'epoch': 0},
   {'params': [self.c_d2], 'action': 'unfreeze', 'epoch': 2},
])

Then we can just pass the scheduler to the Trainer.
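
A minimal sketch of what such a scheduler could look like as a Lightning Callback, handling only the freeze/unfreeze part. The config format follows the snippet above; the hook used and the way modules are toggled are assumptions, not a decided design:

from pytorch_lightning.callbacks import Callback

class FineTuneScheduler(Callback):
    def __init__(self, phases):
        # phases: list of {'params': [modules], 'action': 'freeze'|'unfreeze', 'epoch': int}
        self.phases = phases

    def on_epoch_start(self, trainer, pl_module):
        for phase in self.phases:
            if phase["epoch"] != trainer.current_epoch:
                continue
            requires_grad = phase["action"] == "unfreeze"
            for module in phase["params"]:
                for p in module.parameters():
                    p.requires_grad = requires_grad

Replacing the lr_scheduler at each phase boundary could be handled in the same hook, in the spirit of the on_epoch_start proposal above.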

Notes

In all cases, the flow should be the same for all standard areas (vision, NLP, time series, ...).

The only things we assume are:

  • You want to train a model in multiple phases
  • The phases are a continuation of each other
