Description
Proposed Enhancement
Currently, when `swa_lrs` is not set here:
https://github.com/PyTorchLightning/pytorch-lightning/blob/e3820da28a0cd0982dd1c65d7da1a0e2180454c1/pytorch_lightning/callbacks/stochastic_weight_avg.py#L34-L38
it is initialized from the optimizer's current learning rates here:
https://github.com/PyTorchLightning/pytorch-lightning/blob/e3820da28a0cd0982dd1c65d7da1a0e2180454c1/pytorch_lightning/callbacks/stochastic_weight_avg.py#L167-L168
but then the SWALR scheduler updates never actually change the learning rates: annealing from the current learning rate to an identical `swa_lr` is a no-op, so the annealing factor (alpha) cancels out.
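For illustration, here is a minimal standalone sketch (plain PyTorch, not the Lightning callback itself) of why copying the optimizer's learning rate into `swa_lr` makes the annealing a no-op: SWALR anneals from the current learning rate towards `swa_lr`, so when the two are equal the learning rate never moves.

```python
import torch
from torch.optim.swa_utils import SWALR

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# swa_lr copied from the optimizer's lr, mimicking the callback's current default
swa_scheduler = SWALR(optimizer, swa_lr=0.1, anneal_epochs=5, anneal_strategy="cos")

for epoch in range(5):
    optimizer.step()
    swa_scheduler.step()
    # prints 0.1 every epoch: annealing from 0.1 to 0.1 changes nothing
    print(epoch, optimizer.param_groups[0]["lr"])
```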
Motivation
Keeping the current behavior leads to issues like #9453.
Pitch
Make `swa_lrs` a required parameter of the callback, with no default, since the corresponding `swa_lr` is also a required parameter of SWALR (see the sketch below):
https://github.com/pytorch/pytorch/blob/bf233aa049c4b479fd6cb19f9b8672bb2d42b0e2/torch/optim/swa_utils.py#L231
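Concretely, the change could look something like this hypothetical sketch of the callback signature (not the actual implementation; parameter names other than `swa_lrs` mirror the existing callback and may differ slightly):

```python
from typing import List, Union

from pytorch_lightning.callbacks import Callback


class StochasticWeightAveraging(Callback):  # simplified; other parameters omitted
    def __init__(
        self,
        swa_lrs: Union[float, List[float]],  # now required, no default
        swa_epoch_start: Union[int, float] = 0.8,
        annealing_epochs: int = 10,
        annealing_strategy: str = "cos",
    ) -> None:
        ...
```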
Additional context
If you enjoy Lightning, check out our other projects! ⚡
- Metrics: Machine learning metrics for distributed, scalable PyTorch applications.
- Lite: enables pure PyTorch users to scale their existing code on any kind of device while retaining full control over their own loops and optimization logic.
- Flash: The fastest way to get a Lightning baseline! A collection of tasks for fast prototyping, baselining, fine-tuning, and solving problems with deep learning.
- Bolts: Pretrained SOTA Deep Learning models, callbacks, and more for research and production with PyTorch Lightning and PyTorch.
- Lightning Transformers: Flexible interface for high-performance research using SOTA Transformers leveraging PyTorch Lightning, Transformers, and Hydra.
cc @Borda @justusschock @awaelchli @akihironitta @rohitgr7 @carmocca