Make swa_lrs as required inside SWACallback #11822

@rohitgr7

Proposed Enhancement

Currently, when swa_lrs is not set here:
https://github.com/PyTorchLightning/pytorch-lightning/blob/e3820da28a0cd0982dd1c65d7da1a0e2180454c1/pytorch_lightning/callbacks/stochastic_weight_avg.py#L34-L38

we initialize it to the optimizer lrs here:
https://github.com/PyTorchLightning/pytorch-lightning/blob/e3820da28a0cd0982dd1c65d7da1a0e2180454c1/pytorch_lightning/callbacks/stochastic_weight_avg.py#L167-L168

But when the SWALR scheduler later updates the learning rate, the values never change: swa_lr is then equal to the optimizer's lr, so the interpolation factor alpha here cancels out.

https://github.com/pytorch/pytorch/blob/bf233aa049c4b479fd6cb19f9b8672bb2d42b0e2/torch/optim/swa_utils.py#L281-L286
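To illustrate the cancellation, here is a minimal pure-Python sketch of the kind of interpolation SWALR performs, `swa_lr * alpha + lr * (1 - alpha)`. The helper names (`cosine_anneal`, `annealed_lr`, `schedule`) are made up for this example and are not part of PyTorch or Lightning. The sketch also simplifies the real scheduler by interpolating from a fixed starting lr instead of recovering it at every step.

```python
import math


def cosine_anneal(t: float) -> float:
    # Cosine-shaped interpolation factor: 0.0 at t=0, 1.0 at t=1.
    return (1 - math.cos(math.pi * t)) / 2


def annealed_lr(start_lr: float, swa_lr: float, alpha: float) -> float:
    # Blend the starting lr with the SWA target lr.
    # If start_lr == swa_lr, this is swa_lr * (alpha + 1 - alpha) == swa_lr,
    # so alpha has no effect on the result.
    return swa_lr * alpha + start_lr * (1 - alpha)


def schedule(start_lr: float, swa_lr: float, anneal_epochs: int) -> list:
    # Hypothetical helper: lr at each epoch of the annealing phase.
    return [
        annealed_lr(start_lr, swa_lr, cosine_anneal(epoch / anneal_epochs))
        for epoch in range(anneal_epochs + 1)
    ]


# swa_lr defaulted to the optimizer lr: the schedule stays flat.
flat = schedule(start_lr=0.1, swa_lr=0.1, anneal_epochs=5)

# swa_lr set explicitly: the lr anneals from 0.1 towards 0.01.
annealed = schedule(start_lr=0.1, swa_lr=0.01, anneal_epochs=5)
```

With the default, every entry of `flat` equals the original lr up to floating-point rounding, which is why the scheduler appears to do nothing.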

Motivation

If we keep it as it is, it will lead to issues like this: #9453

Pitch

Make swa_lrs a required argument of the callback, since it is also a required parameter in SWALR, and do not initialize it with any default.
https://github.com/pytorch/pytorch/blob/bf233aa049c4b479fd6cb19f9b8672bb2d42b0e2/torch/optim/swa_utils.py#L231
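A rough sketch of what the proposed signature could look like. `SWACallbackSketch` is a hypothetical stand-in written for this issue, not the actual Lightning callback, and the positivity check is an assumption rather than something the pitch above requires.

```python
from typing import List, Union


class SWACallbackSketch:
    """Hypothetical stand-in for the SWA callback; only the argument handling is shown."""

    def __init__(self, swa_lrs: Union[float, List[float]]) -> None:
        # No default value: omitting swa_lrs raises a TypeError at construction,
        # mirroring the required swa_lr parameter of SWALR.
        lrs = [swa_lrs] if isinstance(swa_lrs, (int, float)) else list(swa_lrs)
        # Assumed validation: every learning rate must be a positive number.
        if not lrs or any(lr <= 0 for lr in lrs):
            raise ValueError("swa_lrs must be a positive float or a list of positive floats")
        self.swa_lrs = swa_lrs


callback = SWACallbackSketch(swa_lrs=0.05)
```

With this shape, a user who forgets `swa_lrs` gets an immediate error instead of a scheduler that silently leaves the learning rate unchanged.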

Additional context


cc @Borda @justusschock @awaelchli @akihironitta @rohitgr7 @carmocca
