
LearningRateFinder creates errors for schedulers in val stage #20355

@DeanLa


Bug description

I have a Lightning module that logs the metric val_loss, and a scheduler that monitors it:

def get_plateau_scheduler(self, optimizer):
    plateau_scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min',...)
    scheduler = {'scheduler': plateau_scheduler, 'interval': 'epoch',
                 'monitor': 'val_loss'}  # <<<<< the monitored metric
    return scheduler


class MyModel(ParentModule):  # It's a Lightning module
    def configure_optimizers(self):
        ret = super().configure_optimizers()  # I get the optimizer here
        ret['lr_scheduler'] = self.get_plateau_scheduler(ret['optimizer'])
        return ret

I also have a list of callbacks, one of which is LearningRateFinder.
I run a fit with trainer = L.Trainer(logger=logger, callbacks=callbacks, **trainer_args). When the LR finder is in the list, I get:

ReduceLROnPlateau conditioned on metric val_loss which is not available. Available metrics are: ['lr-AdamW', 'train_loss', 'train_loss_step', 'time/backward', 'time/train_batch', 'train_loss_epoch', 'time/train_epoch']. Condition can be set using `monitor` key in lr scheduler dict

When I remove the LR finder, training seems to work well.
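
To make the failure mode concrete, here is a rough sketch in plain Python of the kind of check that would produce a message like the one above. It is illustrative only and is not Lightning's actual code: the helper name `check_monitor` and the metric dictionaries are hypothetical. It assumes that, as in the list printed in the error, only training metrics have been logged at the moment the scheduler is stepped.

```python
# Illustrative sketch only: NOT Lightning's implementation.
# `check_monitor` is a hypothetical helper mimicking the kind of check
# that could produce the error message quoted above.

def check_monitor(scheduler_config, available_metrics):
    """Return the monitored value, or raise if the metric was never logged."""
    monitor = scheduler_config.get("monitor")
    if monitor not in available_metrics:
        raise RuntimeError(
            f"ReduceLROnPlateau conditioned on metric {monitor} which is not "
            f"available. Available metrics are: {list(available_metrics)}."
        )
    return available_metrics[monitor]


# Same shape as the scheduler dict in the report (scheduler object omitted).
scheduler_config = {"scheduler": None, "interval": "epoch", "monitor": "val_loss"}

# Hypothetical metric snapshots: one with training metrics only,
# one that also contains the monitored validation metric.
train_only = {"train_loss": 0.7, "train_loss_epoch": 0.7}
with_val = {"train_loss": 0.7, "val_loss": 0.9}

try:
    check_monitor(scheduler_config, train_only)
    failed = False
except RuntimeError:
    failed = True  # val_loss is missing, so the check raises
```

With `with_val`, the same call returns the monitored value instead of raising, which matches the observation that training works once the LR finder is removed.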

What version are you seeing the problem on?

v2.4

How to reproduce the bug

No response

Error messages and logs

# Error messages and logs here please

Environment

Current environment
#- PyTorch Lightning Version (e.g., 2.4.0):
#- PyTorch Version (e.g., 2.4):
#- Python version (e.g., 3.12):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):

More info

No response

Metadata

Labels

bug (Something isn't working), needs triage (Waiting to be triaged by maintainers), ver: 2.4.x
