
ModelCheckpoint misbehaves when no validation #4603

@ferrine


🐛 Bug

ModelCheckpoint misbehaves when there is no validation step. For example, if you ask it to monitor the loss and save every checkpoint with save_top_k=-1, the callback silently saves nothing.

To Reproduce

This line of code is a pure indicator of failure without a validation step.

Expected behavior

The callback should be called after every epoch, even when there is no validation step.

Environment

latest pytorch-lightning

Additional context

I wanted to fine-tune the model a bit and aggregate the last checkpoints to do SWA. I worked around the problem with a modified version of ModelCheckpoint:

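```python
from pytorch_lightning.callbacks.model_checkpoint import ModelCheckpoint as _ModelCheckpoint


class ModelCheckpoint(_ModelCheckpoint):
    def on_epoch_end(self, trainer, pl_module):
        # Run the parent's checkpointing logic at the end of every epoch,
        # whether or not a validation step exists.
        super().on_validation_end(trainer, pl_module)

    def on_validation_end(self, trainer, pl_module):
        # Disabled so a checkpoint is not saved twice when validation does run.
        pass
```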

I can open a PR if there is no concern about this kind of change
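
To illustrate why the workaround above helps, here is a toy model of the hook ordering. It is only a sketch and does not use pytorch-lightning: `ToyTrainer`, `ToyCheckpoint`, `EpochEndCheckpoint` and `saved_epochs` are hypothetical names, and the assumption that the validation hook is simply skipped when there is no validation step is inferred from the behavior described in this issue, not taken from the library's source.

```python
class ToyTrainer:
    """Toy stand-in for a training loop; NOT pytorch-lightning's Trainer."""

    def __init__(self, callbacks, has_validation):
        self.callbacks = callbacks
        self.has_validation = has_validation
        self.current_epoch = 0

    def fit(self, max_epochs):
        for epoch in range(max_epochs):
            self.current_epoch = epoch
            # ... training batches would run here ...
            # Assumption: the validation hook fires only if validation exists.
            if self.has_validation:
                for cb in self.callbacks:
                    cb.on_validation_end(self)
            # The epoch-end hook fires unconditionally.
            for cb in self.callbacks:
                cb.on_epoch_end(self)


class ToyCheckpoint:
    """Hypothetical callback that 'saves' only from on_validation_end."""

    def __init__(self):
        self.saved_epochs = []

    def on_validation_end(self, trainer):
        # Record the epoch instead of writing a file.
        self.saved_epochs.append(trainer.current_epoch)

    def on_epoch_end(self, trainer):
        pass


class EpochEndCheckpoint(ToyCheckpoint):
    """Same rerouting as the workaround: save on epoch end instead."""

    def on_epoch_end(self, trainer):
        super().on_validation_end(trainer)

    def on_validation_end(self, trainer):
        pass


def saved_epochs(callback_cls, has_validation, max_epochs=3):
    """Run the toy loop and return the epochs at which a 'save' happened."""
    cb = callback_cls()
    ToyTrainer([cb], has_validation).fit(max_epochs)
    return cb.saved_epochs
```

In this toy setup, `saved_epochs(ToyCheckpoint, has_validation=False)` returns an empty list, which mirrors the silent failure, while `EpochEndCheckpoint` saves once per epoch with or without validation.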

Labels: bug (Something isn't working), checkpointing (Related to checkpointing), help wanted (Open to be worked on), priority: 0 (High priority task)
