Description
Proposed refactoring or deprecation
Now that checkpoint saving is better consolidated in the training type plugin, we no longer need this property, as it becomes an internal implementation detail of the training type plugin.
Users (via the checkpoint callback) only need to call trainer.save_checkpoint and can assume the plugin will handle the rank checks surrounding this for them.
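To make the intended split concrete, here is a minimal sketch of the call path under this proposal. The Trainer/plugin bodies below are illustrative stand-ins, not the actual Lightning implementation:

```python
import torch


class TrainingTypePlugin:
    """Stand-in for the training type plugin (illustrative only)."""

    @property
    def is_global_zero(self) -> bool:
        # Single-process stand-in; a real plugin would check the distributed rank.
        return True

    def save_checkpoint(self, checkpoint: dict, filepath: str) -> None:
        # Whether this rank writes to disk is now an internal detail of the plugin.
        if self.is_global_zero:
            torch.save(checkpoint, filepath)


class Trainer:
    """Stand-in Trainer showing the delegation, not the real class."""

    def __init__(self) -> None:
        self.training_type_plugin = TrainingTypePlugin()

    def save_checkpoint(self, filepath: str, weights_only: bool = False) -> None:
        checkpoint = {"state_dict": {}}  # placeholder for the real dump_checkpoint(weights_only)
        self.training_type_plugin.save_checkpoint(checkpoint, filepath)


# The checkpoint callback only calls trainer.save_checkpoint; it never asks
# "should this rank save?".
Trainer().save_checkpoint("example.ckpt")
```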
Motivation
- Ensure consolidation of saving logic in one place (e.g. all fsspec code for checkpoint paths should sit in one place rather than being scattered around the codebase)
- API simplification: fewer properties exposed on the Trainer
Pitch
Deprecate this property on the Trainer: https://github.com/PyTorchLightning/pytorch-lightning/blob/538e743f17c7da4624c902f762922e2837661818/pytorch_lightning/trainer/properties.py#L117-L120
Deprecate the corresponding property on the training type plugin: https://github.com/PyTorchLightning/pytorch-lightning/blob/538e743f17c7da4624c902f762922e2837661818/pytorch_lightning/plugins/training_type/training_type_plugin.py#L313-L316
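As a rough sketch, the Trainer-side shim during the deprecation window could look something like the following (the exact message and removal version are assumptions; rank_zero_deprecation is the usual utility for emitting these warnings):

```python
from pytorch_lightning.utilities import rank_zero_deprecation


class Trainer:
    # Fragment of the existing Trainer class: keep the property working but warn.
    @property
    def should_rank_save_checkpoint(self) -> bool:
        rank_zero_deprecation(
            "`Trainer.should_rank_save_checkpoint` is deprecated and will be removed in a future"
            " release. Call `trainer.save_checkpoint` and let the training type plugin handle"
            " the rank checks internally."
        )
        return self.training_type_plugin.should_rank_save_checkpoint
```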
Move the directory creation from here: https://github.com/PyTorchLightning/pytorch-lightning/blob/538e743f17c7da4624c902f762922e2837661818/pytorch_lightning/callbacks/model_checkpoint.py#L511-L514
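If the directory creation moves into the plugin / CheckpointIO layer, it can sit right next to the write itself, keeping all fsspec path handling in one place. A hedged sketch (placement and function boundaries are assumptions):

```python
import os

from pytorch_lightning.utilities.cloud_io import get_filesystem


def save_checkpoint(checkpoint: dict, filepath: str) -> None:
    # Previously ModelCheckpoint created self.dirpath itself; here the plugin/CheckpointIO
    # ensures the parent directory exists immediately before writing.
    fs = get_filesystem(filepath)
    fs.makedirs(os.path.dirname(filepath), exist_ok=True)
    # ... hand off the actual write (e.g. torch.save / atomic_save) ...
```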
@SeanNaren - this is also where we should add a rm_checkpoint hook to the CheckpointIO plugin so that we can deprecate this logic from ModelCheckpoint: https://github.com/PyTorchLightning/pytorch-lightning/blob/538e743f17c7da4624c902f762922e2837661818/pytorch_lightning/callbacks/model_checkpoint.py#L502-L505
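A sketch of what that hook might look like on the CheckpointIO plugin (the rm_checkpoint name follows the suggestion above and is hypothetical; the body mirrors what ModelCheckpoint does today with fsspec):

```python
from pytorch_lightning.utilities.cloud_io import get_filesystem


class CheckpointIO:
    # Fragment: proposed hook, mirroring ModelCheckpoint's current filesystem calls.
    def rm_checkpoint(self, path: str) -> None:
        fs = get_filesystem(path)
        if fs.exists(path):
            fs.rm(path)
```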
One thing I'm not sure about: because should_rank_save_checkpoint is exposed from the accelerator to the trainer as part of the public Trainer API, does this need to go through a full deprecation cycle? Or is a breaking change to the plugins API permissible?
Additional context
If you enjoy Lightning, check out our other projects! ⚡
- Metrics: Machine learning metrics for distributed, scalable PyTorch applications.
- Flash: The fastest way to get a Lightning baseline! A collection of tasks for fast prototyping, baselining, finetuning and solving problems with deep learning
- Bolts: Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch
- Lightning Transformers: Flexible interface for high performance research using SOTA Transformers leveraging PyTorch Lightning, Transformers, and Hydra.