
[RFC] Deprecate weights_summary from the Trainer constructor #9043

@ananthsub

Description


Proposed refactoring or deprecation

Motivation

We are auditing the Lightning components and APIs to assess opportunities for improvement.

This is a follow-up to #8478 and #9006.

Why do we want to remove this from the core trainer logic?

  • We need a way for users to customize more of the inputs to the model summary over time without affecting the Trainer API. Today, changes to the model summary API also require changes in the core trainer (e.g. the addition of max_depth). Decoupling the two gives model summarization more room to grow without cascading changes elsewhere (see the sketch after this list).
  • Users may want to configure summarization for different points of execution. Right now it is hardcoded to run only during fit(), but users might want to trigger it multiple times, during any of trainer.fit(), trainer.validate(), trainer.test() or trainer.predict().
  • Users may want to customize where the summary goes. Right now it is printed to stdout, but it could also be useful to save it to a file or upload it to another service that tracks the run.
  • The current implementation runs on global rank 0 only, to avoid printing multiple summary tables. However, running on rank 0 alone breaks model-parallel use cases that require communication across ranks. This can lead to subtle failures when example_input_array is set on the LightningModule, because the summary runs a forward pass with it. For instance, a model wrapped with FSDP will break: its parameters need to be all-gathered across ranks, so every rank has to participate, not just rank 0.
  • If the LightningModule uses PyTorch LazyModules, users may want to generate the summary only after the first batch has been processed, since parameter-size estimates taken before lazy modules are materialized would be misleading.
  • AFAICT, this is the only piece of logic that runs between the on_pretrain_routine_start/end hooks. Would we still need those hooks if the summarization logic were removed from the trainer? Why doesn't this happen in on_train_start today? We don't have on_prevalidation_routine_start/end hooks; the necessity of these hooks for training isn't clear to me, and deprecating them as well could bring greater API clarity & simplification.
    https://github.com/PyTorchLightning/pytorch-lightning/blob/8a931732ae5135e3e55d9c7b7031d81837e5798a/pytorch_lightning/trainer/trainer.py#L1103-L1113
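
For concreteness, here is a minimal sketch of the status quo described above. The weights_summary values ("top", "full", None) and the fit-only behavior reflect the current Trainer API; the toy model and data are invented purely for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl


class ToyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.cross_entropy(self(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


train_loader = DataLoader(
    TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,))), batch_size=8
)

# Summarization is configured once, on the Trainer constructor ...
trainer = pl.Trainer(weights_summary="full", max_epochs=1)

# ... and the table is printed exactly once, on global rank 0, during fit().
# There is no switch to re-run it for validate()/test()/predict(), redirect it
# to a file, or defer it until after the first batch (e.g. for LazyModules).
trainer.fit(ToyModel(), train_loader)
```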

Pitch

A callback is a natural fit for this extension point. It generalizes across LightningModules, is flexible about when it runs, and lets users customize the summarization logic (e.g. to integrate other libraries more easily).
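
A rough, hypothetical sketch of what that could look like (not a concrete API proposal): the callback class is made up, the ModelSummary import from pytorch_lightning.core.memory and its max_depth argument are assumed from the current codebase, and the chosen hooks are just examples.

```python
import pytorch_lightning as pl
from pytorch_lightning.core.memory import ModelSummary  # assumed import path


class ModelSummaryCallback(pl.Callback):
    """Hypothetical callback that owns the summarization logic instead of the Trainer."""

    def __init__(self, max_depth=1, output_path=None):
        # New knobs (depth, output destination, ...) grow here, not on the Trainer.
        self.max_depth = max_depth
        self.output_path = output_path

    def _summarize(self, trainer, pl_module):
        # max_depth is assumed to be accepted here; the exact signature may differ.
        summary = str(ModelSummary(pl_module, max_depth=self.max_depth))
        if self.output_path is not None:
            # e.g. persist to a file or push to an experiment tracker
            with open(self.output_path, "w") as f:
                f.write(summary)
        elif trainer.is_global_zero:
            print(summary)

    # The user picks which entry points trigger a summary, not the Trainer.
    def on_fit_start(self, trainer, pl_module):
        self._summarize(trainer, pl_module)

    def on_test_start(self, trainer, pl_module):
        self._summarize(trainer, pl_module)


# Opt-in usage: no weights_summary argument on the Trainer at all.
trainer = pl.Trainer(callbacks=[ModelSummaryCallback(max_depth=2)])
```

With this shape, new options attach to the callback, the user decides when and where the summary is produced, and disabling it entirely is just a matter of not passing the callback.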

Additional context

The model summary is enabled by default right now. Whether it should remain opt-out or become opt-in is likely the core issue we have to resolve: #8478 (comment)

Seeking @edenafek's and @tchaton's input on this.


