Description
🐛 Bug
In #7928 the trainer logic was modified to restore the model state from the checkpoint connector instead of from the training type plugin, and restore_model_from_ckpt_path was split into three new modular APIs. For our use case we had overridden restore_model_from_ckpt_path in the FSDP plugin to prevent CPU OOMs; now that responsibility for restoring the model state has moved to the checkpoint connector, we run into OOMs again.
In #7509 it was proposed to solve this problem at the trainer level; a comment there suggests offloading the responsibility to the training_type_plugin, since this is not widely required outside of DDP and its derivatives, but restoring the model state no longer belongs to the plugin. Could we add some more memory-friendly logic to the checkpoint connector for the multi-worker case? A rough sketch of the kind of logic we mean follows.
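For illustration only, here is a minimal sketch of a memory-friendly restore. It assumes torch.distributed is already initialized; load_state_sequentially is a hypothetical helper, not an existing Lightning hook, and in practice the logic would live in the checkpoint connector or the plugin. The idea is simply that ranks take turns loading, so at most one full CPU copy of the checkpoint exists per node at any moment.

```python
import torch
import torch.distributed as dist


def load_state_sequentially(model, checkpoint_path):
    # Hypothetical helper (not part of Lightning): ranks take turns loading
    # the full checkpoint on CPU, copy the weights into their model, and
    # free the CPU copy before the next rank starts, so at most one full
    # state dict is resident in host memory at any moment.
    rank = dist.get_rank()
    world_size = dist.get_world_size()
    for turn in range(world_size):
        if rank == turn:
            checkpoint = torch.load(checkpoint_path, map_location="cpu")
            model.load_state_dict(checkpoint["state_dict"])
            del checkpoint  # release the full CPU copy before the next rank loads
        dist.barrier()
```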
Please reproduce using the BoringModel
To Reproduce
Use the following BoringModel and post here
Expected behavior
Environment
Note: Bugs with code are solved faster! Colab Notebook should be made public!
- IDE: Please use our python bug_report_model.py template.
- Colab Notebook: Please copy and paste the output from our environment collection script (or fill out the checklist below manually).
You can get the script and run it with:
wget https://raw.githubusercontent.com/PyTorchLightning/pytorch-lightning/master/tests/collect_env_details.py
# For security purposes, please check the contents of collect_env_details.py before running it.
python collect_env_details.py
- PyTorch Version (e.g., 1.0):
- OS (e.g., Linux):
- How you installed PyTorch (conda, pip, source):
- Build command you used (if compiling from source):
- Python version:
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information: