🚀 Feature
Lightning should offer a central place to use the collective functions provided here: https://pytorch.org/docs/stable/distributed.html#collective-functions
Motivation
LightningModule code is usually agnostic to which device it's running on or whether it's running in a distributed training environment. However, there are times where the module does need to rely on collective functions.
In Lightning, we currently have many places where these are offered:
- On this distributed object, which only supports `broadcast`: https://github.com/PyTorchLightning/pytorch-lightning/blob/master/pytorch_lightning/distributed/dist.py
- `reduce`, `barrier`, `broadcast`, `all_gather`, and `reduce_boolean_decision` are on the trainer's accelerator and training type plugin:
  - https://github.com/PyTorchLightning/pytorch-lightning/blob/233f252bb427c930be8e7ca56fe115b637278b8d/pytorch_lightning/accelerators/accelerator.py#L431-L455
  - https://github.com/PyTorchLightning/pytorch-lightning/blob/233f252bb427c930be8e7ca56fe115b637278b8d/pytorch_lightning/plugins/training_type/training_type_plugin.py#L78-L103
- More utilities for gathering tensors, `all_gather`, and `sync_ddp` here: https://github.com/PyTorchLightning/pytorch-lightning/blob/b9a52fa2ef31f12f6992ece18a033318ec551907/pytorch_lightning/utilities/distributed.py#L86-L217
- `all_gather` is repeated again on the LightningModule, calling the trainer's accelerator functions: https://github.com/PyTorchLightning/pytorch-lightning/blob/233f252bb427c930be8e7ca56fe115b637278b8d/pytorch_lightning/core/lightning.py#L506-L532
Some of these call each other, and the dependencies between them aren't clear, so it is confusing for users to know which one they should use.
Pitch
- Offer these utilities in a central place, `pytorch_lightning/utilities/collectives.py`: `barrier`, `all_gather`, `broadcast`, etc.
  These should be very thin wrappers over the PyTorch distributed functions, checking whether torch.distributed is available and initialized. If not, they return what's expected for single-process training (a minimal sketch follows this list).
- Update the internal call sites to use these implementations.
- Mark the existing functions as deprecated, slated for removal in v1.6.
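
Below is a minimal sketch of what such thin wrappers could look like. The module path `pytorch_lightning/utilities/collectives.py` and the exact function signatures are assumptions for illustration, not a final API:

```python
# Hypothetical pytorch_lightning/utilities/collectives.py
# Thin wrappers over torch.distributed that fall back to single-process
# behaviour when the process group is unavailable or uninitialized.
from typing import List

import torch
import torch.distributed as dist


def distributed_available() -> bool:
    """Return True only if torch.distributed can actually be used in this process."""
    return dist.is_available() and dist.is_initialized()


def barrier() -> None:
    """Synchronize all processes; a no-op in single-process training."""
    if distributed_available():
        dist.barrier()


def broadcast(tensor: torch.Tensor, src: int = 0) -> torch.Tensor:
    """Broadcast ``tensor`` from rank ``src``; identity when not distributed."""
    if distributed_available():
        dist.broadcast(tensor, src=src)
    return tensor


def all_gather(tensor: torch.Tensor) -> List[torch.Tensor]:
    """Gather ``tensor`` from all processes; a single-element list when not distributed."""
    if not distributed_available():
        return [tensor]
    gathered = [torch.zeros_like(tensor) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, tensor)
    return gathered
```

With a single entry point like this, LightningModule code can call the same functions whether it runs single-process or distributed, and the existing accelerator/plugin methods can simply delegate to them. For the deprecation step, the old entry points could become shims along these lines (names here are purely illustrative):

```python
import warnings


def legacy_all_gather(tensor):
    # Hypothetical shim kept for backward compatibility until removal in v1.6.
    warnings.warn(
        "This function is deprecated and will be removed in v1.6; use"
        " pytorch_lightning.utilities.collectives.all_gather instead.",
        DeprecationWarning,
    )
    return all_gather(tensor)
```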