
Hang when using Lightning CLI from config file and DDP #11158

@gau-nernst

Description

🐛 Bug

When I use Lightning CLI to run training from a YAML config file with DDP, I get the error below and the process hangs.

RuntimeError: SaveConfigCallback expected xxxx/lightning_logs/version_5/config.yaml to NOT exist. Aborting to avoid overwriting results of a previous run. You can delete the previous config file, set `LightningCLI(save_config_callback=None)` to disable config saving, or set `LightningCLI(save_config_overwrite=True)` to overwrite the config file.

Ctrl+C does not work; I have to kill the process manually.

I suspect this happens because each DDP process tries to write the same config file to disk. Passing either save_config_callback=None or save_config_overwrite=True, as suggested in the error message, resolves the issue (see the sketch below).
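For reference, this is roughly what either workaround looks like when applied to the reproduction script further down (a minimal sketch; the keyword arguments are exactly the ones named in the error message, and the file names bug.py / workaround.py are just the ones used in this report):

# workaround.py - run in place of bug.py; only the LightningCLI call changes
from pytorch_lightning.utilities.cli import LightningCLI

from bug import BoringModel  # BoringModel from the reproduction script below

if __name__ == "__main__":
    # Option 1: disable saving of config.yaml entirely
    # LightningCLI(BoringModel, save_config_callback=None)

    # Option 2: let SaveConfigCallback overwrite an existing config.yaml
    LightningCLI(BoringModel, save_config_overwrite=True)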

To Reproduce

# bug.py
import torch
from torch.utils.data import Dataset, DataLoader

from pytorch_lightning import LightningModule
from pytorch_lightning.utilities.cli import LightningCLI

class RandomDataset(Dataset):
    def __init__(self, size, num_samples):
        self.len = num_samples
        self.data = torch.randn(num_samples, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len

class BoringModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def train_dataloader(self):
        return DataLoader(RandomDataset(32, 64), batch_size=2)

    def val_dataloader(self):
        return DataLoader(RandomDataset(32, 64), batch_size=2)

    def training_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("train_loss", loss)
        return {"loss": loss}

    def validation_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("valid_loss", loss)

    def test_step(self, batch, batch_idx):
        loss = self(batch).sum()
        self.log("test_loss", loss)

    def configure_optimizers(self):
        return torch.optim.SGD(self.layer.parameters(), lr=0.1)

if __name__ == "__main__":
    LightningCLI(BoringModel)

# bug.yaml
trainer:
  gpus: 2
  strategy: ddp
  max_epochs: 10

Run Lightning CLI

python bug.py fit --config bug.yaml

Expected behavior

There should be no error by default, i.e. without explicitly passing save_config_callback=None or save_config_overwrite=True to LightningCLI.

Environment

  • CUDA:
    - GPU:
      - GeForce RTX 3090
      - GeForce RTX 3090
      - GeForce RTX 3090
      - GeForce RTX 3090
    - available: True
    - version: 11.3
  • Packages:
    - numpy: 1.21.2
    - pyTorch_debug: False
    - pyTorch_version: 1.10.0
    - pytorch-lightning: 1.5.5
    - tqdm: 4.62.3
  • System:
    - OS: Linux
    - architecture:
      - 64bit
      - ELF
    - processor: x86_64
    - python: 3.8.12
    - version: #86~18.04.1-Ubuntu SMP Fri Jun 18 01:23:22 UTC 2021

Additional context

cc @carmocca @mauvilsa
