
DDP does not clean up the processes it makes #6994

@ndalton12

Description

🐛 Bug

After trainer.fit(...) and trainer.test(...) finish, the program should exit normally. Instead, I get a ResourceWarning that the subprocesses spawned by DDP are still running.

To Reproduce

Run a script that uses accelerator="ddp", launching it with PYTHONTRACEMALLOC=1 python <script_name>.py --gpus 4.

Expected behavior

The resources should be freed or the processes ended gracefully.
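As a sketch of what "ended gracefully" could look like (plain subprocess, not Lightning's actual teardown code — the helper name and structure are hypothetical), the launcher would keep each Popen handle it creates and terminate and reap each child before exiting, so the interpreter never finalizes a still-running subprocess:

```python
import subprocess
import sys

def cleanup(procs):
    """Hypothetical teardown: terminate each worker, then reap it."""
    for proc in procs:
        proc.terminate()            # ask the child to exit (SIGTERM)
        try:
            proc.wait(timeout=10)   # reap it so Popen.__del__ never warns
        except subprocess.TimeoutExpired:
            proc.kill()             # escalate if it ignores SIGTERM
            proc.wait()

# Stand-ins for the per-GPU workers DDP spawns:
procs = [
    subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    for _ in range(2)
]
cleanup(procs)
print(all(p.returncode is not None for p in procs))  # True: every child reaped
```

Because every child has been waited on, its returncode is set and no "subprocess ... is still running" warning can fire at interpreter shutdown.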

Environment

PyTorch version: 1.8.1
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.2 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: version 3.16.3

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 11.1.105
GPU models and configuration:
GPU 0: Quadro RTX 8000
GPU 1: Quadro RTX 8000
GPU 2: Quadro RTX 8000
GPU 3: Quadro RTX 8000

Nvidia driver version: 460.39
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] pytorch-lightning==1.3.0rc1
[pip3] torch==1.8.1
[pip3] torchmetrics==0.3.0rc0
[pip3] torchvision==0.9.1
[pip3] vit-pytorch==0.6.7
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.1.1 h6406543_8 conda-forge
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2020.4 h726a3e6_304 conda-forge
[conda] mkl-service 2.3.0 py38h1e0a361_2 conda-forge
[conda] mkl_fft 1.3.0 py38h5c078b8_1 conda-forge
[conda] mkl_random 1.2.0 py38hc5bc63f_1 conda-forge
[conda] numpy 1.19.2 py38h54aff64_0
[conda] numpy-base 1.19.2 py38hfa32c7d_0
[conda] pytorch 1.8.1 py3.8_cuda11.1_cudnn8.0.5_0 pytorch
[conda] pytorch-lightning 1.3.0rc1 pypi_0 pypi
[conda] torchmetrics 0.3.0rc0 pypi_0 pypi
[conda] torchvision 0.9.1 py38_cu111 pytorch
[conda] vit-pytorch 0.6.7 pypi_0 pypi

Additional context

The traceback:

/home/ndalton/miniconda3/envs/cvos/lib/python3.8/subprocess.py:942: ResourceWarning: subprocess 3670473 is still running
  _warn("subprocess %s is still running" % self.pid,
Object allocated at (most recent call last):
  File "/home/ndalton/miniconda3/envs/cvos/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/ddp.py", lineno 173
    proc = subprocess.Popen(command, env=env_copy, cwd=cwd)

This traceback is repeated once per GPU: with gpus=4 it appears four times, each with a different PID.
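For context on where the warning comes from: Popen.__del__ emits exactly this ResourceWarning when the last reference to a Popen is dropped while its child process has never been reaped. A standalone illustration (plain subprocess, not Lightning code):

```python
import gc
import os
import signal
import subprocess
import sys
import warnings

# Spawn a child that outlives its Popen handle, like the DDP workers here.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
pid = proc.pid

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    del proc          # last reference gone while child still runs
    gc.collect()

# Popen.__del__ warned: "subprocess <pid> is still running"
leaked = any(issubclass(w.category, ResourceWarning) for w in caught)
print(leaked)         # True

os.kill(pid, signal.SIGKILL)   # clean up the orphaned child
os.waitpid(pid, 0)
```

So the fix on the Lightning side amounts to keeping the Popen handles created in ddp.py and waiting on them during teardown, rather than letting them be garbage-collected while the workers are still alive.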

Labels: bug (Something isn't working), help wanted (Open to be worked on)
