Description
First check
- I'm sure this is a bug.
- I've added a descriptive title to this bug.
- I've provided clear instructions on how to reproduce the bug.
- I've added a code sample.
- I've provided any other important info that is required.
Bug description
CUDAAccelerator.num_cuda_devices() returns 0 while torch.cuda.device_count() returns 1. This causes Trainer(accelerator="cuda", devices=1, ...) to fail with:
.../lib/python3.9/site-packages/torch/cuda/__init__.py:83: UserWarning: CUDA initialization: CUDA driver initialization failed, you might not have a CUDA gpu. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0
...
MisconfigurationException: CUDAAccelerator can not run on your system since the accelerator is not available. The following accelerator(s) is available and can be passed into `accelerator` argument of `Trainer`: ['cpu'].
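For reference, the mismatch can be checked directly with the short sketch below (an illustration only; it assumes the import path of the device_parser module reported under "More info" below):

import torch
from pytorch_lightning.utilities.device_parser import num_cuda_devices

# On the affected system these two calls disagree:
print(torch.cuda.device_count())  # -> 1
print(num_cuda_devices())         # -> 0, so Lightning treats CUDA as unavailable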
How to reproduce the bug
from pytorch_lightning.accelerators import CUDAAccelerator

cuda_acc = CUDAAccelerator()
cuda_acc.auto_device_count()  # returns 0
and
Trainer(accelerator="cuda", devices=1, ...)  # raises the error above
Error messages and logs
.../lib/python3.9/site-packages/torch/cuda/__init__.py:83: UserWarning: CUDA initialization: CUDA driver initialization failed, you might not have a CUDA gpu. (Triggered internally at ../c10/cuda/CUDAFunctions.cpp:109.)
return torch._C._cuda_getDeviceCount() > 0
and:
...
527 if not self.accelerator.is_available():
528 available_accelerator = [
529 acc_str for acc_str in self._accelerator_types if AcceleratorRegistry.get(acc_str).is_available()
530 ]
--> 531 raise MisconfigurationException(
532 f"{self.accelerator.__class__.__qualname__} can not run on your system"
533 " since the accelerator is not available. The following accelerator(s)"
534 " is available and can be passed into `accelerator` argument of"
535 f" `Trainer`: {available_accelerator}."
536 )
538 self._set_devices_flag_if_auto_passed()
540 self._gpus = self._devices_flag if not self._gpus else self._gpus
MisconfigurationException: CUDAAccelerator can not run on your system since the accelerator is not available. The following accelerator(s) is available and can be passed into `accelerator` argument of `Trainer`: ['cpu'].
Important info
- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow): Trainer, CUDAAccelerator
- PyTorch Lightning Version (e.g., 1.5.0): 1.7.7
- PyTorch Version (e.g., 1.10): 1.12.1+cu116
- Python version (e.g., 3.9): 3.9
- OS (e.g., Linux): Ubuntu 18.04
- NVIDIA driver version: 515.65.01
- CUDA version: 11.7
- cuDNN version: 8.5.0.96.1+cuda11.7
- GPU models and configuration: NVIDIA GeForce RTX 3090 (the same problem was observed on the same system with an NVIDIA GeForce GTX 1080 Ti)
- How you installed Lightning (`conda`, `pip`, source): pip
- Running environment: local
More info
I dug around and found that this function in .../lib/python3.9/site-packages/pytorch_lightning/utilities/device_parser.py (line 339):
def num_cuda_devices() -> int:
    """Returns the number of GPUs available.
    Unlike :func:`torch.cuda.device_count`, this function will do its best not to create a CUDA context for fork
    support, if the platform allows it.
    """
    if "fork" not in torch.multiprocessing.get_all_start_methods() or _is_forking_disabled():
        return torch.cuda.device_count()
    with multiprocessing.get_context("fork").Pool(1) as pool:
        return pool.apply(torch.cuda.device_count)
is the culprit. The if-statement evaluates to false, and the fork-based path apparently returns the wrong count. However, if I add a call to torch.cuda.device_count() right after the if-statement, like this:
def num_cuda_devices() -> int:
    """Returns the number of GPUs available.
    Unlike :func:`torch.cuda.device_count`, this function will do its best not to create a CUDA context for fork
    support, if the platform allows it.
    """
    if "fork" not in torch.multiprocessing.get_all_start_methods() or _is_forking_disabled():
        return torch.cuda.device_count()
    torch.cuda.device_count()  # <-- added line: query the driver in the parent before forking
    with multiprocessing.get_context("fork").Pool(1) as pool:
        return pool.apply(torch.cuda.device_count)
then everything works correctly. My guess is that when the multiprocessing code forks the main process, the child process for some reason cannot access CUDA. Please take a look at this; I found a workaround, but it is probably not what was intended when the code was written. Thank you.
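To illustrate the suspected failure mode outside Lightning, here is a minimal standalone sketch (an illustration only, mirroring what num_cuda_devices() and my workaround do; exact behaviour is system-dependent):

import multiprocessing

import torch


def count_in_fork() -> int:
    # Same approach as num_cuda_devices(): query the device count in a forked child
    # so that no CUDA context is created in the parent process.
    with multiprocessing.get_context("fork").Pool(1) as pool:
        return pool.apply(torch.cuda.device_count)


if __name__ == "__main__":
    print(count_in_fork())     # 0 on the affected system: the forked child cannot see the GPU
    torch.cuda.device_count()  # the workaround: query the driver in the parent first
    print(count_in_fork())     # 1 on the affected system after the parent-side query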