"MisconfigurationException: No supported gpu backend found!" with multi gpu training in jupyter notebooks #15254

@vacmar01

Bug description

When training on two GPUs in a Jupyter notebook environment on jarvislabs.ai with the `ddp_notebook` strategy, I get the following error: "MisconfigurationException: No supported gpu backend found!".

I'm trying to train on two RTX 5000 GPUs. On a Kaggle GPU the same code runs without any problem.

Any ideas?

How to reproduce the bug

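import pytorch_lightning as pl

# model, train_dl and val_dl are defined earlier in the notebook (not shown)
trainer = pl.Trainer(
    max_epochs=2,
    accelerator="gpu",
    devices=2,
    precision=16,
    accumulate_grad_batches=2,
)
trainer.fit(model, train_dl, val_dl)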

Error messages and logs

"MisconfigurationException: No supported gpu backend found!"
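
A quick check that might help narrow this down, assuming the message means that no CUDA device was detected from the notebook process (that reading is an assumption, not something confirmed in this report). The helper name `cuda_report` is made up for illustration; it only uses standard `torch.cuda` queries and reports what the current process can see.

```python
import os


def cuda_report():
    """Collect basic facts about CUDA visibility in the current process."""
    # An empty or unexpected value here can hide GPUs from PyTorch.
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}
    try:
        import torch
    except ImportError:
        report["torch_installed"] = False
        return report
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # None means a CPU-only PyTorch build.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

Running this in a fresh kernel and restarting the kernel before training again is probably the safer route, since querying CUDA in the main notebook process may interfere with fork-based multi-GPU strategies. Comparing the output on jarvislabs.ai with the output on Kaggle could show whether the two environments differ in what PyTorch itself detects.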

Environment


#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0): 1.7.7
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 1.10): 1.11
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version: V11.6.55
#- GPU models and configuration: 2x RTX 5000
#- How you installed Lightning(`conda`, `pip`, source): pip
#- Running environment of LightningApp (e.g. local, cloud): jarvislabs.ai 

More info

No response

cc @justusschock @awaelchli

Labels

accelerator: cuda, bug, help wanted, repro needed