Open
Labels
accelerator: cuda, bug, help wanted, repro needed
Description
Bug description
When I try to train on two GPUs in a Jupyter notebook environment on jarvislabs.ai with ddp_notebooks, I get the following error: "MisconfigurationException: No supported gpu backend found!".
I'm trying to train on two RTX 5000 GPUs. On a Kaggle GPU the same code runs without any problem.
Any ideas?
How to reproduce the bug
import pytorch_lightning as pl

# model, train_dl, and val_dl are not shown in this report
trainer = pl.Trainer(
    max_epochs=2,
    accelerator="gpu",
    devices=2,
    precision=16,
    accumulate_grad_batches=2,
)
trainer.fit(model, train_dl, val_dl)
Error messages and logs
"MisconfigurationException: No supported gpu backend found!"
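A possible first diagnostic, offered as a sketch rather than a confirmed fix: the snippet below reports whether PyTorch itself can see any CUDA device in the notebook process. It assumes the error is raised because no CUDA device is visible to PyTorch on this machine, which this report does not confirm. The helper name `gpu_report` is hypothetical and not part of Lightning or PyTorch.

```python
import os


def gpu_report():
    """Collect basic facts about GPU visibility in the current process."""
    report = {
        "torch_installed": False,
        # If this is set to an empty string, no GPU is visible to CUDA.
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing more to inspect.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # None here means a CPU-only PyTorch build was installed.
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


report = gpu_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `cuda_available` is False or `device_count` is 0, the problem likely sits below Lightning (driver, CUDA build of PyTorch, or container configuration). Calling CUDA functions in the notebook process may interfere with fork-based multi-GPU strategies, so it is probably safer to run this in a fresh kernel and restart before training.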
Environment
#- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):
#- PyTorch Lightning Version (e.g., 1.5.0): 1.7.7
#- Lightning App Version (e.g., 0.5.2):
#- PyTorch Version (e.g., 1.10): 1.11
#- Python version (e.g., 3.9):
#- OS (e.g., Linux):
#- CUDA/cuDNN version: V11.6.55
#- GPU models and configuration: 2x RTX 5000
#- How you installed Lightning(`conda`, `pip`, source): pip
#- Running environment of LightningApp (e.g. local, cloud): jarvislabs.ai
More info
No response
debvrat