Closed
Labels
bug · distributed · good first issue · help wanted · priority: 0
Description
🐛 Bug
When using multiple GPUs with `accelerator='dp'`, the error `RuntimeError: grad can be implicitly created only for scalar outputs` occurs if `training_step` returns a dict like this:
```python
def training_step(self, batch, batch_idx):
    ...
    return {'loss': loss}
```

Please reproduce using the BoringModel
https://colab.research.google.com/drive/1hmHqYHPOqDlZUAF7-9zcCvobrvSPt7W5?usp=sharing
Expected behavior
Returning a dict with a `loss` key from `training_step` is supposed to work.
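For context, the underlying error can be reproduced in plain PyTorch, independently of Lightning: `.backward()` without an explicit gradient argument is only allowed on scalar (0-dim) tensors. A minimal sketch:

```python
import torch

# Scalar loss: backward() works and gradients are populated.
x = torch.ones(2, requires_grad=True)
loss = (x * 2).sum()
loss.backward()
print(x.grad.tolist())  # [2.0, 2.0]

# Non-scalar "loss" (as may happen when per-GPU losses are gathered
# under 'dp'): backward() without a gradient argument raises the error
# reported above.
y = torch.ones(2, requires_grad=True)
vec = y * 2
try:
    vec.backward()
except RuntimeError as e:
    print(e)  # grad can be implicitly created only for scalar outputs
```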
A quick solution
Return the loss tensor directly from `training_step`:

```python
def training_step(self, batch, batch_idx):
    ...
    return loss
```

Environment
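If returning a dict is preferred, a possible workaround (a sketch, assuming the loss arrives as a per-GPU vector after the `dp` gather; `training_step_dict` is a hypothetical helper, not Lightning API) is to reduce the loss to a scalar before returning it:

```python
import torch

def training_step_dict(loss: torch.Tensor) -> dict:
    # Hypothetical illustration: under 'dp' the gathered loss may be a
    # vector with one entry per GPU, so reduce it to a scalar before it
    # reaches backward().
    if loss.dim() > 0:
        loss = loss.mean()
    return {'loss': loss}

out = training_step_dict(torch.tensor([0.5, 1.5]))
print(out['loss'].item())  # 1.0 — now a scalar, safe for backward()
```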
- PyTorch Lightning Version: 1.2.0
- PyTorch Version: 1.7.0
- OS: Linux
- Python version: 3.8
- CUDA/cuDNN version: 10.2
cc @carmocca