Closed
Description
🐛 Bug
To Reproduce
Steps to reproduce the behavior:
- Run the following script, saved as debug.py, on a machine with CUDA GPUs:
import torch
import torchaudio
device = 'cuda'
to_db = torch.nn.DataParallel(torchaudio.transforms.AmplitudeToDB()).to(device)
x = torch.arange(1).float().to(device)
print(to_db(x))
Expected behavior
When I run the code on a single GPU, I get:
$ CUDA_VISIBLE_DEVICES=0 python3 debug.py
tensor([-100.], device='cuda:0')
This is the expected behavior. When I run the same script with multiple GPUs, however, it fails with the error below.
$ CUDA_VISIBLE_DEVICES=0,1 python3 debug.py
Traceback (most recent call last):
  File "debug.py", line 7, in <module>
    print(to_db(x))
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
    output.reraise()
  File "/opt/conda/lib/python3.7/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
AttributeError: Caught AttributeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 1678, in __getattr__
    return super(RecursiveScriptModule, self).__getattr__(attr)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 1499, in __getattr__
    return super(ScriptModule, self).__getattr__(attr)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
    type(self).__name__, name))
AttributeError: 'RecursiveScriptModule' object has no attribute 'forward'
Environment
Collecting environment information...
PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1
OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: Could not collect
Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration:
GPU 0: Tesla V100-SXM2-32GB
GPU 1: Tesla V100-SXM2-32GB
GPU 2: Tesla V100-SXM2-32GB
GPU 3: Tesla V100-SXM2-32GB
GPU 4: Tesla V100-SXM2-32GB
GPU 5: Tesla V100-SXM2-32GB
GPU 6: Tesla V100-SXM2-32GB
GPU 7: Tesla V100-SXM2-32GB
Nvidia driver version: 418.67
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
Versions of relevant libraries:
[pip] numpy==1.17.4
[pip] pytorch-ignite==0.3.0
[pip] torch==1.4.0
[pip] torchaudio==0.4.0
[pip] torchvision==0.5.0
[conda] blas 1.0 mkl
[conda] mkl 2019.4 243
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.0.15 py37ha843d7b_0
[conda] mkl_random 1.1.0 py37hd6b4f25_0
[conda] pytorch 1.4.0 py3.7_cuda10.1.243_cudnn7.6.3_0 pytorch
[conda] torchvision 0.5.0 py37_cu101 pytorch
Additional context
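AmplitudeToDB is based on torch.jit.ScriptModule, which may be why the error occurs: the innermost frame of the traceback shows that the replica created by DataParallel is a 'RecursiveScriptModule' with no 'forward' attribute.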
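One possible workaround, sketched here under assumptions rather than confirmed as a fix, is to wrap a plain torch.nn.Module in DataParallel instead of the scripted transform. The class name PlainAmplitudeToDB is hypothetical, and its formula and defaults (multiplier 10, amin 1e-10, reference value 1.0, no top_db clamping) are assumptions modeled on the documented power-to-dB behavior, not a copy of torchaudio's code.

```python
import math

import torch


class PlainAmplitudeToDB(torch.nn.Module):
    """Hypothetical, non-scripted stand-in for an amplitude-to-dB transform.

    Assumption: the input is on the power scale, so the multiplier is 10.
    The top_db clamping option is intentionally left out of this sketch.
    """

    def __init__(self, multiplier=10.0, amin=1e-10, ref_value=1.0):
        super().__init__()
        self.multiplier = multiplier
        self.amin = amin
        # Constant offset contributed by the reference value.
        self.db_offset = multiplier * math.log10(max(amin, ref_value))

    def forward(self, x):
        # Clamp first so that log10 never sees zero or negative input.
        x_db = self.multiplier * torch.log10(torch.clamp(x, min=self.amin))
        return x_db - self.db_offset


# Mirror the reproduction script, falling back to CPU when CUDA is absent.
device = "cuda" if torch.cuda.is_available() else "cpu"
to_db = PlainAmplitudeToDB().to(device)
if device == "cuda" and torch.cuda.device_count() > 1:
    # An ordinary nn.Module defines forward in Python, so replicas keep it.
    to_db = torch.nn.DataParallel(to_db)

x = torch.arange(1).float().to(device)
print(to_db(x))  # for x = 0 the value should be close to -100 dB
```

Because this module is not a ScriptModule, DataParallel replicates it like any other Python module. It does not explain or repair the underlying problem with scripted modules.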