Closed
Description
- All the spectrogram-related transforms default the frequency bin parameter to n_fft: int = 400, while MelScale and InverseMelScale default to n_stft: Optional[int] = None.
- In MelScale, when n_stft=None, forward tries to resize the fb buffer in place, but this makes the TorchScripted version (once saved to and loaded from a file) fail.
audio/torchaudio/transforms.py
Lines 287 to 308 in 931555c
            specgram (Tensor): A spectrogram STFT of dimension (..., freq, time).
        Returns:
            Tensor: Mel frequency spectrogram of size (..., ``n_mels``, time).
        """
        # pack batch
        shape = specgram.size()
        specgram = specgram.reshape(-1, shape[-2], shape[-1])
        if self.fb.numel() == 0:
            tmp_fb = F.create_fb_matrix(specgram.size(1), self.f_min, self.f_max,
                                        self.n_mels, self.sample_rate, self.norm,
                                        self.mel_scale)
            # Attributes cannot be reassigned outside __init__ so workaround
            self.fb.resize_(tmp_fb.size())
            self.fb.copy_(tmp_fb)
        # (channel, frequency, time).transpose(...) dot (frequency, n_mels)
        # -> (channel, time, n_mels).transpose(...)
        mel_specgram = torch.matmul(specgram.transpose(1, 2), self.fb).transpose(1, 2)
>       return callable(*args, **kwargs)
E       RuntimeError: The following operation failed in the TorchScript interpreter.
E       Traceback of TorchScript, serialized code (most recent call last):
E         File "code/__torch__/torchaudio/transforms.py", line 20, in forward
E           if torch.eq(torch.numel(self.fb), 0):
E             tmp_fb = _0(torch.size(specgram0, 1), 0., 8000., 128, 16000, self.norm, self.mel_scale, )
E             _1 = torch.resize_(self.fb, torch.size(tmp_fb), memory_format=None)
E                  ~~~~~~~~~~~~~ <--- HERE
E             _2 = torch.copy_(self.fb, tmp_fb, False)
E           else:
E       
E       Traceback of TorchScript, original code (most recent call last):
E         File "/root/project/env/lib/python3.9/site-packages/torchaudio-0.9.0a0+bb886e7-py3.9-linux-x86_64.egg/torchaudio/transforms.py", line 302, in forward
E                                               self.mel_scale)
E                   # Attributes cannot be reassigned outside __init__ so workaround
E                   self.fb.resize_(tmp_fb.size())
E                   ~~~~~~~~~~~~~~~ <--- HERE
E                   self.fb.copy_(tmp_fb)
E           
E       RuntimeError: Trying to resize storage that is not resizable at /opt/conda/conda-bld/pytorch_1617951974812/work/aten/src/TH/THStorageFunctions.cpp:87
To reproduce:
- Construct MelScale with n_stft=None.
- Script the transform and save it to a file.
- Load the transform from the file and feed it a spectrogram Tensor.
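The steps above can be sketched without depending on a particular torchaudio release. The following is a minimal illustration, not the library's code: `LazyBuffer`, `round_trip`, and `try_forward` are hypothetical names, and the module only mimics the empty-buffer-then-resize pattern from MelScale.forward using a matrix of ones in place of a real filter bank. Whether the loaded module actually raises depends on the PyTorch version, so the sketch reports either outcome rather than assuming one.

```python
import io

import torch


class LazyBuffer(torch.nn.Module):
    """Hypothetical stand-in for MelScale(n_stft=None): the buffer starts empty."""

    def __init__(self, n_out: int = 4) -> None:
        super().__init__()
        self.n_out = n_out
        self.register_buffer("fb", torch.empty(0))

    def forward(self, specgram: torch.Tensor) -> torch.Tensor:
        if self.fb.numel() == 0:
            # Placeholder for the real filter bank: (freq, n_out) matrix of ones.
            tmp_fb = torch.ones([specgram.size(-2), self.n_out])
            # Same workaround as in the issue: resize the buffer in place.
            self.fb.resize_(tmp_fb.size())
            self.fb.copy_(tmp_fb)
        # (..., time, freq) dot (freq, n_out) -> (..., n_out, time)
        return torch.matmul(specgram.transpose(-1, -2), self.fb).transpose(-1, -2)


def round_trip(module: torch.nn.Module) -> torch.nn.Module:
    """Script the module, save it to an in-memory file, and load it back."""
    buffer = io.BytesIO()
    torch.jit.save(torch.jit.script(module), buffer)
    buffer.seek(0)
    return torch.jit.load(buffer)


def try_forward(module: torch.nn.Module, specgram: torch.Tensor) -> str:
    """Run forward once and describe the outcome instead of raising."""
    try:
        out = module(specgram)
    except RuntimeError as err:
        return f"failed: {err}"
    return f"ok: {tuple(out.shape)}"


if __name__ == "__main__":
    spec = torch.rand(1, 6, 5)  # (channel, freq, time)
    print("eager: ", try_forward(LazyBuffer(), spec))
    print("loaded:", try_forward(round_trip(LazyBuffer()), spec))
```

On the version reported in this issue, the loaded module is where the "Trying to resize storage that is not resizable" error would be expected; the eager module resizes its buffer without trouble.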
Once the transform has been scripted and dumped, there is no way to fix this issue in the saved file.
Library code should not rely on a workaround that can leave a module in such a stuck state.
As a fix, since n_fft defaults to 400 everywhere else, n_stft should default to the matching value, 201.
This removes the need for the resize_ hack above.
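The value 201 follows from the size of a one-sided STFT, which has n_fft // 2 + 1 frequency bins. A small sketch of that relation, where `default_n_stft` is a hypothetical helper and not part of torchaudio:

```python
def default_n_stft(n_fft: int = 400) -> int:
    """Number of frequency bins in a one-sided STFT of size n_fft."""
    if n_fft <= 0:
        raise ValueError("n_fft must be positive")
    return n_fft // 2 + 1


if __name__ == "__main__":
    print(default_n_stft())  # 400 // 2 + 1 = 201
```

With the default known at construction time, the filter bank can be built in __init__ and forward never has to resize a buffer.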