### 🐛 Describe the bug
I am trying to reconstruct a waveform by composing the InverseMelScale and GriffinLim transforms. The operation hangs while running InverseMelScale, and I have had to kill the process for taking too long. When I switch to the librosa equivalent, librosa.feature.inverse.mel_to_audio, it runs without any problems.
```python
import librosa
import torch
import torch.nn as nn
import torchaudio

SAMPLE_FILE = 'samples/my_sample.mp3'

waveform, sample_rate = torchaudio.load(SAMPLE_FILE,
                                        num_frames=220_500,
                                        frame_offset=0)

waveform_to_mel_spectrogram = torchaudio.transforms.MelSpectrogram(
    sample_rate=sample_rate, n_fft=1024, hop_length=256, n_mels=80)
mel_scale_to_power = torchaudio.transforms.InverseMelScale(
    sample_rate=sample_rate, n_stft=1024, n_mels=80)
power_spec_to_waveform = torchaudio.transforms.GriffinLim(n_fft=1024,
                                                          hop_length=256)
mel_spectrogram_to_waveform = nn.Sequential(mel_scale_to_power,
                                            power_spec_to_waveform)

mel_spectrogram = waveform_to_mel_spectrogram(waveform)

# ERROR: Hangs here!!!
reconstructed_waveform = mel_spectrogram_to_waveform(mel_spectrogram)

# Runs smoothly with no errors
reconstructed_waveform = librosa.feature.inverse.mel_to_audio(
    mel_spectrogram.numpy(), sr=sample_rate, n_fft=1024, hop_length=256)
```

I have tried running the InverseMelScale transform without wrapping it in nn.Sequential, and the results are the same.
### Versions
PyTorch version: 1.12.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: KDE neon User - 5.25 (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.23.2
Libc version: glibc-2.31
Python version: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:04:59) [GCC 10.3.0] (64-bit runtime)
Python platform: Linux-5.15.0-41-generic-x86_64-with-glibc2.31
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] numpy==1.22.4
[pip3] pytorch-lightning==1.6.5
[pip3] torch==1.12.0
[pip3] torchaudio==0.12.0
[pip3] torchinfo==1.7.0
[pip3] torchmetrics==0.9.3
[pip3] torchvision==0.13.0
[conda] blas 2.115 mkl conda-forge
[conda] blas-devel 3.9.0 15_linux64_mkl conda-forge
[conda] cpuonly 2.0 0 pytorch
[conda] libblas 3.9.0 15_linux64_mkl conda-forge
[conda] libcblas 3.9.0 15_linux64_mkl conda-forge
[conda] liblapack 3.9.0 15_linux64_mkl conda-forge
[conda] liblapacke 3.9.0 15_linux64_mkl conda-forge
[conda] mkl 2022.1.0 h84fe81f_915 conda-forge
[conda] mkl-devel 2022.1.0 ha770c72_916 conda-forge
[conda] mkl-include 2022.1.0 h84fe81f_915 conda-forge
[conda] numpy 1.22.4 py310h4ef5377_0 conda-forge
[conda] pytorch 1.12.0 py3.10_cpu_0 pytorch
[conda] pytorch-lightning 1.6.5 pyhd8ed1ab_0 conda-forge
[conda] pytorch-mutex 1.0 cpu pytorch
[conda] torchaudio 0.12.0 py310_cpu pytorch
[conda] torchinfo 1.7.0 pyhd8ed1ab_0 conda-forge
[conda] torchmetrics 0.9.3 pyhd8ed1ab_0 conda-forge
[conda] torchvision 0.13.0 py310_cpu pytorch