torchaudio.transforms.InverseMelScale does not work #2594

@Kinyugo

🐛 Describe the bug

I am trying to reconstruct a waveform from a mel spectrogram by composing the InverseMelScale and GriffinLim transforms. The call hangs inside InverseMelScale: it never returns, and I have had to kill the process each time because it takes too long. The librosa equivalent, librosa.feature.inverse.mel_to_audio, runs on the same input without any problems.

import librosa
import torch
import torch.nn as nn
import torchaudio

SAMPLE_FILE = 'samples/my_sample.mp3'
waveform, sample_rate = torchaudio.load(SAMPLE_FILE,
                                        num_frames=220_500,
                                        frame_offset=0)

waveform_to_mel_spectrogram = torchaudio.transforms.MelSpectrogram(
    sample_rate=sample_rate, n_fft=1024, hop_length=256, n_mels=80)

mel_scale_to_power = torchaudio.transforms.InverseMelScale(
    sample_rate=sample_rate, n_stft=1024, n_mels=80)
power_spec_to_waveform = torchaudio.transforms.GriffinLim(n_fft=1024,
                                                          hop_length=256)
mel_spectrogram_to_waveform = nn.Sequential(mel_scale_to_power,
                                            power_spec_to_waveform)

mel_spectrogram = waveform_to_mel_spectrogram(waveform)

# ERROR: Hangs here!!!
reconstructed_waveform = mel_spectrogram_to_waveform(mel_spectrogram)

# Runs smoothly with no errors
reconstructed_waveform = librosa.feature.inverse.mel_to_audio(
    mel_spectrogram.numpy(), sr=sample_rate, n_fft=1024, hop_length=256)

I have also tried calling the InverseMelScale transform directly, without wrapping it in nn.Sequential, and the result is the same.
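
A note on the parameters in the snippet above, offered as a sanity check rather than a diagnosis: torchaudio documents n_stft as the number of STFT frequency bins, which for a one-sided STFT is n_fft // 2 + 1. With n_fft=1024 that is 513 bins, not 1024. Whether this mismatch has anything to do with the hang is not established here. The helper stft_shape below is hypothetical (it is not part of torchaudio) and assumes a centered, one-sided STFT and a file with at least 220,500 frames.

```python
from typing import Tuple


def stft_shape(num_samples: int, n_fft: int, hop_length: int) -> Tuple[int, int]:
    """Return (n_freq_bins, n_frames) for a one-sided, centered STFT.

    Hypothetical helper for illustration only; assumes the signal is
    padded by n_fft // 2 on both sides (center=True).
    """
    n_freq_bins = n_fft // 2 + 1              # one-sided spectrum keeps DC..Nyquist
    n_frames = 1 + num_samples // hop_length  # frame count with centered padding
    return n_freq_bins, n_frames


# Values taken from the reproduction snippet.
N_FFT = 1024
HOP_LENGTH = 256
NUM_SAMPLES = 220_500

n_stft, n_frames = stft_shape(NUM_SAMPLES, N_FFT, HOP_LENGTH)
print(f"n_stft expected by the inverse transforms: {n_stft}")
print(f"frames in the mel spectrogram: {n_frames}")
```

Under these assumptions, passing the computed n_stft to InverseMelScale instead of the raw n_fft would keep its output shape consistent with what GriffinLim(n_fft=1024) expects.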

Versions

PyTorch version: 1.12.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: KDE neon User - 5.25 (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Clang version: Could not collect
CMake version: version 3.23.2
Libc version: glibc-2.31

Python version: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:04:59) [GCC 10.3.0] (64-bit runtime)
Python platform: Linux-5.15.0-41-generic-x86_64-with-glibc2.31
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.22.4
[pip3] pytorch-lightning==1.6.5
[pip3] torch==1.12.0
[pip3] torchaudio==0.12.0
[pip3] torchinfo==1.7.0
[pip3] torchmetrics==0.9.3
[pip3] torchvision==0.13.0
[conda] blas 2.115 mkl conda-forge
[conda] blas-devel 3.9.0 15_linux64_mkl conda-forge
[conda] cpuonly 2.0 0 pytorch
[conda] libblas 3.9.0 15_linux64_mkl conda-forge
[conda] libcblas 3.9.0 15_linux64_mkl conda-forge
[conda] liblapack 3.9.0 15_linux64_mkl conda-forge
[conda] liblapacke 3.9.0 15_linux64_mkl conda-forge
[conda] mkl 2022.1.0 h84fe81f_915 conda-forge
[conda] mkl-devel 2022.1.0 ha770c72_916 conda-forge
[conda] mkl-include 2022.1.0 h84fe81f_915 conda-forge
[conda] numpy 1.22.4 py310h4ef5377_0 conda-forge
[conda] pytorch 1.12.0 py3.10_cpu_0 pytorch
[conda] pytorch-lightning 1.6.5 pyhd8ed1ab_0 conda-forge
[conda] pytorch-mutex 1.0 cpu pytorch
[conda] torchaudio 0.12.0 py310_cpu pytorch
[conda] torchinfo 1.7.0 pyhd8ed1ab_0 conda-forge
[conda] torchmetrics 0.9.3 pyhd8ed1ab_0 conda-forge
[conda] torchvision 0.13.0 py310_cpu pytorch
