
pipeline_wav2letter: CER still 1.0 after 23 epochs. #1016

@Honghe

Description


🐛 Bug

CER still 1.0 after 23 epochs.

To Reproduce

Steps to reproduce the behavior:

python main.py \
    --reduce-lr-valid \
    --dataset-train train-clean-100 \
    --dataset-valid dev-clean \
    --batch-size 128 \
    --learning-rate .6 \
    --momentum .8 \
    --weight-decay .00001 \
    --clip-grad 0. \
    --gamma .99 \
    --hop-length 160 \
    --win-length 400 \
    --n-bins 13 \
    --normalize \
    --optimizer adadelta \
    --scheduler reduceonplateau \
    --epochs 30 \
    --dataset-root /home/ubuntu/Data/ \
    --dataset-folder-in-archive LibriSpeech

Log & error:

python main.py     --reduce-lr-valid     --dataset-train train-clean-100     --dataset-valid dev-clean     --batch-size 128     --learning-rate .6     --momentum .8     --weight-decay .00001     --clip-grad 0.     --gamma .99     --hop-length 160     --win-length 400     --n-bins 13     --normalize     --optimizer adadelta     --scheduler reduceonplateau     --epochs 30     --dataset-root /home/ubuntu/Data/LibriSpeechConformer     --dataset-folder-in-archive LibriSpeech
/home/ubuntu/audio/torchaudio/extension/extension.py:14: UserWarning: torchaudio C++ extension is not available.
  warnings.warn('torchaudio C++ extension is not available.')
INFO:root:Namespace(batch_size=128, checkpoint='', clip_grad=0.0, dataset_folder_in_archive='LibriSpeech', dataset_root='/home/ubuntu/Data/LibriSpeechConformer', dataset_train=['train-clean-100'], dataset_valid=['dev-clean'], decoder='greedy', distributed=False, epochs=30, eps=1e-08, freq_mask=0, gamma=0.99, hop_length=160, jit=False, learning_rate=0.6, momentum=0.8, n_bins=13, normalize=True, optimizer='adadelta', progress_bar=False, reduce_lr_valid=True, rho=0.95, scheduler='reduceonplateau', seed=0, start_epoch=0, time_mask=0, type='mfcc', weight_decay=1e-05, win_length=400, workers=0, world_size=8)
INFO:root:Start time: 2020-11-07 19:26:31.371587
INFO:root:Number of parameters: 23282529
INFO:root:Checkpoint: not found
INFO:root:Epoch: 0
/home/ubuntu/miniconda3/envs/torch1.7/lib/python3.8/site-packages/torch/functional.py:515: UserWarning: stft will require the return_complex parameter be explicitly  specified in a future PyTorch release. Use return_complex=False  to preserve the current behavior or return_complex=True to return  a complex output. (Triggered internally at  /opt/conda/conda-bld/pytorch_1603729096996/work/aten/src/ATen/native/SpectralOps.cpp:653.)
  return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore
/home/ubuntu/miniconda3/envs/torch1.7/lib/python3.8/site-packages/torch/functional.py:515: UserWarning: The function torch.rfft is deprecated and will be removed in a future PyTorch release. Use the new torch.fft module functions, instead, by importing torch.fft and calling torch.fft.fft or torch.fft.rfft. (Triggered internally at  /opt/conda/conda-bld/pytorch_1603729096996/work/aten/src/ATen/native/SpectralOps.cpp:590.)
  return _VF.stft(input, n_fft, hop_length, win_length, window,  # type: ignore
INFO:root:Target: the freshmen are sim    Output:
INFO:root:Target: i crosed to the star    Output:
{"name": "train", "epoch": 0, "cer over target length": 1.0, "cumulative cer": 24618.0, "total chars": 24618.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 4701.0, "total words": 4701.0, "wer": 0.0, "cumulative wer over target length": 0.0, "lr": 0.6, "batch size": 128, "n_channel": 13, "n_time": 1671, "dataset length": 128.0, "iteration": 1.0, "loss": 8.766070365905762, "cumulative loss": 8.766070365905762, "average loss": 8.766070365905762, "iteration time": 3.9171156883239746, "epoch time": 3.9171156883239746}
......
{"name": "train", "epoch": 22, "cer over target length": 1.0, "cumulative cer": 5150188.0, "total chars": 5150188.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 985867.0, "total words": 985867.0, "wer": 0.0, "cumulative wer over target length": 0.0, "lr": 0.6, "batch size": 128, "n_channel": 13, "n_time": 1669, "dataset length": 28416.0, "iteration": 222.0, "loss": 3.2989859580993652, "cumulative loss": 731.2055156230927, "average loss": 3.29371853884276, "iteration time": 1.0962834358215332, "epoch time": 246.33143091201782}
INFO:root:Target: mister quilter is th    Output:
INFO:root:Target: nor is mister quilte    Output:
INFO:root:Target: something of their t    Output:
INFO:root:Target: presently it stole b    Output:
INFO:root:Target: and already this ast    Output:
INFO:root:Target: for a time the death    Output:
INFO:root:Target: pop it's a course       Output:
INFO:root:Target: he does and for once    Output:
INFO:root:Target: pavel knocked him ov    Output:
INFO:root:Target: peter crouching in t    Output:
INFO:root:Target: she was indistinctly    Output:
INFO:root:Target: but put it on the ta    Output:
INFO:root:Target: there were a god man    Output:
INFO:root:Target: it is surprising how    Output:
INFO:root:Target: not until the heaven    Output:
INFO:root:Target: with a long stained     Output:
INFO:root:Target: that divine word who    Output:
INFO:root:Target: the very emperors ha    Output:
INFO:root:Target: ben over the ground     Output:
INFO:root:Target: very wel then you're    Output:
INFO:root:Target: misus bozle was disp    Output:
INFO:root:Target: and had the case ben    Output:
INFO:root:Target: before the two clerg    Output:
INFO:root:Target: another floging to m    Output:
INFO:root:Target: and whoever thou art    Output:
INFO:root:Target: only name it whateve    Output:
INFO:root:Target: i do not pretend to     Output:
INFO:root:Target: about daylight on su    Output:
INFO:root:Target: the dome of saint pa    Output:
INFO:root:Target: saint paul is a sain    Output:
INFO:root:Target: hary feling pride bu    Output:
INFO:root:Target: he intended to leave    Output:
INFO:root:Target: we've got a hard dri    Output:
INFO:root:Target: humph grunted curley    Output:
INFO:root:Target: at the corner he rem    Output:
INFO:root:Target: he checked the sily     Output:
INFO:root:Target: you don't understand    Output:
INFO:root:Target: he gave the discusio    Output:
INFO:root:Target: our brigade was fear    Output:
INFO:root:Target: there were no breast    Output:
INFO:root:Target: that wil do misus pr    Output:
INFO:root:Target: randal pased this ov    Output:
{"name": "validation", "epoch": 22, "cumulative loss": 69.75250315666199, "dataset length": 2688.0, "iteration": 21.0, "cer over target length": 1.0, "cumulative cer": 280923.0, "total chars": 280923.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 54008.0, "total words": 54008.0, "wer": 0.0, "cumulative wer over target length": 0.0, "average loss": 3.3215477693648565, "validation time": 14.262725830078125}
INFO:root:Epoch: 23
INFO:root:Target: was no longer in evi    Output:
INFO:root:Target: that no executioner     Output:
{"name": "train", "epoch": 23, "cer over target length": 1.0, "cumulative cer": 23564.0, "total chars": 23564.0, "cer": 0.0, "cumulative cer over target length": 0.0, "wer over target length": 1.0, "cumulative wer": 4511.0, "total words": 4511.0, "wer": 0.0, "cumulative wer over target length": 0.0, "lr": 0.6, "batch size": 128, "n_channel": 13, "n_time": 1667, "dataset length": 128.0, "iteration": 1.0, "loss": 3.298595905303955, "cumulative loss": 3.298595905303955, "average loss": 3.298595905303955, "iteration time": 1.0997705459594727, "epoch time": 1.0997705459594727}
INFO:root:Target: they go wrong at the    Output:
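For what it's worth, a CER pinned at exactly 1.0 is consistent with the log above: every `Output:` is empty, and character error rate is edit distance divided by target length, so an empty hypothesis always scores 1.0 regardless of the target. This usually means the CTC decoder is collapsing to all blanks. A minimal sketch of the arithmetic (plain Levenshtein distance, not the pipeline's actual metric code):

```python
def edit_distance(ref, hyp):
    # Levenshtein distance via two-row dynamic programming.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,          # insertion
                           prev[j - 1] + (r != h))) # substitution
        prev = cur
    return prev[-1]

target = "the freshmen are sim"
# An empty greedy-decode output needs len(target) insertions,
# so CER = len(target) / len(target) = 1.0 for every utterance.
print(edit_distance(target, "") / len(target))  # 1.0
```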

Expected behavior

Normal training: loss decreases and CER drops below 1.0 as the model learns, instead of the decoder emitting empty outputs for every utterance.

Environment


Collecting environment information...
PyTorch version: 1.7.0
Is debug build: True
CUDA used to build PyTorch: 11.0
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.1 LTS (x86_64)
GCC version: (Ubuntu 8.4.0-3ubuntu2) 8.4.0
Clang version: Could not collect
CMake version: version 3.18.2

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 10.1.243
GPU models and configuration:
GPU 0: GeForce RTX 2080 Ti
GPU 1: GeForce RTX 2080 Ti

Nvidia driver version: 450.80.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] torch==1.7.0
[pip3] torchaudio==commit 5e54c77
[pip3] torchvision==0.8.1
[conda] blas                      1.0                         mkl
[conda] cudatoolkit               11.0.221             h6bb024c_0
[conda] magma-cuda102             2.5.2                         1    pytorch
[conda] mkl                       2020.2                      256
[conda] mkl-include               2020.2                      256
[conda] mkl-service               2.3.0            py38he904b0f_0
[conda] mkl_fft                   1.2.0            py38h23d657b_0
[conda] mkl_random                1.1.1            py38h0573a6f_0
[conda] numpy                     1.19.1           py38hbc911f0_0
[conda] numpy-base                1.19.1           py38hfa32c7d_0
[conda] pytorch                   1.7.0           py3.8_cuda11.0.221_cudnn8.0.3_0    pytorch
[conda] torchaudio                commit 5e54c77
[conda] torchvision               0.8.1                py38_cu110    pytorch
