Closed
Description
Why do torchaudio.compliance.kaldi.fbank and torchaudio.compliance.kaldi.spectrogram have such a large default dither parameter (=1.0)? Very often it just fills the whole output with noise.
It's common to use a dither close to 0, e.g. 0.00001 in QuartzNet and Jasper, which are near-SOTA ASR models (https://github.com/NVIDIA/NeMo/blob/master/examples/asr/configs/quartznet15x5.yaml).
I want to note that even the torchaudio tutorial uses dither = 0.0: https://pytorch.org/tutorials/beginner/audio_preprocessing_tutorial.html.
Also see this issue and how it was resolved: #157
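To illustrate why a dither of 1.0 can swamp the output, here is a rough sketch in plain Python. It does not call torchaudio; `snr_db` is a hypothetical helper written only for this comparison. It assumes dither is applied as Gaussian noise scaled by the dither value and added to the samples, and that the waveform is normalized to [-1, 1] (as opposed to a 16-bit integer sample range, where noise of that size would be negligible).

```python
import math
import random


def snr_db(amplitude, dither, num_samples=16000, freq_hz=440.0,
           sample_rate=16000, seed=0):
    """Signal-to-noise ratio (dB) of a sine wave after adding Gaussian dither.

    Hypothetical helper for illustration only; not part of torchaudio.
    Assumes dither is added as `dither * N(0, 1)` to every sample.
    """
    rng = random.Random(seed)
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        sample = amplitude * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
        noise = dither * rng.gauss(0.0, 1.0)
        signal_power += sample * sample
        noise_power += noise * noise
    if noise_power == 0.0:
        return math.inf  # no dither at all
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    # Full-scale sine in [-1, 1] with dither = 1.0: noise is stronger than the signal.
    print("normalized input, dither=1.0 :", snr_db(1.0, 1.0))
    # Same dither on a 16-bit integer-scale sine: noise is negligible.
    print("int16-scale input, dither=1.0:", snr_db(32767.0, 1.0))
    # Normalized input with a dither like the one in the QuartzNet config.
    print("normalized input, dither=1e-5:", snr_db(1.0, 1e-5))
```

Under these assumptions the first case has a negative SNR (the noise carries more power than a full-scale sine), while the other two keep the noise far below the signal. That matches what I see in practice: with normalized input, dither = 1.0 mostly produces noise.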