Description
🚀 Feature
Add a feature equivalent to Kaldi's compute-kaldi-pitch-feats.
Motivation
From #679 (comment)
We found that the pitch feature always improved the performance for several tonal languages (e.g., Chinese), and did not degrade the performance for the other languages.
So, espnet1 decided to use log Mel filterbank + pitch features as default.
However, the pitch feature extraction is rather complicated, and we had some difficulties in making this pitch feature extraction fully written by torch functions.
So, espnet2 decided to only use log Mel filterbank features, instead.
We still observe a slight degradation of the ASR performance, but that can be mitigated by some tuning.
We're now moving to espnet2, so we don't need it in the long term, but it would probably be quite beneficial in the short term, or for people who keep using espnet1.