🚀 Feature Request: Add Kaldi Pitch Feature #686

@mthrok

🚀 Feature

Add a feature that is equivalent to Kaldi's compute-kaldi-pitch-feats.

Motivation

From #679 (comment)

We found that the pitch feature always improved the performance for several tonal languages (e.g., Chinese), and did not degrade the performance for the other languages.
So, espnet1 decided to use log Mel filterbank + pitch features as the default.
However, the pitch feature extraction is rather complicated, and we had some difficulties in making this pitch feature extraction fully written by torch functions.
So, espnet2 decided to only use log Mel filterbank features, instead.
We still observe a slight degradation of the ASR performance, but that can be mitigated by some tuning.
We're now moving to espnet2, so we don't need it in the long term, but it would probably be quite beneficial in the short term, or for people who keep using espnet1.
