
Conversation


@mthrok mthrok commented Feb 5, 2021

This PR adds the Kaldi pitch feature, detailed in "A pitch extraction algorithm tuned for automatic speech recognition".

The interface is mostly the same as the compute-kaldi-pitch-feats CLI. Batch support is added via the at::parallel_for function.

Because the function binds a custom-built libkaldi, it only supports CPU (and float32) at the moment.

About the custom-built Kaldi

Since Kaldi is a large library, I chose to use only a subset of it by adding a custom build process. In addition, to reduce dependencies and simplify the build, I reused the BLAS package that PyTorch uses. For this, I added a custom implementation of Kaldi's matrix library interface, so the algorithm runs on the torch::Tensor class. However, some parts of the algorithm require direct access to memory, so the resulting function is not differentiable. The resulting code is also very slow (~60x slower) at the moment due to the overhead of slicing operations: Kaldi's feature implementations work element-wise, while PyTorch operates faster when operations are vectorized.

Supersedes #1063

@mthrok mthrok added this to the v0.8 milestone Feb 5, 2021
@mthrok mthrok force-pushed the rebase-pitch-feature branch 9 times, most recently from d9ebbd3 to 80f234d Compare February 8, 2021 19:24
@mthrok mthrok force-pushed the rebase-pitch-feature branch 4 times, most recently from ae26d83 to e1ca9ee Compare February 9, 2021 15:52
auto mat = M.tensor_;
if (trans == kNoTrans) {
tensor_ =
beta * tensor_ + torch::diag(torch::mm(mat, mat.transpose(1, 0)));

If this is called in a tight loop with small Tensors, you might fare better inlining the operation and using the underlying tensor data pointer instead of calling into torch::mm repeatedly.

Also note that torch::mm does parallelization, but at::parallel_for disables nested parallelism for OpenMP (to avoid oversubscription), which is the default threadpool for PyTorch.

@mthrok mthrok marked this pull request as ready for review February 9, 2021 22:03
@mthrok mthrok force-pushed the rebase-pitch-feature branch from e1ca9ee to b158e7c Compare February 9, 2021 22:22
@mthrok mthrok merged commit 7ee1c46 into pytorch:master Feb 9, 2021
@mthrok mthrok deleted the rebase-pitch-feature branch February 9, 2021 23:46

3 participants