Roadmap ahead for torchaudio

Many exciting work items are planned for torchaudio.

* Provide support for large-scale training.
    * Support a large-scale training reference task using wav2vec on librivox, and offer a pre-trained version of the model.
    * Support the emergence of audio-specific transformer models by exploring which abstractions would be beneficial to provide.
* Extend support for speech recognition.
    * Investigate the addition of beam search, and a 4-gram language model, see [here](https://github.com/pytorch/audio/issues/913) and [here](https://github.com/pytorch/audio/issues/1146), to reduce the word error rate in the existing pipeline.
    * ✅ Support in-memory codec encoding and decoding, see [here](https://github.com/pytorch/audio/issues/1115), to support codec-based data augmentation.
    * ✅ Add the Kaldi pitch feature, see [here](https://github.com/pytorch/audio/pull/1063), that is used in the audio community.
    * Implement a prototype of a WFST-based ASR model, using [GTN](https://github.com/facebookresearch/gtn) or [K2](https://github.com/k2-fsa/k2), see [here](https://github.com/pytorch/audio/issues/1154).
    * Add RNN transducer loss, see [here](https://github.com/pytorch/audio/pull/1137) and [follow-up](https://github.com/pytorch/audio/issues/1240), to train RNN transducer models efficiently.
* Provide a high-performance data loading and media decoding experience.
    * Provide a fast audio I/O module, see [here](https://github.com/pytorch/audio/issues/1000).
    * Provide audio streaming abstractions with examples, see [here](https://github.com/pytorch/audio/issues/1072).
* Improve our codebase.
    * ✅ Create libtorchaudio by building the C++ extension outside of Python, see [here](https://github.com/pytorch/audio/issues/1154).
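
For reference, the word error rate (WER) that the beam search and language model item aims to reduce is commonly computed as the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The following is a minimal sketch of that metric in plain Python. The function name `word_error_rate` is hypothetical and is not a torchaudio API, and whitespace tokenization is an assumption made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words.

    Hypothetical helper for illustration only; not part of torchaudio.
    Assumes words are separated by whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        # Matching i reference words against an empty hypothesis costs i deletions.
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word present in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
    print(f"WER: {wer:.3f}")
```

Because insertions are counted, this value can exceed 1.0 when the hypothesis is much longer than the reference.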

The goal of torchaudio is to accelerate research through novel, production-ready building blocks. As such, we would love to hear feedback on the plan, so make sure to reach out to us, @mthrok and @vincentqb!

cc [internal](https://fb.workplace.com/groups/pytorch.dev/permalink/832555280656287/)

Roadmap ahead for torchaudio #1196

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Roadmap ahead for torchaudio #1196

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions