Closed
Which new datasets should we offer and prioritize in torchaudio?
I want to follow up on #31 and a few of the recent PRs. Instead of aiming for an exhaustive list of datasets, we should focus on a few important, common, and representative datasets that can serve as templates, so users can easily implement datasets of their own choosing. Every dataset we include should already be free, accessible online, and in common use, with a license that permits linking to it.
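To make the "template" idea concrete, here is a minimal sketch of the kind of map-style dataset a user might write by following an existing one. This is not torchaudio's implementation: the class name `FolderAudioDataset` and the `root/<label>/<clip>.wav` layout are assumptions made for illustration, and it uses only the standard-library `wave` module so it runs without extra dependencies. In practice one would likely subclass `torch.utils.data.Dataset` and load audio with torchaudio instead.

```python
import array
import sys
import wave
from pathlib import Path


class FolderAudioDataset:
    """Hypothetical map-style dataset over files laid out as root/<label>/<clip>.wav."""

    def __init__(self, root):
        # Sort so that indices are stable across runs.
        self._paths = sorted(Path(root).glob("*/*.wav"))

    def __len__(self):
        return len(self._paths)

    def __getitem__(self, n):
        path = self._paths[n]
        with wave.open(str(path), "rb") as f:
            if f.getsampwidth() != 2:
                raise ValueError("this sketch only handles 16-bit PCM")
            sample_rate = f.getframerate()
            # Samples stay interleaved if the file has more than one channel.
            samples = array.array("h", f.readframes(f.getnframes()))
        if sys.byteorder == "big":
            samples.byteswap()  # WAV data is little-endian
        # The label is taken from the parent directory name.
        return samples, sample_rate, path.parent.name
```

Only `__len__` and `__getitem__` are needed for a map-style dataset, so adapting this to another corpus mostly means changing how paths are discovered and how the label is derived.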
torchaudio currently has:
- commonvoice
- librispeech
- ljspeech
- speechcommands
- vctk
- yesno
Current open proposals:
- Free Universal Sound Separation (FUSS): Add Free Universal Sound Separation (FUSS) Dataset #534
- CMU_ARCTIC: Add CMU_ARCTIC dataset #512
cpuhrsch