
Conversation

@mthrok (Contributor) commented Oct 22, 2021

Add Spanish ASR from Voxpopuli.

@mthrok mthrok force-pushed the pretrain-es branch 2 times, most recently from 0938d5e to 66c98e4 Compare October 23, 2021 01:04
@mthrok mthrok marked this pull request as ready for review October 23, 2021 01:05
)


def _get_es_labels():
Contributor

Not a big deal since this is used internally only, but since we are adding several languages, thoughts on having a dictionary mapping lang -> symbols and using a generic function `_get_labels(lang)`, instead of having a different function for each added language?

Contributor Author

tl;dr: sure, we can do that.

My original intention here was to have a separate label object instance for each pipeline (meaning `id()` would report a different value for each `get_labels` result), because if the labels were a shared global object, modifying one would affect the others.

However, now that `get_labels` returns a tuple, which is immutable and constructed at runtime, that concern is no longer relevant, so we can do that.

Contributor Author

Hmm, so I tried adopting the idea, and I think having them as separate functions is better for readability.

The following is only for French, but each language will have around 30 lines, so in the end the dictionary will be around 13 * 30 lines.

def _get_voxpopuli_labels(lang):
    labels = {
        'fr': (
            "|",
            "e",
            "s",
            "n",
            "i",
            "t",
            "r",
            "a",
            "o",
            "u",
            "l",
            "d",
            "c",
            "p",
            "m",
            "é",
            "v",
            "q",
            "f",
            "g",
            "b",
            "h",
            "x",
            "à",
            "j",
            "è",
            "y",
            "ê",
            "z",
            "ô",
            "k",
            "ç",
            "œ",
            "û",
            "ù",
            "î",
            "â",
            "w",
            "ï",
            "ë",
            "ü",
            "æ",
        )
    }
    return labels[lang]

@mthrok (Contributor Author) commented Oct 25, 2021

@nateanl

I realized that the last dimension of the ASR output labels is the character "1", which appears only once in the original training dataset. I believe this is a mistake of sorts, and we can exclude that dimension from the model, as in the case of `<pad>` (#1914).

What do you think?

cat dict.es_char.txt
| 1518049
e 1077702
a 877225
o 694767
s 619500
n 565808
r 519910
i 513128
l 393015
d 382689
c 359006
t 358015
u 323059
p 236137
m 235987
b 88865
q 85869
y 78375
g 77028
v 67765
h 62163
ó 58918
f 53486
í 36803
á 30274
j 25768
z 22749
ñ 19116
é 16605
x 11350
ú 7163
k 1741
w 484
ü 256
1 1        <<--
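A minimal sketch of how such a one-off symbol could be dropped when building the label set from a fairseq-style `dict.*.txt` file (one `symbol count` pair per line). The helper name `load_labels` and the `min_count` cutoff are assumptions for illustration, not part of this PR:

```python
def load_labels(dict_path, min_count=2):
    """Parse a fairseq-style dict file and keep only symbols whose
    corpus count is at least `min_count` (dropping e.g. the "1 1" entry)."""
    labels = []
    with open(dict_path, encoding="utf-8") as f:
        for line in f:
            # Split from the right so the symbol itself may contain spaces.
            symbol, count = line.rsplit(maxsplit=1)
            if int(count) >= min_count:
                labels.append(symbol)
    return tuple(labels)
```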

@mthrok mthrok merged commit 3a59931 into pytorch:main Oct 27, 2021
@mthrok mthrok deleted the pretrain-es branch October 27, 2021 02:06
mthrok pushed a commit to mthrok/audio that referenced this pull request Dec 13, 2022

4 participants