Add wav2vec2 ASR Spanish pretrained model from voxpopuli #1924
def _get_es_labels():
Not a big deal, since this is only used internally, but since we are adding several languages: what do you think about having a dictionary mapping lang -> symbols and a generic function _get_labels(lang), instead of a separate function for each added language?
tl;dr: sure, we can do that.
My original intention was for each pipeline to have its own label object instance (meaning the id function would report a different value for each get_labels). The labels would have been global objects, and with separate instances, modifying one would not affect the others.
However, now that get_labels returns a tuple, which is immutable and is constructed at runtime, this is no longer relevant, so we can do that.
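The concern described above can be illustrated with a minimal sketch. All names here (`_GLOBAL_LABELS`, `get_labels_mutable`, `get_labels_immutable`, `try_modify`) are hypothetical and are not taken from the code under review; the three-symbol label set is a placeholder.

```python
# Hypothetical illustration only; these names are not part of the reviewed code.
_GLOBAL_LABELS = ["|", "e", "s"]  # a module-level, mutable list


def get_labels_mutable():
    # Every caller receives the same list object, so an in-place
    # modification by one caller is visible to all the others.
    return _GLOBAL_LABELS


def get_labels_immutable():
    # A tuple cannot be modified in place, so sharing it is harmless.
    return ("|", "e", "s")


def try_modify(labels):
    """Return True if in-place modification succeeded, False if it was rejected."""
    try:
        labels[0] = "?"
        return True
    except TypeError:
        return False
```

With a tuple return value, `try_modify` fails with a `TypeError` internally and the labels stay intact, which is why separate per-pipeline functions are no longer needed for isolation.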
Hmm, so I tried adopting the idea, and I think keeping them as separate functions is better for readability.
The following covers only French, but each language will take around 30 lines or more, so in the end the dictionary will be around 13 * 30 lines.
def _get_voxpopuli_labels(lang):
    labels = {
        'fr': (
            "|",
            "e",
            "s",
            "n",
            "i",
            "t",
            "r",
            "a",
            "o",
            "u",
            "l",
            "d",
            "c",
            "p",
            "m",
            "é",
            "v",
            "q",
            "f",
            "g",
            "b",
            "h",
            "x",
            "à",
            "j",
            "è",
            "y",
            "ê",
            "z",
            "ô",
            "k",
            "ç",
            "œ",
            "û",
            "ù",
            "î",
            "â",
            "w",
            "ï",
            "ë",
            "ü",
            "æ",
        )
    }
    return labels[lang]
I realized that the last dimension of the ASR output label is What do you think?
Add Spanish ASR from Voxpopuli.