[WIP] apply codec-based data augmentation #1194
|
Hi @mthrok, I have a couple of questions:
For the test, I have done something like this:
class CodecTestBase:
    @parameterized.expand(list(itertools.product(
        ["mp3", "wav", "flac"],
        [96, 128, 160, 192, 224, 256, 320],
    )), name_func=name_func)
    def test_codec(self, format, compression):
        torch.random.manual_seed(42)
        waveform = torch.rand(2, 44100 * 1)
        augmented_waveform = F.apply_codec(waveform, format, channels_first=True, compression=compression)
        # TODO: maybe check the channels (number of frames can change depending on format like mp3)
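One thing to watch in the grid above: the values 96 to 320 look like MP3 bitrates, whereas, as far as I understand, sox interprets the compression argument differently per format (a level from 0 to 8 for FLAC, and nothing for WAV). A minimal sketch of a per-format grid follows; `COMPRESSION_GRID` and `codec_cases` are hypothetical names, and the FLAC and WAV values are assumptions for illustration, not something taken from this PR.

```python
import itertools

# Hypothetical per-format compression values (illustrative only).
COMPRESSION_GRID = {
    "mp3": [96, 128, 160, 192, 224, 256, 320],  # bitrates in kbps, as in the test above
    "flac": list(range(9)),                      # assumed FLAC compression levels 0..8
    "wav": [None],                               # assumed: no compression setting for WAV
}


def codec_cases(grid=COMPRESSION_GRID):
    """Return (format, compression) pairs, pairing each format only with its own values."""
    return list(itertools.chain.from_iterable(
        itertools.product([fmt], values) for fmt, values in grid.items()
    ))


cases = codec_cases()
```

The resulting list could be passed to `parameterized.expand` in place of the full cross product, so that FLAC and WAV are never combined with MP3 bitrates.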
|
Hi @AzizCode92 Thanks for working on this. The implementation looks good so far.
By the way, looking at the docstring you added, I am wondering if we should make this feature require |
|
Hi @mthrok, Thank you for your feedback.
Hmm, do the two parameters |
|
I am now working on the unit test. For the smoke test, my idea is to save the augmented data, load it again, and assert that the sample rate and the loaded data match the expected values. Is this the proper way to do it? Thanks |
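The save, load, and assert pattern described above can be sketched roughly as follows. This is only an illustration of the round trip: it uses Python's standard `wave` module instead of torchaudio, and `save_wav`, `load_wav`, and `roundtrip` are hypothetical helpers, not functions from this PR.

```python
import os
import struct
import tempfile
import wave


def save_wav(path, samples, sample_rate, num_channels=1):
    """Write interleaved 16-bit PCM integer samples to a WAV file."""
    with wave.open(path, "wb") as f:
        f.setnchannels(num_channels)
        f.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        f.setframerate(sample_rate)
        f.writeframes(struct.pack(f"<{len(samples)}h", *samples))


def load_wav(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate, num_channels)."""
    with wave.open(path, "rb") as f:
        raw = f.readframes(f.getnframes())
        samples = list(struct.unpack(f"<{len(raw) // 2}h", raw))
        return samples, f.getframerate(), f.getnchannels()


def roundtrip(samples, sample_rate, num_channels=1):
    """Save the samples to a temporary file, load them back, and return the result."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "smoke.wav")
        save_wav(path, samples, sample_rate, num_channels)
        return load_wav(path)


expected = [0, 1000, -1000, 32767, -32768, 5]
found, rate, channels = roundtrip(expected, 8000)
assert rate == 8000
assert found == expected
```

Exact equality only makes sense for lossless formats such as WAV or FLAC. For a lossy codec such as MP3, the loaded data and even the number of frames can differ, so a test would more plausibly check the sample rate and the channel count.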
|
Hi @AzizCode92
I now think that we can restrict the function to the "sox_io" backend only. Although the code itself is compatible with the "soundfile" backend, the lack of support for
My view is that since |
|
BTW, regarding the sample rate, I realized that some codecs require a specific sample rate (for example, AMR NB requires 8 kHz). So we will need to incorporate this into the design later, as separate work from this PR. For example, if I convert a WAV file into AMR NB with |
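One possible shape for the sample-rate constraint mentioned above is a lookup table that is consulted before encoding. This is a sketch under assumptions: `REQUIRED_SAMPLE_RATES` and `check_sample_rate` are hypothetical names, the format keys are made up for illustration, and only the AMR NB entry comes from this thread.

```python
# Hypothetical table of formats that only accept certain sample rates (in Hz).
# AMR NB at 8 kHz is taken from the comment above; AMR WB at 16 kHz is added
# for illustration. The key spellings are assumptions, not library identifiers.
REQUIRED_SAMPLE_RATES = {
    "amr-nb": {8000},
    "amr-wb": {16000},
}


def check_sample_rate(fmt, sample_rate):
    """Return sample_rate unchanged, or raise ValueError if fmt does not accept it."""
    allowed = REQUIRED_SAMPLE_RATES.get(fmt.lower())
    if allowed is not None and sample_rate not in allowed:
        raise ValueError(
            f"format {fmt!r} requires a sample rate in {sorted(allowed)}, got {sample_rate}"
        )
    return sample_rate


check_sample_rate("wav", 44100)    # unconstrained format, passes through
check_sample_rate("amr-nb", 8000)  # matches the required rate
```

An alternative design would resample to the required rate instead of raising, but an explicit error keeps the behavior visible to the caller.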
f7c6a2e to b33c539
This PR addresses issue #1183.