Issue 764: Convert mp3 to wav #773
This reverts commit a9e89d1.
I tried to avoid converting to The temp file that was written out had these properties:
@mthrok I'm not sure what's wrong with the binary macos conda. It seems to be related to some specification mismatch that my code should not have affected.
mthrok
left a comment
Looks good for the most part. Left some comments.
test/test_sox_compatibility.py
Outdated
output_waveform = F.bass_biquad(waveform, sample_rate, gain, central_freq, q)

self.assertEqual(output_waveform, sox_output_waveform, atol=1.5e-4, rtol=1e-5)
self.assertEqual(output_waveform, sox_output_waveform, atol=1e-3, rtol=1e-4)
This is due to an edge effect, and the trick to avoid loosening the tolerance here is to scale the intensity of the generated white noise (say, x0.9).
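A minimal sketch of the scaling idea, assuming the mismatch comes from samples that sit close to full scale. It uses only the standard library rather than the test suite's own noise helper, and the function name `scaled_white_noise` is hypothetical:

```python
import random


def scaled_white_noise(num_samples, scale=0.9, seed=0):
    """Return uniform white noise as floats in [-scale, scale].

    With scale < 1.0 no sample touches the +/-1.0 limits, which leaves
    some headroom before a filter with gain is applied.
    """
    rng = random.Random(seed)  # fixed seed keeps the test input reproducible
    return [scale * rng.uniform(-1.0, 1.0) for _ in range(num_samples)]


noise = scaled_white_noise(8000)
peak = max(abs(x) for x in noise)
print(f"peak amplitude: {peak:.4f}")
```

Whether x0.9 leaves enough headroom depends on the gain of the effect under test, so the factor may need tuning per effect.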
Tried this with bass and still got a few errors exceeding 1e-3.
E AssertionError: False is not true : Tensors failed to compare as equal! With rtol=0.0001 and atol=0.001, found 136 element(s) (out of 220500) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.0013254880905151367 (0.8861521482467651 vs. 0.88482666015625), which occurred at index (0, 47224).
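To see why this element fails, the check can be reproduced by hand. The sketch below assumes the usual allclose-style rule, `|actual - expected| <= atol + rtol * |expected|`, and assumes the second value in the message is the reference; `within_tolerance` is a hypothetical helper, not part of the test suite:

```python
def within_tolerance(actual, expected, atol, rtol):
    """Allclose-style check: absolute margin plus a margin relative to expected."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Values copied from the assertion message above.
actual = 0.8861521482467651
expected = 0.88482666015625
atol, rtol = 1e-3, 1e-4

diff = abs(actual - expected)
allowed = atol + rtol * abs(expected)
print(f"difference: {diff:.6e}, allowed: {allowed:.6e}")
print("passes:", within_tolerance(actual, expected, atol, rtol))
```

Under these assumptions the allowed margin at this index is about 1.09e-3, so the observed difference of about 1.33e-3 exceeds it even with the relaxed thresholds.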
test/test_sox_compatibility.py
Outdated
common_utils.TempDirMixin.setUp(self)
common_utils.TorchaudioTestCase.setUp(self)

NOISE_SAMPLE_RATE = 44100
Can you use 8000 Hz? Though 44100 Hz is common in audio, many speech-related tasks use only 16 kHz or 8 kHz, and 8 kHz will reduce test run time.
SoX's deemph effect seems to work only at 44.1 kHz or 48 kHz:
deemph: sample rate must be 44100 (audio-CD) or 48000 (DAT)
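One way to reconcile the two constraints is to default to 8000 Hz and override the rate only for effects that need it. This is a sketch, not code from the PR: the names `DEFAULT_SAMPLE_RATE`, `EFFECT_SAMPLE_RATES`, and `sample_rate_for` are hypothetical, and only the deemph restriction is taken from the error message above.

```python
# Hypothetical default for tests, following the 8000 Hz suggestion.
DEFAULT_SAMPLE_RATE = 8000

# Hypothetical override table: effects that reject the default rate.
# sox reports that deemph needs 44100 (audio-CD) or 48000 (DAT).
EFFECT_SAMPLE_RATES = {
    "deemph": 44100,
}


def sample_rate_for(effect):
    """Return the sample rate a test for the given effect should use."""
    return EFFECT_SAMPLE_RATES.get(effect, DEFAULT_SAMPLE_RATE)


for name in ("bass", "deemph"):
    print(name, sample_rate_for(name))
```

Keeping the exceptions in one table leaves most tests fast while making the reason for each higher rate visible.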
I am not sure where you got
Yes, that happens when PyTorch's binary build fails. You can ignore it.
Thanks!
@mthrok Check out the latest. I think this approach captures what you want but it introduces quite a number of instabilities and we would need to relax atol thresholds for 4-5 tests. Why do you think that is? Do you think there's some issue with the float -> int quantization that differs?
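To gauge how much float -> int quantization could contribute, its error can be bounded directly. The sketch below assumes 16-bit PCM with round-to-nearest, which is an assumption about the file format and not something the PR states; `quantize_16bit` is a hypothetical helper:

```python
STEP = 1 / 32768  # size of one 16-bit PCM step on a [-1.0, 1.0) scale


def quantize_16bit(x):
    """Round a float sample to the nearest 16-bit step and map it back to float."""
    q = round(x * 32768)
    q = max(-32768, min(32767, q))  # clamp to the int16 range
    return q / 32768


# Sample value taken from the earlier assertion message in this thread.
x = 0.8861521482467651
err = abs(quantize_16bit(x) - x)
print(f"step: {STEP:.3e}, rounding error for x: {err:.3e}")

# The other value in that message lies exactly on the 16-bit grid,
# which is consistent with (though not proof of) 16-bit data on that side.
print(0.88482666015625 * 32768)
```

With round-to-nearest the error per sample is at most half a step, about 1.5e-5, and a truncating conversion would stay under one full step, about 3.1e-5. Comparing those bounds with the atol values under discussion gives a rough sense of whether quantization alone could account for a given mismatch.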
Codecov Report
@@           Coverage Diff           @@
##           master     #773   +/-   ##
=======================================
  Coverage   89.53%   89.53%
=======================================
  Files          32       32
  Lines        2617     2617
=======================================
  Hits         2343     2343
  Misses        274      274
Thanks!
Co-authored-by: Aleksandr Panchul (CSI Interfusion Inc) <[email protected]>
Convert mp3 to wav or on the fly generation.