Closed
In #1108 and #1141, I am adding in-memory decoding and encoding. This allows us to apply codecs to audio Tensors, as follows:
import io
import torchaudio

fileobj = io.BytesIO()
torchaudio.save(fileobj, waveform, …, format="mp3", compression=9)
fileobj.seek(0)
waveform, _ = torchaudio.load(fileobj)
# Note: depending on the format, the size of the tensor could be different,
# so some post-processing might be necessary.

This gives practically the same result as

sox input.wav -C 9 temp.mp3
sox temp.mp3 output.wav

I am thinking of adding this codec application as a torchaudio feature. Before I start working on the API specification and engineering roadmap, I would like to hear from the community what kind of feature would be helpful for your use cases.
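To make the round-trip pattern and the post-processing note above concrete, here is a minimal sketch that uses only Python's standard library rather than torchaudio. It assumes mono audio stored as a list of floats in [-1, 1] and uses 16-bit PCM WAV as a stand-in codec, so it does not reproduce MP3 behaviour. The names `pcm16_round_trip` and `fix_length` are hypothetical helpers for illustration, not part of any torchaudio API.

```python
import io
import struct
import wave


def pcm16_round_trip(samples, sample_rate):
    """Encode mono float samples as 16-bit PCM WAV in memory, then decode them."""
    # Clip to [-1, 1] and quantize to signed 16-bit integers (the lossy step).
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]

    # Encode into an in-memory file-like object instead of a file on disk.
    fileobj = io.BytesIO()
    with wave.open(fileobj, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16 bits
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack(f"<{len(ints)}h", *ints))

    # Rewind before decoding, as in the torchaudio snippet above.
    fileobj.seek(0)
    with wave.open(fileobj, "rb") as reader:
        raw = reader.readframes(reader.getnframes())
        decoded_rate = reader.getframerate()

    decoded = [v / 32767 for v in struct.unpack(f"<{len(raw) // 2}h", raw)]
    return decoded, decoded_rate


def fix_length(samples, num_frames):
    """Trim or zero-pad a list of samples to exactly num_frames."""
    trimmed = samples[:num_frames]
    return trimmed + [0.0] * (num_frames - len(trimmed))
```

PCM WAV preserves the number of frames, so `fix_length` is a no-op here. It is included because formats that add padding may return a different length, in which case trimming or zero-padding back to the original length is one possible post-processing step.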
If you would like to use codecs for data augmentation, or if you know of papers that use this kind of technique, please leave a comment.
cc @mravanelli @sw005320 @pzelasko @faroit @mpariente @danpovey