Description
🚀 Feature
Similar to RandomCrop in torchvision but implemented for audio.
Motivation
Models often require a fixed-size input, while the audio in a dataset varies in length. One remedy is to randomly crop each clip to a fixed length so that it can be fed to the model. I would have used RandomCrop from torchvision, but it only accepts a PIL Image, not a Tensor. For audio, the crop should apply along the time dimension only, leaving the channel dimension unchanged.
Pitch
The transform operates on an audio tensor. Given a requested audio length, it returns a randomly chosen segment of that length with the same number of channels. If the audio is shorter than the requested length, it is padded along the time dimension instead.
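As a rough illustration of the behavior described above, here is a minimal sketch, not the code from the PR. The function name `random_crop`, the `(channels, time)` layout, and the choice to right-pad with zeros are all assumptions made for this example.

```python
import torch
import torch.nn.functional as F


def random_crop(waveform: torch.Tensor, length: int) -> torch.Tensor:
    """Hypothetical helper: crop or zero-pad `waveform` along time to `length` samples.

    Assumes the last dimension is time, e.g. a (channels, time) tensor.
    """
    num_samples = waveform.size(-1)
    if num_samples < length:
        # Too short: right-pad the time axis with zeros (one possible padding choice).
        return F.pad(waveform, (0, length - num_samples))
    # Pick a random start so the window [start, start + length) fits in the signal.
    start = int(torch.randint(0, num_samples - length + 1, (1,)))
    # Slice only the last (time) dimension; channels are left untouched.
    return waveform[..., start:start + length]


# Usage: a 2-channel clip of 10 samples, cropped to 4 and padded to 16.
clip = torch.arange(20.0).reshape(2, 10)
cropped = random_crop(clip, 4)   # shape (2, 4)
padded = random_crop(clip, 16)   # shape (2, 16)
```

The same start index is used for every channel, so the channels stay aligned in time. Other padding strategies (for example, padding on both sides or repeating the signal) would fit the pitch equally well.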
Additional context
This is my first contribution, so I was not aware that I needed to create an issue before opening a PR (#403). I apologize for that 😕