[Feature] RandomCrop for Audio #416

@haideraltahan

🚀 Feature

Similar to RandomCrop in torchvision but implemented for audio.

Motivation

Models often expect a fixed-size input, but audio datasets contain clips of varying length. One way to handle this is to randomly crop each clip to a fixed length so that it can be fed to the model. I would have used RandomCrop from torchvision, but it only accepts a PIL Image, not a Tensor. For audio, the crop should apply along the time dimension only and leave the channel dimension unchanged.

Pitch

The transform takes an audio tensor and a requested length, and returns a randomly selected segment of that length with the same number of channels. If the audio is shorter than the requested length, it is padded along the time dimension instead.
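
To make the proposal concrete, here is a minimal sketch of what such a transform could look like. It is not the code from the PR: the class name `RandomCrop`, the `length` argument, the `(channels, time)` layout, and the choice of zero-padding on the right are all assumptions made for illustration, since the pitch does not specify where or how padding should be applied.

```python
import torch
import torch.nn.functional as F


class RandomCrop(torch.nn.Module):
    """Hypothetical transform: crop or zero-pad a (..., time) waveform to a fixed length."""

    def __init__(self, length: int) -> None:
        super().__init__()
        if length <= 0:
            raise ValueError("length must be positive")
        self.length = length  # requested number of frames along the time axis

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        num_frames = waveform.shape[-1]
        if num_frames < self.length:
            # Too short: zero-pad on the right, along the time axis only
            # (assumed policy; the issue does not say which side to pad).
            return F.pad(waveform, (0, self.length - num_frames))
        # Long enough: pick a random start such that the window fits entirely.
        start = int(torch.randint(0, num_frames - self.length + 1, (1,)))
        # Slice only the last (time) dimension; channels are untouched.
        return waveform[..., start:start + self.length]
```

With this sketch, `RandomCrop(16000)` applied to a tensor of shape `(2, 48000)` would return a tensor of shape `(2, 16000)`, and a shorter clip would come back padded to the same shape.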

Additional context

This is my first contribution, so I was not aware that I needed to open an issue before submitting a PR (#403). I apologize for that 😕
