more efficient resample module #908

@small-yellow-duck

The Kaldi resampling function that torchaudio currently uses has some inefficient for loops and padding steps. I've put together a more efficient module and evaluated its performance in this notebook:
https://www.kaggle.com/smallyellowduck/fast-audio-resampling-layer-in-pytorch (the code is in the notebook)

Edit: here are two separate comparisons of the resampling time, excluding the file load time.
Comparison 1: 'kaiser_best' setting in librosa vs 'kaiser_best' setting in the efficient pytorch resampler (should be the same setup)

librosa: 51 s
efficient pytorch resampler: 9 s

Comparison 2: default setting in torchaudio vs window='hann', num_zeros=6 in the efficient pytorch resampler (should be the same setup)

torchaudio: 10 s
efficient pytorch resampler: 1 s

The performance improvement is most substantial when the input and output sample rates are not whole-number multiples of each other.
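To make "not whole-number multiples" concrete, here is a minimal sketch (not taken from the notebook) that reduces a pair of sample rates by their greatest common divisor. In a textbook polyphase resampler, the reduced output rate is the number of distinct fractional offsets, and so the number of distinct filters, needed per period. The helper names `reduced_ratio` and `num_filter_phases` are hypothetical, and whether this count explains the timings above is an assumption.

```python
from math import gcd
from typing import Tuple


def reduced_ratio(input_sr: int, output_sr: int) -> Tuple[int, int]:
    """Return (input_sr, output_sr) divided by their greatest common divisor."""
    g = gcd(input_sr, output_sr)
    return input_sr // g, output_sr // g


def num_filter_phases(input_sr: int, output_sr: int) -> int:
    """Number of distinct fractional offsets in a textbook polyphase resampler.

    This is the reduced output rate (the upsampling factor L in an L/M scheme).
    """
    return reduced_ratio(input_sr, output_sr)[1]


if __name__ == "__main__":
    for in_sr, out_sr in [(48000, 16000), (44100, 16000), (16000, 48000)]:
        print(in_sr, out_sr, reduced_ratio(in_sr, out_sr), num_filter_phases(in_sr, out_sr))
```

With 48000 -> 16000 the ratio reduces to 3:1 and a single filter phase is enough, whereas 44100 -> 16000 reduces to 441:160 and needs 160 phases.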

I think it would be good for torchaudio to switch to the more efficient resample module.

Before I make a PR, does anyone have feedback on what the API for the module should look like? I have largely tried to follow the API of the resample method in librosa. Any other comments? @vincentqb

    def __init__(self,
                 input_sr, output_sr, dtype,
                 num_zeros=64, cutoff_ratio=0.95, filter='kaiser', beta=14.0):
        """
        This creates an object that can apply a symmetric FIR filter
        based on torch.nn.functional.conv1d.

        Args:
          input_sr:  The input sampling rate, AS AN INTEGER.
          output_sr:  The output sampling rate, AS AN INTEGER.
          dtype:  The torch dtype to use for computations
          num_zeros: The number of zeros per side in the (sinc*hanning-window)
              filter function.  More is more accurate, but 64 is already quite a lot.
          cutoff_ratio: The filter rolloff point as a fraction of the Nyquist freq.
          filter: one of ['kaiser', 'kaiser_best', 'kaiser_fast', 'hann']
          beta: parameter for 'kaiser' filter
        """
        # The docstring must be the first statement in the function body
        # to be picked up as __doc__, so the base-class init comes after it.
        super().__init__()  # init the base class
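To illustrate what `num_zeros` and `cutoff_ratio` control, here is a rough pure-Python sketch of a Hann-windowed sinc low-pass filter. It is not the notebook's implementation: the function names are hypothetical, the default `num_zeros=6` is borrowed from Comparison 2 only, and taking the cutoff relative to the lower of the two Nyquist frequencies is an assumption.

```python
import math


def lowpass_cutoff(input_sr, output_sr, cutoff_ratio=0.95):
    """Cutoff in cycles per input sample.

    Assumption: the cutoff is cutoff_ratio times the lower of the two
    Nyquist frequencies, expressed relative to the input sample rate.
    """
    return cutoff_ratio * 0.5 * min(input_sr, output_sr) / input_sr


def hann_windowed_sinc(t, cutoff, num_zeros=6):
    """Evaluate the filter at time t, measured in input samples."""
    # sinc(2 * cutoff * t) crosses zero every 1 / (2 * cutoff) samples, so
    # keeping num_zeros zero crossings per side gives this half-width.
    half_width = num_zeros / (2.0 * cutoff)
    if abs(t) >= half_width:
        return 0.0
    window = 0.5 * (1.0 + math.cos(math.pi * t / half_width))  # Hann window
    x = 2.0 * cutoff * t
    sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
    return 2.0 * cutoff * sinc * window


def filter_taps(input_sr, output_sr, num_zeros=6, cutoff_ratio=0.95):
    """Sample the filter at integer offsets; the result is symmetric."""
    cutoff = lowpass_cutoff(input_sr, output_sr, cutoff_ratio)
    n_max = math.floor(num_zeros / (2.0 * cutoff))
    return [hann_windowed_sinc(n, cutoff, num_zeros) for n in range(-n_max, n_max + 1)]


if __name__ == "__main__":
    taps = filter_taps(44100, 16000)
    print(len(taps), sum(taps))
```

A larger `num_zeros` widens the window, which lengthens the filter and sharpens the rolloff at the cost of more multiply-adds per output sample.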
