Closed
Description
🚀 Feature
torchaudio.transforms.SpectralCentroid
Motivation
This is a common audio feature, and librosa already provides it.
Pitch
A differentiable SpectralCentroid would be useful for audio neural networks.
Alternatives
Here is the librosa implementation.
Additional context
Here is a simplified implementation I have made, based upon the librosa implementation:
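import torch
import torchaudio

def spectral_centroid(y, sr, n_fft, hop_length=512):
    # Magnitude spectrogram, shape (..., 1 + n_fft // 2, n_frames)
    S = torchaudio.transforms.Spectrogram(n_fft=n_fft, hop_length=hop_length, power=1.0)(y)
    # Center frequency of each FFT bin in Hz, as a column so it broadcasts over frames
    # (stands in for librosa's fft_frequencies(sr=sr, n_fft=n_fft))
    freq = torch.linspace(0, float(sr) / 2, 1 + n_fft // 2,
                          dtype=S.dtype, device=S.device).reshape(-1, 1)

    def tl1norm(S):
        # L1-normalize each frame over the frequency axis. Use dim=-2 rather than 0
        # so a leading channel/batch dimension is not mistaken for frequency.
        return S / torch.sum(torch.abs(S), dim=-2, keepdim=True)

    # Magnitude-weighted mean frequency per frame, shape (..., 1, n_frames)
    return torch.sum(freq * tl1norm(S), dim=-2, keepdim=True)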
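
For discussion, here is a rough sketch of what a transform-style version could look like. The class name `SpectralCentroidSketch`, its constructor arguments, and the defaults are all assumptions made for illustration, not a proposed final API. It uses `torch.stft` directly so the example needs only `torch`, and it clamps the denominator so that silent frames return 0 rather than NaN (a choice of mine, not something taken from librosa).

```python
import math
import torch


class SpectralCentroidSketch(torch.nn.Module):
    """Hypothetical sketch of the requested transform; names are assumptions."""

    def __init__(self, sample_rate, n_fft=1024, hop_length=512):
        super().__init__()
        self.n_fft = n_fft
        self.hop_length = hop_length
        # Buffers follow the module across .to(device) calls
        self.register_buffer("window", torch.hann_window(n_fft))
        self.register_buffer(
            "freq", torch.linspace(0.0, sample_rate / 2, n_fft // 2 + 1)
        )

    def forward(self, waveform):
        # waveform: (time,) or (channel, time)
        spec = torch.stft(
            waveform,
            self.n_fft,
            hop_length=self.hop_length,
            window=self.window,
            return_complex=True,
        ).abs()
        # spec: (..., n_freq, n_frames); weight each bin by its frequency
        weighted = (self.freq.unsqueeze(-1) * spec).sum(dim=-2)
        # Clamp the denominator so silent frames give 0 instead of NaN
        return weighted / spec.sum(dim=-2).clamp_min(1e-8)


# Sanity check: one second of a 440 Hz tone at 16 kHz
sr = 16000
t = torch.arange(sr) / sr
tone = torch.sin(2 * math.pi * 440.0 * t).unsqueeze(0).requires_grad_()
centroid = SpectralCentroidSketch(sr)(tone)  # shape (channel, n_frames)
centroid.mean().backward()                   # gradients reach the waveform
```

For a pure tone, the centroid of frames away from the signal edges should sit close to the tone's frequency, and `tone.grad` being populated after `backward()` shows that the whole computation is differentiable with respect to the waveform.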