@@ -67,14 +67,49 @@ def apply_effects_tensor(
6767 """Apply sox effects to given Tensor
6868
6969 Args:
70- tensor: Input 2D Tensor.
71- sample_rate: Sample rate
72- effects: List of effects.
73- channels_first: Indicates if the input Tensor's dimension is
70+ tensor (torch.Tensor) : Input 2D Tensor.
71+ sample_rate (int) : Sample rate
72+ effects (List[List[str]]) : List of effects.
73+ channels_first (bool) : Indicates if the input Tensor's dimension is
7474 ``[channels, time]`` or ``[time, channels]``
7575
76+ Returns:
77+ Tuple[torch.Tensor, int]: Resulting Tensor and sample rate.
78+ The resulting Tensor has the same ``dtype`` as the input Tensor, and
79+ the same channels order. The shape of the Tensor can be different based on the
80+ effects applied. Sample rate can also be different based on the effects applied.
81+
82+ Examples:
83+ >>> # Defines the effects to apply
84+ >>> effects = [
85+ ... ['gain', '-n'], # normalises to 0dB
86+ ... ['pitch', '5'], # 5 cent pitch shift
87+ ... ['rate', '8000'], # resample to 8000 Hz
88+ ... ]
89+ >>> # Generate pseudo wave:
90+ >>> # normalized, channels first, 2ch, sampling rate 16000, 1 second
91+ >>> sample_rate = 16000
92+ >>> waveform = 2 * torch.rand([2, sample_rate * 1]) - 1
93+ >>> waveform.shape
94+ torch.Size([2, 16000])
95+ >>> waveform
96+ tensor([[ 0.3138, 0.7620, -0.9019, ..., -0.7495, -0.4935, 0.5442],
97+ [-0.0832, 0.0061, 0.8233, ..., -0.5176, -0.9140, -0.2434]])
98+ >>> # Apply effects
99+ >>> waveform, sample_rate = apply_effects_tensor(
100+ ... waveform, sample_rate, effects, channels_first=True)
101+ >>> # The new waveform has a sampling rate of 8000, 1 second.
102+ >>> # normalization and channel order are preserved
103+ >>> waveform.shape
104+ torch.Size([2, 8000])
105+ >>> waveform
106+ tensor([[ 0.5054, -0.5518, -0.4800, ..., -0.0076, 0.0096, -0.0110],
107+ [ 0.1331, 0.0436, -0.3783, ..., -0.0035, 0.0012, 0.0008]])
108+ >>> sample_rate
109+ 8000
110+
76111 Notes:
77- This function works in the way very similar to ``` sox` `` command, however there are slight
112+ This function works in a way very similar to the ``sox`` command; however, there are slight
78113 differences. For example, the ``sox`` command adds certain effects automatically (such as the
79114 ``rate`` effect after ``speed`` and ``pitch`` and other effects), but this function
80115 only applies the given effects. (Therefore, to actually apply the ``speed`` effect, you also
@@ -95,15 +130,42 @@ def apply_effects_file(
95130 """Apply sox effects to the audio file and load Tensor
96131
97132 Args:
98- path: Path to the audio file.
99- effects: List of effects.
100- normalize: When ``True``, this function always return ``float32``, and sample values are
133+ path (str) : Path to the audio file.
134+ effects (List[List[str]]) : List of effects.
135+ normalize (bool) : When ``True``, this function always returns ``float32``, and sample values are
101136 normalized to ``[-1.0, 1.0]``. If input file is integer WAV, giving ``False`` will change
102137 the resulting Tensor type to integer type. This argument has no effect for formats other
103138 than integer WAV type.
104- channels_first: When True, the returned Tensor has dimension ``[channel, time]``.
139+ channels_first (bool) : When True, the returned Tensor has dimension ``[channel, time]``.
105140 Otherwise, the returned Tensor's dimension is ``[time, channel]``.
106141
142+ Returns:
143+ Tuple[torch.Tensor, int]: Resulting Tensor and sample rate.
144+ If ``normalize=True``, the resulting Tensor is always ``float32`` type.
145+ If ``normalize=False`` and the input audio file is an integer WAV file, then the
146+ resulting Tensor has the corresponding integer type. (Note: the 24-bit integer type is not supported.)
147+ If ``channels_first=True``, the resulting Tensor has dimension ``[channel, time]``,
148+ otherwise ``[time, channel]``.
149+
150+ Examples:
151+ >>> # Defines the effects to apply
152+ >>> effects = [
153+ ... ['gain', '-n'], # normalises to 0dB
154+ ... ['pitch', '5'], # 5 cent pitch shift
155+ ... ['rate', '8000'], # resample to 8000 Hz
156+ ... ]
157+ >>> # Apply effects and load data with channels_first=True
158+ >>> waveform, sample_rate = apply_effects_file("data.wav", effects, channels_first=True)
159+ >>> waveform.shape
160+ torch.Size([2, 8000])
161+ >>> waveform
162+ tensor([[ 5.1151e-03, 1.8073e-02, 2.2188e-02, ..., 1.0431e-07,
163+ -1.4761e-07, 1.8114e-07],
164+ [-2.6924e-03, 2.1860e-03, 1.0650e-02, ..., 6.4122e-07,
165+ -5.6159e-07, 4.8103e-07]])
166+ >>> sample_rate
167+ 8000
168+
107169 Notes:
108170 This function works in a way very similar to the ``sox`` command; however, there are slight
109171 differences. For example, the ``sox`` command adds certain effects automatically (such as