tl;dr: how to migrate to new backend/interface in 0.7
- If you are using `torchaudio` in Linux/macOS environments, please use `torchaudio.set_audio_backend("sox_io")` to adopt the upcoming changes.
- If you are in a Windows environment, please set `torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False` and reload the backend to use the new interface.
- Note that this ships with some bug fixes for formats other than 16-bit signed integer WAV, so you might experience some BC-breaking changes as described in the section below.
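The two opt-in steps above can be sketched as one helper. This is a minimal sketch, not part of torchaudio: `configure_new_io` is a hypothetical name, the module is passed in as an argument so the snippet runs without torchaudio installed, and it assumes that "reload the backend" means calling `set_audio_backend("soundfile")` again after changing the flag.

```python
import sys


def configure_new_io(ta, platform: str = sys.platform) -> str:
    """Opt in to the new I/O interface on a torchaudio-like module `ta`.

    Hypothetical helper: `ta` is expected to expose `set_audio_backend`
    and, on Windows, the `USE_SOUNDFILE_LEGACY_INTERFACE` flag.
    """
    if platform.startswith("win"):
        # Windows: keep the "soundfile" backend, switch off the legacy interface.
        ta.USE_SOUNDFILE_LEGACY_INTERFACE = False
        backend = "soundfile"
    else:
        # Linux/macOS: move from "sox" to "sox_io".
        backend = "sox_io"
    # Assumption: setting the backend again is what reloads it.
    ta.set_audio_backend(backend)
    return backend
```

With torchaudio 0.7 installed, this would be called once at startup as `configure_new_io(torchaudio)`, before any `load`/`save`/`info` call.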
News
[UPDATE] 2021/03/06
- All the migration work has been completed on the master branch.

[UPDATE] 2021/02/12
- Added `bits_per_sample` and `encoding` arguments (replacing `dtype`) to the `save` function.

[UPDATE] 2021/01/29
- Added `encoding` to `AudioMetaData`.

[UPDATE] 2021/01/22
- Added `format` argument to the `load`/`info`/`save` functions.
- Added `bits_per_sample` to `AudioMetaData`.

[UPDATE] 2020/10/21
- Added description of the `"soundfile"` backend legacy interface.

[UPDATE] 2020/09/18
- Added migration guide for the `"soundfile"` backend.
- Moved the phase when the `"soundfile"` backend signatures change from 0.9.0 to 0.8.0 so that they match the `"sox_io"` backend, which becomes the default in 0.8.0.

[UPDATE] 2020/09/17
- Added information on deprecation of native `libsox` structures such as `signalinfo_t` and `encoding_t`.

Improving I/O for correct and consistent experience
This is an announcement for users that we are making backward-incompatible changes to the I/O functions of torchaudio backends from the 0.7.0 release through the 0.9.0 release.
What is affected?
- Public APIs
  - `torchaudio.load`
    - [Linux/macOS] By switching the default backend from the `"sox"` backend to the `"sox_io"` backend in 0.8.0, loading audio formats other than 16-bit signed integer WAV returns the correct tensor.
    - [Linux/macOS/Windows] The signature of the `"soundfile"` backend will be changed in 0.8.0 to match that of the `"sox_io"` backend.
  - `torchaudio.save`
    - [Linux/macOS] By switching to the `"sox_io"` backend, saving audio files will no longer degrade the data. The supported formats will be restricted to the tested formats only. (Please refer to the doc for the supported formats.)
    - [Linux/macOS/Windows] The signature of the `"soundfile"` backend will be changed in 0.8.0 to match that of the `"sox_io"` backend.
  - `torchaudio.info`
    - [Linux/macOS/Windows] The signature of the `"soundfile"` backend will be changed in 0.8.0 to match that of the `"sox_io"` backend.
  - `torchaudio.load_wav`
    - will be removed in 0.9.0. (The `load` function with `normalize=False` will provide the same functionality.)
- Internal APIs
  The following functions/classes of the `"sox"` backend were accidentally exposed and will be removed in 0.9.0. There is no replacement for them. Please use the `save`/`load`/`info` functions.
  - `torchaudio.save_encinfo`
    - will be removed in 0.9.0
  - `torchaudio.get_sox_signalinfo_t`
    - will be removed in 0.9.0
  - `torchaudio.get_sox_encodinginfo_t`
    - will be removed in 0.9.0
  - `torchaudio.get_sox_option_t`
    - will be removed in 0.9.0
  - `torchaudio.get_sox_bool`
    - will be removed in 0.9.0

  The signatures of the other backends are not planned to be changed within this overhaul plan.
- Classes
  - `torchaudio.SignalInfo` and `torchaudio.EncodingInfo`
    - will be replaced with `AudioMetaData` in 0.8.0 for the `"soundfile"` backend
    - will be removed in 0.9.0
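For code that still calls `torchaudio.load_wav`, the replacement noted in the list above (`load` with `normalize=False`) can be wrapped in a small shim. This is a sketch only: `load_wav_compat` is a hypothetical name, and the `load` function is passed in explicitly so the snippet does not depend on a particular torchaudio version.

```python
from typing import Any, Callable, Tuple


def load_wav_compat(
    load_fn: Callable[..., Tuple[Any, int]],
    filepath: str,
    **kwargs: Any,
) -> Tuple[Any, int]:
    """Hypothetical stand-in for `load_wav`: forwards to `load` with normalize=False."""
    if "normalize" in kwargs:
        # The point of the shim is to pin this argument, so reject overrides.
        raise TypeError("normalize is fixed to False by load_wav_compat")
    return load_fn(filepath, normalize=False, **kwargs)
```

A call site would change from `torchaudio.load_wav(path)` to `load_wav_compat(torchaudio.load, path)`, and can later be inlined as `torchaudio.load(path, normalize=False)`.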
Why
There are currently three backends in torchaudio. (Please refer to the documentation for details.)
The `"sox"` backend is the original backend, which binds `libsox` with `pybind11`. The functionalities (`load` / `save` / `info`) of this backend are not well-tested and have a number of issues. (See #726).
Fixing these issues in a backward-compatible manner is not straightforward. Therefore, while we were adding TorchScript-compatible I/O functions, we decided to deprecate this original `"sox"` backend and replace it with the new backend (the `"sox_io"` backend), which is confirmed not to have those issues.
When we switch the default backend for Linux/macOS from `"sox"` to `"sox_io"`, we would like the interface of the `"soundfile"` backend to be aligned with it. Therefore, we introduced a new interface to the `"soundfile"` backend (an interface rather than a new backend, to keep the number of public APIs down).
When / What Changes
The following is the timeline for the planned changes:

| Phase | Expected Release | Expected Changes |
|---|---|---|
| 1 | 0.7.0 (Oct 2020) | |
| 2 | 0.8.0 (March 2021) | |
| 3 | 0.9.0 | |
Planned signature changes of "soundfile" backend in 0.8.0
The following is the planned signature change of "soundfile" backend functions in 0.8.0 release.
info function
AudioMetaData implementation can be found here. The placement of the AudioMetaData might be changed.
~0.7.0:

    def info(
        filepath: str,
    ) -> Tuple[SignalInfo, EncodingInfo]

0.8.0:

    def info(
        filepath: str,
        format: Optional[str],
    ) -> AudioMetaData

Migration
The values returned from info function will be changed. Please use the corresponding new attributes.
~0.7.0:

    si, ei = torchaudio.info(filepath)
    sample_rate = si.rate
    num_frames = si.length
    num_channels = si.channels
    precision = si.precision
    bits_per_sample = ei.bits_per_sample
    encoding = ei.encoding

0.8.0:

    metadata = torchaudio.info(filepath)
    sample_rate = metadata.sample_rate
    num_frames = metadata.num_frames
    num_channels = metadata.num_channels
    bits_per_sample = metadata.bits_per_sample
    encoding = metadata.encoding

Note: If the attribute you are using is missing, please file a Feature Request issue.
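During the transition, code that has to run against both interfaces could normalize the return value of `info` into one shape. The following is a minimal sketch under two assumptions: the legacy interface returns a plain 2-tuple of `(SignalInfo, EncodingInfo)`, and `AudioMetaData` is not itself a tuple. `describe_audio` is a hypothetical helper name, and `precision` is left out because the new interface has no matching attribute in the mapping above.

```python
from typing import Any, Dict


def describe_audio(info_result: Any) -> Dict[str, Any]:
    """Map either the legacy or the new `info` result to the same dictionary."""
    if isinstance(info_result, tuple):
        # Legacy interface (~0.7.0): (SignalInfo, EncodingInfo)
        si, ei = info_result
        return {
            "sample_rate": si.rate,
            "num_frames": si.length,
            "num_channels": si.channels,
            "bits_per_sample": ei.bits_per_sample,
            "encoding": ei.encoding,
        }
    # New interface (0.8.0): a single AudioMetaData-like object
    return {
        "sample_rate": info_result.sample_rate,
        "num_frames": info_result.num_frames,
        "num_channels": info_result.num_channels,
        "bits_per_sample": info_result.bits_per_sample,
        "encoding": info_result.encoding,
    }
```

Usage would be `describe_audio(torchaudio.info(filepath))`; once the legacy interface is gone, the tuple branch can be deleted.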
load function
~0.7.0:

    def load(
        filepath: str,
        # out: Optional[Tensor] = None,
        #   To be removed.
        #   Currently not used
        #   Raise AssertionError if given
        normalization: Optional[bool] = True,
        #   To be renamed to normalize.
        #   Currently only accept True
        #   Raise AssertionError if given
        channels_first: Optional[bool] = True,
        num_frames: int = 0,
        offset: int = 0,
        #   To be renamed to frame_offset
        # signalinfo: SignalInfo = None,
        #   To be removed
        #   Currently not used
        #   Raise AssertionError if given
        # encodinginfo: EncodingInfo = None,
        #   To be removed
        #   Currently not used
        #   Raise AssertionError if given
        filetype: Optional[str] = None
        #   To be removed
        #   Currently not used
    ) -> Tuple[Tensor, int]

0.8.0:

    def load(
        filepath: str,
        frame_offset: int = 0,
        num_frames: int = -1,
        normalize: bool = True,
        channels_first: bool = True,
        format: Optional[str] = None,  # only required for file-like object input
    ) -> Tuple[Tensor, int]

Migration
Please change the argument names:
- `normalization` -> `normalize`
- `offset` -> `frame_offset`

~0.7.0:

    waveform, sample_rate = torchaudio.load(
        filepath,
        normalization=normalization,
        channels_first=channels_first,
        num_frames=num_frames,
        offset=offset,
    )

0.8.0:

    waveform, sample_rate = torchaudio.load(
        filepath,
        frame_offset=frame_offset,
        num_frames=num_frames,
        normalize=normalization,
        channels_first=channels_first,
    )

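Where many call sites pass the old keyword arguments, the renames can be applied mechanically. This is a sketch, not part of torchaudio: `migrate_load_kwargs` is a hypothetical helper, rejecting non-`None` removed arguments is a design choice made here, and mapping `num_frames=0` to `-1` is an assumption inferred only from the change in default value between the two signatures (that both mean "read to the end").

```python
from typing import Any, Dict

# Arguments renamed between the legacy and the new `load` signature.
_RENAMED = {"normalization": "normalize", "offset": "frame_offset"}
# Arguments that no longer exist in the new signature.
_REMOVED = ("out", "signalinfo", "encodinginfo", "filetype")


def migrate_load_kwargs(old: Dict[str, Any]) -> Dict[str, Any]:
    """Translate legacy `load` keyword arguments to the 0.8.0 names."""
    new: Dict[str, Any] = {}
    for key, value in old.items():
        if key in _REMOVED:
            if value is not None:
                raise ValueError(f"'{key}' is not supported by the new interface")
            continue  # drop removed arguments left at None
        new[_RENAMED.get(key, key)] = value
    # Assumption: the old default 0 and the new default -1 both mean "all frames".
    if new.get("num_frames") == 0:
        new["num_frames"] = -1
    return new
```

A legacy call such as `torchaudio.load(path, **old_kwargs)` would become `torchaudio.load(path, **migrate_load_kwargs(old_kwargs))` until the call sites are rewritten by hand.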
save function
~0.7.0:

    def save(
        filepath: str,
        src: Tensor,
        sample_rate: int,
        precision: int = 16,
        #   moved to `bits_per_sample` argument
        channels_first: bool = True
    )

0.8.0:

    def save(
        filepath: str,
        src: Tensor,
        sample_rate: int,
        channels_first: bool = True,
        compression: Optional[float] = None,
        #   Added only for compatibility.
        #   soundfile does not support compression option
        #   Raises Warning if not None
        format: Optional[str] = None,
        encoding: Optional[str] = None,
        bits_per_sample: Optional[int] = None,
    )

Migration
~0.7.0:

    torchaudio.save(
        filepath,
        waveform,
        sample_rate,
        channels_first
    )

0.8.0:

    torchaudio.save(
        filepath,
        waveform,
        sample_rate,
        channels_first,
        bits_per_sample=16,
    )

You can also designate the audio format with `format` and configure the encoding with `compression` and `encoding`. See https://pytorch.org/audio/master/backend.html#save for details.
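The `precision` to `bits_per_sample` move can also be handled by a thin wrapper while call sites are being updated. This is only a sketch: `save_compat` is a hypothetical name, and the `save` function is injected as an argument so the snippet does not depend on which torchaudio version is installed.

```python
from typing import Any, Callable


def save_compat(
    save_fn: Callable[..., Any],
    filepath: str,
    src: Any,
    sample_rate: int,
    precision: int = 16,
    channels_first: bool = True,
) -> Any:
    """Accept the legacy `precision` argument and forward it as `bits_per_sample`."""
    return save_fn(
        filepath,
        src,
        sample_rate,
        channels_first=channels_first,
        bits_per_sample=precision,
    )
```

A legacy call `torchaudio.save(path, waveform, sr, 24)` would become `save_compat(torchaudio.save, path, waveform, sr, precision=24)`.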
BC-breaking changes
Read and write operations on the formats other than WAV 16-bit signed integer were affected by small bugs.