Improve Dataset test maintainability/readability

In #821, we improved the dataset tests to work on mocked files. In those changes, we adopted the pattern of creating the mock dataset in `setUpClass`. As we keep improving the dataset tests, `setUpClass` gets cluttered and it becomes harder to grasp what is going on, so we should refactor the pattern.

In #1126, we extracted the dataset mocking part into a separate function. We can apply the same pattern to the other tests too. In the end, the initialization should look as simple as:

```python
@classmethod
def setUpClass(cls):
    cls.root_dir = cls.get_base_temp_dir()
    cls.data = get_mock_dataset(cls.root_dir)
```

with an extracted helper function that creates the mock data and returns the expected data:

```python
def get_mock_dataset(root_dir):
    ...
```
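For illustration, here is a minimal, self-contained sketch of what such a helper and its test class could look like. It is not the actual torchaudio implementation: the file layout, the `(path, sample_rate, label)` tuple, and the `MockDatasetTest` class are hypothetical, and `tempfile` plus the standard-library `wave` module stand in for `get_base_temp_dir()` and the project's own audio utilities.

```python
import os
import tempfile
import unittest
import wave


def get_mock_dataset(root_dir, num_samples=3, sample_rate=8000):
    """Write silent WAV files under root_dir and return the expected samples.

    The directory layout and the returned tuple are assumptions made
    for this sketch; a real dataset test would mirror the dataset's format.
    """
    expected = []
    base_dir = os.path.join(root_dir, "mock_dataset")
    os.makedirs(base_dir, exist_ok=True)
    for label in range(num_samples):
        path = os.path.join(base_dir, f"sample_{label}.wav")
        with wave.open(path, "wb") as f:
            f.setnchannels(1)           # mono
            f.setsampwidth(2)           # 16-bit PCM
            f.setframerate(sample_rate)
            f.writeframes(b"\x00\x00" * sample_rate)  # one second of silence
        expected.append((path, sample_rate, label))
    return expected


class MockDatasetTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # tempfile replaces get_base_temp_dir() so the sketch runs standalone.
        cls._tmp = tempfile.TemporaryDirectory()
        cls.root_dir = cls._tmp.name
        cls.data = get_mock_dataset(cls.root_dir)

    @classmethod
    def tearDownClass(cls):
        cls._tmp.cleanup()

    def test_files_match_expected(self):
        for path, sample_rate, _ in self.data:
            with wave.open(path, "rb") as f:
                self.assertEqual(f.getframerate(), sample_rate)
```

Keeping file creation and the expected values in one function means `setUpClass` only wires things together, and each test compares the loaded data against `cls.data`.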

* [x] [yesno_test](https://github.com/pytorch/audio/blob/71214b48548b1dcb6ebd581dd36a9d0e60af6837/test/torchaudio_unittest/datasets/yesno_test.py#L31-L38)
* [x] [vctk_test](https://github.com/pytorch/audio/blob/71214b48548b1dcb6ebd581dd36a9d0e60af6837/test/torchaudio_unittest/datasets/vctk_test.py#L39-L79)
* [x] [tedlium_test](https://github.com/pytorch/audio/blob/71214b48548b1dcb6ebd581dd36a9d0e60af6837/test/torchaudio_unittest/datasets/tedlium_test.py#L44-L95)
* [x] [speechcommands_test](https://github.com/pytorch/audio/blob/71214b48548b1dcb6ebd581dd36a9d0e60af6837/test/torchaudio_unittest/datasets/speechcommands_test.py#L64-L107)
* [x] [ljspeech_test](https://github.com/pytorch/audio/blob/71214b48548b1dcb6ebd581dd36a9d0e60af6837/test/torchaudio_unittest/datasets/ljspeech_test.py#L38-L59)
* [x] [libritts_test](https://github.com/pytorch/audio/blob/71214b48548b1dcb6ebd581dd36a9d0e60af6837/test/torchaudio_unittest/datasets/libritts_test.py#L30-L49)
* [x] [librispeech_test](https://github.com/pytorch/audio/blob/71214b48548b1dcb6ebd581dd36a9d0e60af6837/test/torchaudio_unittest/datasets/librispeech_test.py#L38-L88)
* [x] [gtzan_test](https://github.com/pytorch/audio/blob/71214b48548b1dcb6ebd581dd36a9d0e60af6837/test/torchaudio_unittest/datasets/gtzan_test.py#L27-L45)
* [x] [cmuarctic_test](https://github.com/pytorch/audio/blob/71214b48548b1dcb6ebd581dd36a9d0e60af6837/test/torchaudio_unittest/datasets/cmuarctic_test.py#L24-L56)

Additionally, for CommonVoice, the mocking part is already extracted, but there are two helper functions that are very similar to each other, so we can refactor that part too.

* [x] [CommonVoice](https://github.com/pytorch/audio/blob/71214b48548b1dcb6ebd581dd36a9d0e60af6837/test/torchaudio_unittest/datasets/commonvoice_test.py#L22-L89)
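One possible way to merge two near-identical helpers is to pass the parts that differ as parameters. The sketch below only illustrates that idea: the function signature, the TSV columns, and the `EN_ROWS` / `FR_ROWS` values are hypothetical and are not taken from `commonvoice_test.py`.

```python
import csv
import os
import tempfile


def get_mock_dataset(root_dir, tsv_name, header, rows):
    """Write a TSV metadata file and return the rows as the expected data.

    Everything that differed between the two original helpers is assumed
    to be expressible through the tsv_name, header and rows arguments.
    """
    os.makedirs(root_dir, exist_ok=True)
    tsv_path = os.path.join(root_dir, tsv_name)
    with open(tsv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(header)
        writer.writerows(rows)
    # One dict per row, keyed by column name, for easy comparison in tests.
    return [dict(zip(header, row)) for row in rows]


# Hypothetical per-language inputs; only the data differs, not the logic.
HEADER = ["client_id", "path", "sentence"]
EN_ROWS = [["id0", "sample_0.wav", "hello"]]
FR_ROWS = [["id1", "sample_1.wav", "bonjour"]]

with tempfile.TemporaryDirectory() as root:
    en_data = get_mock_dataset(os.path.join(root, "en"), "train.tsv", HEADER, EN_ROWS)
    fr_data = get_mock_dataset(os.path.join(root, "fr"), "train.tsv", HEADER, FR_ROWS)
```

With this shape, each test class calls the same helper with its own data instead of maintaining a separate copy of the logic.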


Improve Dataset test maintainability/readability #1131
