Fix `amplitude_to_DB` clamping behaviour on batches #1113

jcaw · 2020-12-20T19:19:12Z

amplitude_to_DB currently clamps per-batch, but it should clamp per-item. I've modified it to clamp per-item (when a batch is provided) and I've modified the MFCC transform to take advantage of this new behavior. Tests for amplitude_to_DB are included but I've added no new tests for the modified MFCC transform.

This is just an initial draft. In #994 @vincentqb specified that it should always expect a tensor of shape (..., freq, time), so this implementation assumes the input is a spectrogram and determines whether it's a batch based on the number of dimensions (it assumes a batch when there are more than three dimensions, i.e. more than (channel, freq, time)). I'm not sure if this is the most sensible solution. It restricts the inputs to spectrograms and only allows batches with 4 (or more) dimensions. A batch with shape (item, freq, time) would be treated as a single item. This does seem contradictory if the specified input shape is (..., freq, time).

I also have a (rough) branch here which takes a batch flag to differentiate. Alternatively, batchwise conversions could be pulled into a separate method. (Perhaps amplitude_to_DB_iid?)

Closes #994

When passed a batch, `amplitude_to_DB` was clamping based on the entire batch's maximum value. This was wrong. Apply the clamp based on each item's maximum when a batch is detected. This change requires `amplitude_to_DB` to be restricted to spectrogram inputs only, since items need to have a predictable number of dimensions (in this case, 3) to automatically detect batches. Additional tests are also added to check both batched and unbatched inputs, and ensure items are being clamped correctly.
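To make the difference concrete, here is a minimal sketch of batch-wide versus per-item clamping. It is not the code from this PR: the helper names `clamp_per_batch` and `clamp_per_item`, the tensor values, and the `top_db` of 80 are all made up for illustration, and the input is assumed to already be in decibels with shape (batch, channel, freq, time).

```python
import torch


def clamp_per_batch(x_db: torch.Tensor, top_db: float) -> torch.Tensor:
    # Old behaviour: a single floor derived from the maximum of the whole batch.
    return torch.max(x_db, x_db.max() - top_db)


def clamp_per_item(x_db: torch.Tensor, top_db: float) -> torch.Tensor:
    # Fixed behaviour: one floor per batch item, broadcast over (channel, freq, time).
    floors = x_db.amax(dim=(-3, -2, -1), keepdim=True) - top_db
    return torch.max(x_db, floors)


# Two single-channel "spectrograms" in dB: a loud item and a quiet item.
loud = torch.tensor([[[[0.0, -100.0]]]])     # shape (1, 1, 1, 2)
quiet = torch.tensor([[[[-90.0, -200.0]]]])  # shape (1, 1, 1, 2)
batch = torch.cat([loud, quiet])             # shape (2, 1, 1, 2)

per_batch = clamp_per_batch(batch, top_db=80.0)
per_item = clamp_per_item(batch, top_db=80.0)
```

With the batch-wide floor, every value in the quiet item is raised to the loud item's floor of -80, so its content is flattened. With per-item floors, the quiet item keeps its own peak of -90 and is only clamped at -170.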

The `MFCC` transform doesn't need to pack batches, since the mel spectrogram conversion operates fine on batched input (it packs batches itself). The fixed form of `amplitude_to_DB` also now requires an unpacked batch to clamp correctly. Since it's unnecessary, remove packing from the MFCC transform completely.

jcaw · 2020-12-20T19:24:28Z

(The tests in here are a little rough in structure since the infrastructure is due to be replaced)

jcaw · 2020-12-21T13:47:30Z

The torchscript compilation tripped me up but I'm having trouble getting something that's compatible. This line doesn't want to compile - it's giving me RuntimeError: Cannot emit expr for: (dots). The Ellipsis constant also fails.

Is there a way to either add multiple singleton dimensions to the right like this, or to somehow broadcast from the opposite direction to normal (aligning to the left, instead of the right)?

I've got it working locally doing this, but I don't like it. Doesn't seem optimal:

        if x_db.dim() > 3:
            flat_shape = x_db.size(0), -1
            db_floors = x_db.reshape(flat_shape).amax(dim=1) - top_db
            x_db = torch.max(x_db.view(flat_shape), db_floors.unsqueeze(1)).view(x_db.shape)

(Is there also a document I can reference for full contribution guidelines, so I can run the CI locally?)

vincentqb

About the shape: in practice, saying (..., freq, time) means we expect (freq, time), (channel, freq, time), (batch, freq, time), (batch, channel, freq, time), etc. The ambiguity is in the 3 dimensions as you pointed out: (batch, freq, time) vs (channel, freq, time).

My suggestion is to support the original behavior for (freq, time) and (channel, freq, time), and to document it. We then extend (batch, channel, freq, time), or more generally (..., channel, freq, time), to clamp per (channel, freq, time) item. We therefore do not directly support the (batch, freq, time) case, though a caller could easily unsqueeze outside the function call to add a channel dimension of size 1.
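One way to read this suggestion is sketched below. The helper name `clamp_top_db` and the example values are hypothetical, and this is an illustration under those assumptions rather than the code that was merged: inputs with two or three dimensions are clamped as a single item, and anything to the left of (channel, freq, time) is treated as batch dimensions.

```python
import torch


def clamp_top_db(x_db: torch.Tensor, top_db: float) -> torch.Tensor:
    shape = x_db.size()
    # (freq, time) has no channel dimension, so treat it as one channel.
    channels = shape[-3] if x_db.dim() > 2 else 1
    # Pack all leading dimensions into a single batch dimension.
    packed = x_db.reshape(-1, channels, shape[-2], shape[-1])
    # One floor per packed item, broadcast over (channel, freq, time).
    floors = packed.amax(dim=(-3, -2, -1)) - top_db
    packed = torch.max(packed, floors.view(-1, 1, 1, 1))
    return packed.reshape(shape)


# A (batch, freq, time) input is ambiguous: passed directly, its first
# dimension is read as channels and the whole tensor is one item.
batch_no_channel = torch.tensor([[[0.0, -120.0]], [[-50.0, -140.0]]])  # (2, 1, 2)
as_one_item = clamp_top_db(batch_no_channel, top_db=80.0)

# Adding a channel dimension of size 1 outside the call gives per-item clamping.
per_item = clamp_top_db(batch_no_channel.unsqueeze(1), top_db=80.0).squeeze(1)
```

In the first call both rows share the floor of -80. In the second call the rows get floors of -80 and -130 respectively.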

> The torchscript compilation tripped me up but I'm having trouble getting something that's compatible. This line doesn't want to compile - it's giving me RuntimeError: Cannot emit expr for: (dots). The Ellipsis constant also fails.
>
> Is there a way to either add multiple singleton dimensions to the right like this, or to somehow broadcast from the opposite direction to normal (aligning to the left, instead of the right)?

Do you mean to add a dimension like (batch, 1, freq, time) with unsqueeze?

> (Is there also a document I can reference for full contribution guidelines, so I can run the CI locally?)

The tests detailed here can be run using pytest. Is that what you meant?

torchaudio/functional/functional.py

test/torchaudio_unittest/functional/functional_cpu_test.py

jcaw · 2021-01-06T20:03:21Z

Awesome, thanks. I'll implement these suggestions now.

> Do you mean to add a dimension like (batch, 1, freq, time) with unsqueeze?

More like a version of unsqueeze that can add an arbitrary number of dimensions (rather than just 1), but with the rewrite I think this is unnecessary. Although, I'm still curious how to do this in a way that will compile to torchscript. I've run into the problem before.
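For reference, one possible way to append an arbitrary number of singleton dimensions is to build the target shape as a plain list of ints and reshape, avoiding `...` and `None` indexing. The helper `unsqueeze_right` and the example values below are hypothetical. The sketch is shown in eager mode only; whether it compiles under `torch.jit.script` is an assumption that has not been verified here.

```python
import torch


def unsqueeze_right(x: torch.Tensor, n: int) -> torch.Tensor:
    # Append n singleton dimensions by building the target shape as a list of ints.
    return x.reshape(list(x.shape) + [1] * n)


x_db = torch.zeros(2, 3, 4, 5)
x_db[1] = -50.0
x_db[0, 0, 0, 0] = -200.0
top_db = 80.0

# One floor per item along the leftmost dimension: shape (2,).
floors = x_db.reshape(x_db.size(0), -1).amax(dim=1) - top_db
# Reshape to (2, 1, 1, 1) so the floors align with the leftmost dimension
# when broadcast against (2, 3, 4, 5).
floors = unsqueeze_right(floors, x_db.dim() - 1)
clamped = torch.max(x_db, floors)
```

The result has the same shape as `x_db`, and only the single -200 entry in the first item is raised to that item's floor of -80.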

> The tests detailed here can be run using pytest. Is that what you meant?

I was isolating the amplitude_to_DB tests so it wasn't checking torchscript compilation. I assumed that was tested elsewhere, but including the torchscript consistency tests does the trick.

vincentqb

LGTM with minor change to doc, thanks for working on this!

torchaudio/functional/functional.py

jcaw · 2021-01-07T16:14:45Z

No worries!

test/torchaudio_unittest/functional/functional_cpu_test.py

mthrok

Changes to the functional module look good, but the tests need more context for maintainability.

mthrok

Hi @jcaw

Thanks for working on this.
The tests in batch_consistency look good.
I had a couple of questions regarding the tests in Testamplitude_to_DB.
Specifically, I am having difficulty understanding the `# Predictability` part.

If you do not have time to address the comments, let me know.
I do not want to drag you around too much, so I will move on and address them later.

mthrok · 2021-01-22T16:53:49Z

test/torchaudio_unittest/functional/functional_cpu_test.py

+
+        self.assertEqual(x2, spec)
+
+    def test_amplitude_to_DB_batch(self):


This might be nit-picky, but since all the tests in the Testamplitude_to_DB class are about amplitude_to_DB, having each test method name describe which aspect of the function is being tested makes for a better maintenance experience. (Imagine that this code will most likely be maintained by someone without any context. In fact, I am a regular software engineer and not an expert in the audio domain, so if I have to come back to this code, it will take me a while to figure it out.)

Here is my suggestion

Merge test_amplitude_to_DB_batch, test_amplitude_to_DB_3dims and test_amplitude_to_DB_2dims, parameterize the shape, then give the method a good name (and inline _ensure_reversible, since there is no need to extract it).

Something like

    @parameterized.expand([
        ([2, 2, 100, 100],),
        ([2, 100, 100],),
        ([100, 100],),
    ])
    def test_reversible(self, shape):
        """Round trip between amplitude and db should return the original for various shapes

        This implicitly also tests `DB_to_amplitude`.
        """
        torch.manual_seed(0)
        spec = torch.rand(*shape) * 200
        ...

I don't think it's nitpicky - you're right. I think I was sticking with the original naming convention for the Testamplitude_to_DB class, but the other test classes use the naming scheme you described anyway, which is more sensible.

mthrok · 2021-01-22T16:59:55Z

test/torchaudio_unittest/functional/functional_cpu_test.py

+        spec = torch.rand([1, 2, 100, 100]) * 200
+        # Predictability
+        spec[0, 0, 1] = 0
+        spec[0, 0, 0] = 200


What does this mean? What is the need to do this? If the particular value should be used so that the tested function's specific behavior occurs, what about using a non-random tensor?

The decibel cutoff is derived from the smallest value in the spectrogram, so in order to hard-code the cutoff, the smallest value needs to be predictable. There's also a (tiny) chance that no values large enough to be clamped are generated, so I manually set the max.

I default to using a random tensor when possible because it adds entropy to the inputs. It's just a rule of thumb, I can certainly change it.

I didn't really like this structure either, it is confusing. I've rewritten it to scale each spectrogram separately to strictly match the range (0, 200), which seems more sensible. Let me know what you think.
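The rescaling described above could look roughly like the sketch below. It is an illustration rather than the test code from this PR: the helper name `scale_to_range` is made up, and it assumes each item is the last three dimensions (channel, freq, time) of the tensor.

```python
import torch


def scale_to_range(spec: torch.Tensor, low: float, high: float) -> torch.Tensor:
    # Per-item minimum and maximum over (channel, freq, time).
    item_min = spec.amin(dim=(-3, -2, -1), keepdim=True)
    item_max = spec.amax(dim=(-3, -2, -1), keepdim=True)
    # Map [item_min, item_max] linearly onto [low, high].
    return (spec - item_min) / (item_max - item_min) * (high - low) + low


torch.manual_seed(0)
spec = scale_to_range(torch.rand(2, 2, 100, 100), 0.0, 200.0)
```

Because each item is scaled by its own minimum and maximum, every item contains both endpoints of the range, so the expected decibel cutoff can be hard-coded without overwriting individual elements by hand.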

Thanks, the comments you added explain the intention well. I think this is good.

mthrok

Hi @jcaw

Thanks for working on this. The change looks good and the tests are now very comprehensive.
There is a conflict with latest master, so please resolve it (or let me know if you are busy, then I will take over).

jcaw · 2021-02-01T21:47:08Z

Would you like me to merge or rebase?

mthrok · 2021-02-01T22:15:54Z

> Would you like me to merge or rebase?

(I assume you mean a merge commit, as I do not think you can press the merge button of the PR.) Either works fine so long as the conflict is resolved, so that I can click the merge button.
But since you have a bunch of commits on your branch, if you are going to rebase you will need to squash the commits first; otherwise resolving the conflict will be difficult.

vincentqb · 2021-02-04T16:36:01Z

The tests failing are not related to this pull request. I'll go ahead and merge the pull request. Thank you for the work @jcaw!

jcaw · 2021-02-04T16:43:57Z

No worries! Happy to help.

jcaw added 2 commits December 20, 2020 19:15

facebook-github-bot added the CLA Signed label Dec 20, 2020

Remove range->tuple conversion for torchscript

d3ad3cf

This should allow `amplitude_to_DB` to compile to torchscript.

vincentqb reviewed Jan 5, 2021

mthrok reviewed Jan 6, 2021

jcaw added 14 commits January 6, 2021 20:05

Split basic amplitude_to_DB test

6e43af7

Test the batch & channel dimensions separately

Test amplitude_to_DB without channel dim

f668caf

Always clamp amplitude as a batch

eba14b0

Remove _make_spectrogram helper method

18d57bd

It's not necessary, just generate random tensors.

Redefine test constants inside each test

381ee07

Ensure they're independent.

Use leftmost dim for test name

156f183

Split test_top_db into separate tests

5275442

One for each target shape. (The subtests now pass again)

Capitalise test constants

670fe97

Inline channels variable

fd33b46

Also specify minimum value for tests

44c2259

Don't rely on torch.rand to produce good values.

Reword comment

c9c3a93

Replace assert_allclose with self.assertEqual

3bff1fe

Remove unused power constants

d5070f9

Fix indentation

f68eb8c

vincentqb approved these changes Jan 7, 2021

vincentqb added 2 commits January 7, 2021 10:43

Update torchaudio/functional/functional.py

50ab54b

Update torchaudio/functional/functional.py

2f07717

mthrok mentioned this pull request Jan 7, 2021

Replace pytest's paremeterization with parameterized #1157

Merged

mthrok reviewed Jan 7, 2021

mthrok reviewed Jan 7, 2021

mthrok reviewed Jan 7, 2021

mthrok reviewed Jan 7, 2021

mthrok suggested changes Jan 7, 2021

jcaw added 10 commits January 7, 2021 20:49

Change upcase local variables to lowercase

0c0976c

Set seed when rand is called in tests

2d7746d

Make decibel limit test fail more informatively

ea02627

More descriptive test names, plus overt docstrings

258b8c7

Reference original MFCC clamping issue in test doc

a363968

Move docstring to correct function

da3b808

Pass correct number of item dimensions for a batch

f8e1943

`AmplitudeToDB` expects items in batches to have 3 dims, including channels. Use 3 dims to test batch consistency.

Move dim tests to test_batch_consistency.py

ba7e457

Also rename them since they're not enclosed in a specific `amplitude_to_DB` test class.

Add generic batch test for amplitude_to_DB

6b8d6b7

Expand test docstring

dda8744

jcaw requested a review from mthrok January 20, 2021 19:10

mthrok reviewed Jan 22, 2021

mthrok added this to the v0.8 milestone Jan 25, 2021

jcaw added 4 commits February 1, 2021 16:06

Parameterize amplitude_to_DB reversibility tests

8ba7992

Simpler than maintaining separate tests for each

Parameterize amplitude_to_DB top_db tests

62efc26

Also change the way the spectrograms are generated to work with all the given shapes, and apply the correct range to all spectrograms.

Check top_db doesn't over-clamp

f4c7552

Clearer test docstring

5992235

jcaw requested a review from mthrok February 1, 2021 16:27

jcaw added 2 commits February 1, 2021 16:30

Clearer description of scaling operation

6a14029

Correct description of top_db behaviour

27d23a9

mthrok approved these changes Feb 1, 2021

Merge branch 'master' of https://github.com/pytorch/audio into amplit…

1c75e86

…ude_to_db_batch_fix

vincentqb merged commit 4e99c12 into pytorch:master Feb 4, 2021

