yoyolicoris (Contributor) commented Jul 22, 2021

Resolves #1476, relates to #1561.

@mthrok
I'd like to discuss the design before diving into it.

The batching feature I have in mind works like this: the input is a multi-dimensional tensor with shape (..., bank_size, ..., time), and the filter coefficients are 2D matrices with shape (bank_size, filter_order). The filters are applied individually along the bank_size dimension, so the output shape is also (..., bank_size, ..., time).

In the PyTorch convention, the first dimension is usually considered the batch dimension, so the input shape should be (bank_size, ..., time). To make this work together with the filter-bank feature, maybe we should add a boolean parameter to enable it, so that the output won't become (bank_size, ..., bank_size, time)?

Actually, my original design from #1561 puts the bank dimension on the second-to-last axis: (..., bank_size, time). To use filter-bank support without batching support, the user can unsqueeze the second-to-last axis so the input shape is (..., 1, time); the output will then be (..., bank_size, time) if the filter coefficients are (bank_size, filter_order). In this way, no extra parameter is needed.

To state it more clearly:

batching

Waveform  (Nd tensor): ... x bank_size x time
coeffs    (2d tensor):       bank_size x filter_order
Result    (Nd tensor): ... x bank_size x time

filterbanks

Waveform  (Nd tensor): ... x     1     x time
coeffs    (2d tensor):       bank_size x filter_order
Result    (Nd tensor): ... x bank_size x time

Any ideas?
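To make the two tables above concrete, here is a minimal NumPy sketch of the proposed semantics. The names `lfilter_1d` and `apply_bank` are illustrative assumptions, not torchaudio's API; a naive direct-form loop stands in for the real filter:

```python
import numpy as np

def lfilter_1d(b, a, x):
    """Naive causal IIR filter: a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>0} a[k]*y[n-k]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y[n] = acc / a[0]
    return y

def apply_bank(waveform, b_coeffs, a_coeffs):
    """Broadcast a size-1 second-to-last axis against bank_size, then apply
    filter i to slice i -- covering both the batching and filterbank tables."""
    bank_size = b_coeffs.shape[0]
    shape = waveform.shape[:-2] + (bank_size, waveform.shape[-1])
    waveform = np.broadcast_to(waveform, shape)
    out = np.empty(shape)
    for idx in np.ndindex(*shape[:-1]):
        i = idx[-1]  # bank index (second-to-last axis of the waveform)
        out[idx] = lfilter_1d(b_coeffs[i], a_coeffs[i], waveform[idx])
    return out

b = np.array([[1.0, 0.5], [1.0, -0.5]])   # (bank_size, filter_order)
a = np.array([[1.0, 0.0], [1.0, 0.0]])

# filterbank mode: (..., 1, time) -> (..., bank_size, time)
print(apply_bank(np.random.randn(3, 1, 8), b, a).shape)   # (3, 2, 8)

# batching mode: (..., bank_size, time) -> (..., bank_size, time)
print(apply_bank(np.random.randn(3, 2, 8), b, a).shape)   # (3, 2, 8)
```

In the filterbank case the broadcast expands the size-1 axis; in the batching case it is a no-op, so a single code path covers both tables.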

nateanl (Member) commented Jul 22, 2021

Hi @yoyololicon! Thanks for the proposal. Please correct me if my understanding is wrong.

Waveform  (Nd tensor): ... x      1     x time
coeffs    (2d tensor):       batch_size x filter_order
Result    (Nd tensor): ... x batch_size x time

In this use case, do you want to apply a batch of filters to a single waveform as one batched training example? Or is the batch size of the training example 1, with batch_size being the number of filters in a separate dimension?

mthrok (Contributor) commented Jul 22, 2021

@yoyololicon Thanks for the suggestion. In your description, can you replace the batch_size of the filter bank with something like bank_size, and add a separate note if they are the same, such as bank_size (== batch_size)?

The way I think of "batching" (of samples) in PyTorch and other DL frameworks is "processing multiple sample data points at the same time for the sake of computational efficiency". The resulting data points should be the same regardless of whether they were batched or not (or batched in a different order).
(I think the latest development on this front is vmap, which lets users write an operation without worrying about the batch dimension.)

From this viewpoint, the first one seems to violate this independence.

The batching feature I have imagined, is when input is a multi-dimensional tensor with shape (..., batch_size, ..., time), and filter coefficients are 2D matrices with shape(batch_size, filter_order). The filters are applied individually at the batch dimension, so the output shape is also (..., batch_size, ..., time).

If different filters are applied to different samples, then shuffling the input batch along the batch dimension will change the result before/after shuffling.

And I think this "applying each filter to each sample individually" is somewhat analogous to group convolution, and if we want to support this kind of "group"-ing, then I think we should introduce a notion of an input group dimension (like a channel).

So the batch/bank shape semantics I have in mind so far are as follows. I think this can handle the cases where the batch dimension or channel dimension is 1, or where both batch and channel dimensions are present.

1. no bank (coeffs is 1D)

1.1. without batch dim

waveform (1D): (time)
coeffs   (1D): (filter_order, )
result   (1D): (time)

1.2. with batch dim

The same filter is applied independently to each sample along batch dimension

waveform (2D): (batch, time)
coeffs   (1D): (filter_order, )
result   (2D): (batch, time)

1.3. with batch and possibly more dims

The same filter is applied independently to each dim of each sample. Similar to pack batch semantics used throughout torchaudio.

waveform (ND): (batch, ..., time)
coeffs   (1D): (filter_order, )
result   (ND): (batch, ..., time)

2. With bank (coeff is 2D), without grouping (say, group=None)

2.1. without batch dim

I think we should always require the waveform to be more than 1D here. If the input is a single-channel signal, the shape should be [1, time]; that case is handled in 2.2. (The shape pattern being rejected would be:)

waveform (1D): (time)
coeffs   (2D): (bank, filter_order)
result   (2D): (bank, time)

2.2. with batch dim

waveform (2D): (batch, time)
coeffs   (2D): (bank, filter_order)
result   (3D): (batch, bank, time)

2.3. with batch and possibly more dims

waveform    (ND): (batch, ..., time)
coeffs      (2D): (bank, filter_order)
result   (N+1 D): (batch, ..., bank, time)

3. With bank and grouping (say, group>0)

3.1. without batch

waveform (2D): (channel, time)
coeffs   (2D): (bank, filter_order)  # channel // group == bank
result   (2D): (bank, time)

3.2. with batch and possibly more dims

waveform  (ND): (batch, ..., channel, time)
coeffs    (2D): (bank, filter_order)  # channel // group == bank
result (N+1 D): (batch, ..., bank, time)
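The cases above can be summarized by a small shape-inference helper. This is only a sketch of the proposed semantics; `infer_lfilter_shape` and its `grouped` flag are hypothetical names, not torchaudio's API:

```python
def infer_lfilter_shape(waveform_shape, coeffs_shape, grouped=False):
    """Return the output shape under the case 1/2/3 semantics sketched above."""
    if len(coeffs_shape) == 1:
        # case 1: a single filter, the input shape is preserved
        return tuple(waveform_shape)
    bank = coeffs_shape[0]
    if not grouped:
        # case 2: a bank axis is inserted just before time; 1D input rejected (2.1)
        if len(waveform_shape) < 2:
            raise ValueError("with a filter bank, waveform must be at least 2D")
        return tuple(waveform_shape[:-1]) + (bank, waveform_shape[-1])
    # case 3: the channel axis is consumed by the bank (channel // group == bank)
    if waveform_shape[-2] % bank != 0:
        raise ValueError("channel count must be divisible by bank size")
    return tuple(waveform_shape[:-2]) + (bank, waveform_shape[-1])

print(infer_lfilter_shape((4, 100), (8, 5)))  # (4, 8, 100)
```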

yoyolicoris (Contributor, Author) commented

@nateanl
Sorry for the confusion. The batch_size I mentioned is actually the number of filters; the "real" batch dimension(s) are the dots (...).
I have updated the description; it should be easier to understand now.

@mthrok
Yes, it's actually similar to group convolution! But instead of adding a groups parameter, we can restrict it to the case where groups == out_channels and in_channels == groups or 1.

Rewriting the section 2 examples from above based on this idea:

2. With bank (coeff is 2D), without grouping

If the waveform is more than 1D, the user should manually unsqueeze the second-to-last axis before passing the waveform.

2.1. without batch dim

I actually think this case is valid, though, and the current lfilter produces exactly this behavior.

waveform (1D): (time)
coeffs   (2D): (bank, filter_order)
result   (2D): (bank, time)

2.2. with batch dim

waveform (2+1 D): (batch, 1, time)
coeffs      (2D): (bank, filter_order)
result      (3D): (batch, bank, time)

2.3. with batch and possibly more dims

waveform (N+1 D): (batch, ..., 1, time)
coeffs      (2D): (bank, filter_order)
result   (N+1 D): (batch, ..., bank, time)

3. With bank and grouping

3.1. without batch

waveform (2D): (bank, time)
coeffs   (2D): (bank, filter_order)
result   (2D): (bank, time)

3.2. with batch and possibly more dims

waveform (ND): (batch, ..., bank, time)
coeffs   (2D): (bank, filter_order)
result   (ND): (batch, ..., bank, time)
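The group-convolution analogy can be sketched for the FIR special case in NumPy. With groups == out_channels and one input channel per group, each channel only ever meets its own filter; `depthwise_fir` is a made-up name for this illustration, not part of the proposal:

```python
import numpy as np

def depthwise_fir(waveform, b_coeffs):
    """waveform: (bank, time); b_coeffs: (bank, filter_order) -> (bank, time).
    Each channel is causally convolved only with its own filter, which is the
    grouped-convolution case with groups == out_channels."""
    bank, time = waveform.shape
    out = np.empty((bank, time))
    for i in range(bank):
        # full convolution truncated to the input length = causal FIR filtering
        out[i] = np.convolve(waveform[i], b_coeffs[i])[:time]
    return out

x = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
b = np.array([[1.0, 0.5],
              [2.0, 0.0]])
print(depthwise_fir(x, b))  # row 0 -> [1, 0.5, 0, 0], row 1 -> [0, 2, 0, 0]
```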

mthrok (Contributor) commented Jul 27, 2021

@yoyololicon

If the waveform is over 1D, user should manually unsqueeze the second from last axis before passing the waveform.

So regarding the unsqueeze vs. additional parameter approach, I think adding an additional parameter for the group behavior is better. The reasons are as follows:

  1. If we use unsqueeze, we need to (1) figure out which operation mode applies and then (2) validate whether the given shape is valid under that mode. I think this will make the validation logic more complicated.
  2. This grouping behavior is somewhat unique; I personally think it's unconventional in DL libraries (let me know if you know of similar patterns in other libraries). So making it a separate operation mode with an explicit argument will give a better UX.

yoyolicoris (Contributor, Author) commented Jul 27, 2021

So regarding the unsqueeze vs. additional parameter approach, I think adding an additional parameter for the group behavior is better. The reasons are as follows:

  1. If we use unsqueeze, we need to (1) figure out which operation mode applies and then (2) validate whether the given shape is valid under that mode. I think this will make the validation logic more complicated.
  2. This grouping behavior is somewhat unique; I personally think it's unconventional in DL libraries (let me know if you know of similar patterns in other libraries). So making it a separate operation mode with an explicit argument will give a better UX.

@mthrok
Understood. I'm ok with the parameter approach.

I suggest the parameter group can just be a boolean value, so that when group == True, channels should always equal bank (as in the Sec. 3 examples): each channel is filtered with its own filter and there is no cross-channel operation, which makes the condition channel // group == bank redundant.
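The validation that a boolean flag implies could look roughly like this. This is a hypothetical helper mirroring the constraints discussed (channels == n_filters when the flag is on), not the merged implementation:

```python
def check_lfilter_args(waveform_shape, coeffs_shape, batching):
    """Validate shapes under the proposed boolean-flag semantics."""
    if batching:
        if len(coeffs_shape) != 2:
            raise ValueError("batching=True requires 2D coefficients")
        n_filters = coeffs_shape[0]
        # each channel is filtered by its own filter, so channels must equal
        # n_filters -- no channel // group == bank bookkeeping needed
        if len(waveform_shape) < 2 or waveform_shape[-2] != n_filters:
            raise ValueError(
                "batching=True requires waveform of shape (..., n_filters, time)")
    # batching=False with 2D coefficients: a bank axis is added instead

check_lfilter_args((3, 8, 100), (8, 5), batching=True)  # passes silently
```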

mthrok (Contributor) commented Jul 28, 2021

I suggest the parameter group can just be a boolean value, so that when group == True, channels should always equal bank (as in the Sec. 3 examples): each channel is filtered with its own filter and there is no cross-channel operation, which makes the condition channel // group == bank redundant.

Sure, I am okay with a boolean. Also, we can think of a better name than group.

@yoyolicoris yoyolicoris marked this pull request as ready for review August 2, 2021 03:27
@yoyolicoris yoyolicoris changed the title from "Draft: lfilter with batch support" to "feat: lfilter with batch support" Aug 2, 2021
@mthrok mthrok (Contributor) left a comment

Hi @yoyololicon

Thanks for the PR. Please let me know your thoughts on my comments.

batching (bool, optional): Activate when coefficients are in 2D. If ``True``, then waveform should be at least
2D, and the size of second axis from last should equals to ``num_filters``.
The output can be expressed as ``output[..., i, :] = lfilter(waveform[..., i, :],
a_coeffs[i], b_coeffs[i], clamp=clamp, batching=False)``. (Default: ``False``)
mthrok (Contributor) commented:

The output can be expressed as output[..., i, :] = lfilter(waveform[..., i, :], a_coeffs[i], b_coeffs[i], clamp=clamp, batching=False)

Can you remind me why the interleaved behavior is necessary, when the above can achieve the same result?
I guess computational efficiency is one reason, and autograd support is the second?

yoyolicoris (Contributor, Author) commented Aug 4, 2021

The main reason is computational efficiency: this removes an extra for-loop. I can benchmark this feature if needed.
Both ways support autograd, though.

if batching:
    assert waveform.ndim > 1
    assert waveform.shape[-2] == a_coeffs.shape[0]
else:
    waveform = torch.stack([waveform] * a_coeffs.shape[0], -2)
mthrok (Contributor) commented:

This shape manipulation looks very complicated, and I wonder if it is because we have pack/unpack batch in the following section. If so, is there a way to fold the pack/unpack batch below into the batching mechanism?

yoyolicoris (Contributor, Author) commented Aug 4, 2021

I think it's not related to pack/unpack batch, and I haven't come up with a way to integrate the two parts.
The waveform and coefficients should have an equal number of channels before pack batch; this line just replicates the waveform to match the desired shape.
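The replication step described here can be sketched in NumPy (the PR itself uses torch.stack; the shapes below are illustrative):

```python
import numpy as np

waveform = np.random.randn(3, 100)   # (..., time), no bank axis yet
n_filters = 4
# replicate the waveform along a new second-to-last axis so that every
# filter in the bank sees its own copy of the signal
replicated = np.stack([waveform] * n_filters, axis=-2)
print(replicated.shape)  # (3, 4, 100)
```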

@yoyolicoris yoyolicoris requested a review from mthrok August 6, 2021 08:59
Lower delays coefficients are first, e.g. ``[b0, b1, b2, ...]``.
Must be same size as a_coeffs (pad with 0's as necessary).
clamp (bool, optional): If ``True``, clamp the output signal to be in the range [-1, 1] (Default: ``True``)
batching (bool, optional): Activate when coefficients are in 2D. If ``True``, then waveform should be at least
mthrok (Contributor) commented:

Suggested change
batching (bool, optional): Activate when coefficients are in 2D. If ``True``, then waveform should be at least
batching (bool, optional): Effective only when coefficients are 2D. If ``True``, then waveform should be at least

yoyolicoris (Contributor, Author) commented:

Thanks. Should I make another PR to fix the docstring?

mthrok (Contributor) commented:

If you have time, please!

@mthrok mthrok (Contributor) left a comment

LGTM. Thanks!

@mthrok mthrok merged commit 8094751 into pytorch:main Aug 10, 2021
@yoyolicoris yoyolicoris deleted the feat/lfilter-batch-filter branch August 10, 2021 23:57
mthrok pushed a commit to mthrok/audio that referenced this pull request Dec 13, 2022
Successfully merging this pull request may close these issues.

Add batch dimension inside the computation of lfilter
