Skip to content

Conversation

@jamarshon
Copy link
Contributor

We want to explain why we are choosing the conventions mentioned in #169.

@jamarshon jamarshon requested review from cpuhrsch and vincentqb July 29, 2019 20:23
Copy link
Contributor

@vincentqb vincentqb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes -- this looks good to me. Was there anything else that needed to be updated as part of this PR? Given the scope mentioned in description, some of the items below could be part of a separate PR.

  • Include contribution guidelines
  • dimension naming
  • optimized for ML not general signal processing
  • optimized for pytorch with GPU support
  • can be made a layer ("trainable")
  • can read flac and all kinds of other formats because of sox? (if we want to keep sox)

@jamarshon
Copy link
Contributor Author

@cpuhrsch cpuhrsch merged commit ae3070c into pytorch:master Jul 30, 2019
Copy link
Contributor

@vincentqb vincentqb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the PR just got merge, let's do a quick follow-up then :)

the audio domain. By supporting PyTorch, torchaudio will follow the same philosophy
of providing strong GPU acceleration, having a focus on trainable features through
the autograd system, and having consistent style (tensor names and dimension names).
Therefore, it will be primarily a machine learning library and not a general signal
Copy link
Contributor

@vincentqb vincentqb Jul 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use present tense.

* MuLawDecode: (channel, time) -> (channel, time)
* Resample: (channel, time) -> (channel, time)
With torchaudio being a machine learning library and built on top of PyTorch,
torchaudio is standardized around the following naming conventions. In particular,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove "In particular" here.

With torchaudio being a machine learning library and built on top of PyTorch,
torchaudio is standardized around the following naming conventions. In particular,
tensors are assumed to have channel as the first dimension and time as the last
dimension (when applicable). This makes it consistent with PyTorch's dimensions.
Copy link
Contributor

@vincentqb vincentqb Jul 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add a quick mention why we are consistent with PyTorch.

@vincentqb
Copy link
Contributor

vincentqb commented Jul 30, 2019

I think we mention formats supported already https://github.com/pytorch/audio/pull/180/files#diff-04c6e90faac2675aa89e2176d2eec7d8R15

If we intend to eventually remove sox, writing supported formats is a promise that we'll be breaking. IMHO we'll want to make sox optional, so that won't be a problem. Nonetheless, maybe we should mention that the support comes from sox there?

@jamarshon jamarshon deleted the mani branch July 30, 2019 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants