README updates #180
Thanks for the changes -- this looks good to me. Was there anything else that needed to be updated as part of this PR? Given the scope mentioned in the description, some of the items below could go into a separate PR.
- Include contribution guidelines
- dimension naming
- optimized for ML not general signal processing
- optimized for PyTorch with GPU support
- can be made a layer ("trainable")
- can read flac and all kinds of other formats because of sox? (if we want to keep sox)
I think we already mention the supported formats: https://github.com/pytorch/audio/pull/180/files#diff-04c6e90faac2675aa89e2176d2eec7d8R15
vincentqb left a comment
Since the PR just got merged, let's do a quick follow-up then :)
> the audio domain. By supporting PyTorch, torchaudio will follow the same philosophy
> of providing strong GPU acceleration, having a focus on trainable features through
> the autograd system, and having consistent style (tensor names and dimension names).
> Therefore, it will be primarily a machine learning library and not a general signal
Let's use present tense.
> * MuLawDecode: (channel, time) -> (channel, time)
> * Resample: (channel, time) -> (channel, time)
> With torchaudio being a machine learning library and built on top of PyTorch,
> torchaudio is standardized around the following naming conventions. In particular,
I'd remove "In particular" here.
> With torchaudio being a machine learning library and built on top of PyTorch,
> torchaudio is standardized around the following naming conventions. In particular,
> tensors are assumed to have channel as the first dimension and time as the last
> dimension (when applicable). This makes it consistent with PyTorch's dimensions.
We should add a quick mention of why we are consistent with PyTorch.
If we intend to eventually remove sox, writing supported formats is a promise that we'll be breaking. IMHO we'll want to make sox optional, so that won't be a problem. Nonetheless, maybe we should mention that the support comes from sox there?
We want to explain why we are choosing the conventions mentioned in #169.
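To make the `(channel, time)` convention in the quoted README text concrete, here is a minimal sketch of a shape-preserving mu-law round trip. It is an illustration only, not torchaudio's implementation: it uses nested Python lists instead of tensors, and the names `shape`, `mu_law_encode`, and `mu_law_decode` are hypothetical helpers introduced for this example.

```python
import math


def shape(waveform):
    """Return (channel, time) for a nested list laid out channel-first."""
    return (len(waveform), len(waveform[0]) if waveform else 0)


def mu_law_encode(waveform, quantization_channels=256):
    """Compand samples in [-1, 1] to integer codes: (channel, time) -> (channel, time)."""
    mu = quantization_channels - 1
    encoded = []
    for channel in waveform:  # outer axis: channel
        row = []
        for x in channel:  # inner axis: time
            compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
            row.append(int((compressed + 1) / 2 * mu + 0.5))
        encoded.append(row)
    return encoded


def mu_law_decode(codes, quantization_channels=256):
    """Expand integer codes back to samples: (channel, time) -> (channel, time)."""
    mu = quantization_channels - 1
    decoded = []
    for channel in codes:
        row = []
        for q in channel:
            y = 2 * q / mu - 1
            row.append(math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y))
        decoded.append(row)
    return decoded


# Hypothetical stereo signal: 2 channels, 4 time steps each.
waveform = [[0.0, 0.5, -0.5, 1.0], [0.0, 0.25, -0.25, -1.0]]
decoded = mu_law_decode(mu_law_encode(waveform))
print(shape(waveform), shape(decoded))  # (2, 4) (2, 4)
```

Keeping channel first and time last means each transform can loop over channels and treat the final axis as the signal, which is the consistency the README text is arguing for.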