replace new_like with wrap_like #6718
Conversation
datumbox left a comment
LGTM, thanks @pmeier! I only have a couple of nit comments but nothing major.
@vfdev-5 my understanding is that it's still worth proceeding with some of the ideas from #6681 to reduce the `__torch_function__` calls. Could you confirm that this PR doesn't affect the approach you are favouring to solve this?
vfdev-5 left a comment
lgtm, let's move on! Thanks @pmeier
Conflicts:
torchvision/prototype/transforms/_auto_augment.py
torchvision/prototype/transforms/_color.py
torchvision/prototype/transforms/functional/_augment.py
As discussed offline, there are multiple ways we can approach the interface design. Nothing is set in stone here. We'll move forward with what I have proposed, with the strong possibility of refactoring later. The performance gain is too high to hold this up with bikeshedding.
Summary:
* replace new_like with wrap_like
* fix videos
* revert casting in favor of ignoring mypy

Reviewed By: NicolasHug
Differential Revision: D40427465
fbshipit-source-id: 04b854225fe6a886cbe468b1277a0b73ca273885
Throughout this comment I'm using `Image` as a proxy for all of our features, for simplicity.

This is my take on reducing the overhead transforms v2 has. Currently, we are using the idiom

`vision/torchvision/prototype/features/_image.py`, line 136 in 7eb5d7f
everywhere to wrap a plain tensor into the image feature. Doing so results in multiple `__torch_function__` calls, as detailed in #6681. Similar to the constructor, the `Image.new_like` method accepts arbitrary `data: Any` as input and thus has to go through the constructor every time. However, we never call it without a tensor input. Plus, whenever we pass `dtype` to `Image.new_like`, it is not to change the `dtype` of the tensor to be wrapped, but rather to retain it:

`vision/torchvision/prototype/features/_bounding_box.py`, line 158 in 7eb5d7f
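To make the current behavior concrete, here is a minimal sketch of the `new_like` idiom described above. The class is a simplified, hypothetical stand-in, not the actual torchvision implementation:

```python
from typing import Any, Optional

import torch


class Image(torch.Tensor):
    color_space: str

    def __new__(
        cls,
        data: Any,
        *,
        color_space: Optional[str] = None,
        dtype: Optional[torch.dtype] = None,
        device: Optional[torch.device] = None,
    ) -> "Image":
        # Arbitrary `data` forces a full tensor construction every time.
        tensor = torch.as_tensor(data, dtype=dtype, device=device)
        image = tensor.as_subclass(cls)
        # The real constructor guesses the color space from the data;
        # hardcoded here for brevity.
        image.color_space = color_space if color_space is not None else "RGB"
        return image

    @classmethod
    def new_like(
        cls,
        other: "Image",
        data: Any,
        *,
        dtype: Optional[torch.dtype] = None,
        device: Optional[torch.device] = None,
        color_space: Optional[str] = None,
    ) -> "Image":
        # Always routes through the constructor, even when `data` is
        # already a tensor. Note that `dtype` and `device` default to the
        # values of `other`, i.e. they are passed to retain them, not to
        # change them.
        return cls(
            data,
            dtype=dtype if dtype is not None else other.dtype,
            device=device if device is not None else other.device,
            color_space=color_space if color_space is not None else other.color_space,
        )
```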
Taking this one step further, this also means that the `new_like` name is somewhat misleading. Yes, one gets a new `features.Image` object, but unlike the `torch.*_like` methods, we don't get new storage unless the `dtype` or `device` is changed.

This PR proposes to fix the above by refactoring `Image.new_like` into `Image.wrap_like`. As opposed to `new_like`, `wrap_like` only takes a tensor to be wrapped as well as the metadata for the specific type, i.e. `color_space` for `features.Image`. This removes the need to go through the constructor and results in no `__torch_function__` calls at all.
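A minimal sketch of what the proposed `wrap_like` could look like, following the description above (again a simplified stand-in, not the exact torchvision code):

```python
from typing import Optional

import torch


class Image(torch.Tensor):
    color_space: str

    @classmethod
    def wrap_like(
        cls, other: "Image", tensor: torch.Tensor, *, color_space: Optional[str] = None
    ) -> "Image":
        # The input is known to be a tensor, so it can simply be
        # re-interpreted as an Image: no constructor call, and thus no
        # __torch_function__ calls.
        image = tensor.as_subclass(cls)
        # Metadata is either passed explicitly (already of the correct
        # type) or carried over from the reference object.
        image.color_space = color_space if color_space is not None else other.color_space
        return image
```

A typical call site inside a kernel would then read something like `output = Image.wrap_like(image, output_tensor)`, carrying the metadata of the input over to the plain output tensor.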
We can estimate the impact of this change on one classification training. Although the overhead is quite low, at roughly 20 µs per sample or 1 µs per call, the enormous number of calls during a full training blows this up to a significant increase. However, running the same benchmark on `main` yields roughly 10x those numbers.

To put it in words, this PR achieves roughly a 10x reduction of the overhead.
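For reference, a toy micro-benchmark along these lines could look as follows. The class is once more a hypothetical stand-in, and the absolute numbers will not match the ones quoted above:

```python
import timeit
from typing import Any

import torch


class Image(torch.Tensor):
    """Toy stand-in for features.Image, for timing purposes only."""

    color_space: str

    def __new__(cls, data: Any, *, color_space: str = "RGB") -> "Image":
        # Constructor path: full tensor construction from arbitrary data.
        image = torch.as_tensor(data).as_subclass(cls)
        image.color_space = color_space
        return image

    @classmethod
    def wrap_like(cls, other: "Image", tensor: torch.Tensor) -> "Image":
        # wrap_like path: re-interpret the existing tensor, copy metadata.
        image = tensor.as_subclass(cls)
        image.color_space = other.color_space
        return image


reference = Image(torch.rand(3, 224, 224))
tensor = torch.rand(3, 224, 224)

n = 100_000
ctor = timeit.timeit(lambda: Image(tensor), number=n)
wrap = timeit.timeit(lambda: Image.wrap_like(reference, tensor), number=n)
print(f"constructor: {ctor / n * 1e6:.2f} µs/call")
print(f"wrap_like:   {wrap / n * 1e6:.2f} µs/call")
```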
The only thing we lose is the ability to pass arbitrary data to the wrapping function, or to change the `dtype` and `device` in the process. We do neither, and I currently don't see a use case for either.

Note that this doesn't affect the ability to pass arbitrary data to the constructor; that is still supported. Plus, in contrast to the proposed `wrap_like` function, the constructor may also process the metadata, like guessing the `color_space` if none is passed to `features.Image`, whereas `wrap_like` either takes a value that is already of the correct type or takes the value from the reference object.