RFC-0007: Forward AD #11
Open
albanD wants to merge 2 commits into pytorch:master from albanD:rfc-0007
Conversation
This was referenced Dec 11, 2020
Add a rendered link! https://github.com/albanD/rfcs/blob/rfc-0007/RFC-0007-forward-AD.md
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 14, 2020:
RFC: pytorch/rfcs#11

This PR adds the basic logic to handle forward grad as dual Tensors. It contains the following:
- Mechanism to save dual state on a Tensor and clear it up when the dual level ends
- C++ and python user-facing API
- Updated view system that is able to track both forward and backward views

The current PR has the following limitations:
- Extensive tests are in the next PR in the stack, as formulas are needed to write full tests.
- Only the manual formulas have been audited; no other formula is actually implemented here (they are in the next PR in the stack).
- Only level 0 is allowed for now. It was discussed and agreed that more is not needed for the first version of this PR.
- We can save one ViewInfo creation when both the forward and backward views have the same base. This can be done by adding a boolean flag to the DifferentiableViewMeta and extra logic in the `as_view` method. This is left out to keep this PR concise.
- We can skip tracking forward views if the base has a forward grad. This can be done by adding extra logic in the `as_view` method. This is left out to keep this PR concise.

Reading guide:
- Updated view handling in [gen_variable_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-f6553cec68caeaea36f6c8b14ff76a6d39dfd774e0ea9ef2f76e8d81fd9af5df), [VariableTypeUtils.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-ec71cfa45954dece1236c661d170e6341879c5be637f4abf52e826d61b40695a), [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285) (skip code below "[Forward Grad View]" for now), [variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-1604bcd0e4350ed99ec45e437cee7ac9ebe337392c9ea16a236247aeeb35b02bR266-R542) and [custom_function.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-dd85f452082b5bb6612bbc12adb496f8827defa228509f7b493de1d517522d5d).
- New forward grad class that handles storing gradients and tracking at each level: [forward_grad.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c6c5b9ab2d7e5dde4102495faa1b6bbbfc23aa3e47deb7359c0bfe1eb004c0cb), [forward_grad.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-de2ab54ade7312701850d71a119a4f4ee4b9fc5a9c42a467cdd4e73c033531dd) and [build_variables.bzl](https://github.com/pytorch/pytorch/pull/49097/files#diff-dfdfa2efb17beddfd9094524f95351fd197db6c8857e96b436fb599870359325).
- Lowest level API and binding between Tensor and AutogradMeta in [TensorBody.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-7554853205392fa743357bf845ecc350a974ec049383248c12daaf2f4de04911), [TensorImpl.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-052bd9150ef8e09289ddf644b5a6830ede49207201cd41728f6d7cc6d9cead94), [TensorImpl.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-a15aae4cf23da44970db7cece62ff981265575c798c62f7b52d87c8809dfe2e1) and the rest of [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285R557-R677).
- API to access the forward primal, which needs to be a differentiable function (and so lives in native_functions.yaml): [native_functions.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991), [NamedRegistrations.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-69bd3bea510c9b64e1633fa18c3ea63d4b8348dbad3a78ad9de844ab3e43dc1d), [VariableMethodsStub.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-23f5fcb737a2b289811fe0f4b65aef775e7c824b2e629ecd343df51405cd434f), [derivatives.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_python_functions.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_trace_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-54e0b976027bf8debefb959ff360b89ae93466970c843365b1b3a03806d868ce), [TraceTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-f34636741ad4a23d018e0c289bc750c3bad887b45660e1d6eaf440d234a78fbf) and [part of VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R198-R243).
- c++ API: [autograd.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-349028fbe8291a965a7a263c323b208fe071c35c66179ee997ef84fa81aa4b1e), [autograd.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-a3fe908d67dfec16a1fcde300de68b0701bf68b88db7451f29f2bee255cf30c9).
- python binding: [init.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-c58a67c85191c22c9b3bb439117d8053edfd9dea839fa010cf967d404c3c630d).
- python API: [forward_ad.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a4efad4ba18fffdfb264c21e5475997a24a743089a899f8ec1a5ff962c6738d9), [autograd/__init__.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-743abcafd32ad0e69f39ac5a91df4197b7e1921c135cacee7ef6dc829a8a7af8).
- c++ and python printing: [Formatting.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-881dba501e71662e2e4818b4b016f739b344c8aed2f5edc6b871eda47a2aced0), [_tensor_str.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a7911f8d5e73adbff914d99fd7818ace2a7030b6a3748abe06ec6fc6e3df9cc3).
- Utility for formulas and updated manual functions that respect the new view system as well as forward grad: [FunctionsManual.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-6378bb6dc81a64dab676d61731341fa5d1088418f32a1473a33a0ccfc2357dc1), [FunctionsManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-4adbd88239afcd60e8198aab65d4f5e43b62314e34b80551e997a1ea503adea5), [rest of VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R264-R433).
- Ensure SavedVariable saves forward grad properly: [saved_variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c1b8039d776241abe177d5aa99b79dd9489a9b3e529da8ab24c2e386c1238ae2), [saved_variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-cc9fba479b5beae06b2eea2e390d17796e0341c5b037a20b5bcaccbb0c341030).

[ghstack-poisoned]
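To make the dual-Tensor workflow described above concrete, here is a minimal sketch of how the user-facing python API is meant to be used. It is written against the `torch.autograd.forward_ad` module as it ships in later PyTorch releases (`dual_level`, `make_dual`, `unpack_dual`); the exact names exposed by this PR's `forward_ad.py` may differ, so treat this as an illustration rather than the PR's final API.

```python
import torch
import torch.autograd.forward_ad as fwAD

primal = torch.randn(3)
tangent = torch.randn(3)  # direction for the directional derivative (JVP)

# Forward-mode state only lives inside a dual level; it is cleared when the level ends.
with fwAD.dual_level():
    dual = fwAD.make_dual(primal, tangent)   # attach the tangent to the primal
    out = torch.sin(dual) * 2                # ordinary ops propagate the tangent
    value, jvp = fwAD.unpack_dual(out)       # primal result and its forward grad

# jvp holds d(2*sin(x))/dx applied to tangent, i.e. 2 * cos(primal) * tangent
print(torch.allclose(jvp, 2 * torch.cos(primal) * tangent))
```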
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 14, 2020:
RFC: pytorch/rfcs#11

This PR adds:
- Codegen support to define forward grad formulas, and a few manual formulas
- Codegen support to automatically generate formulas, as well as a few usages
- Tests for the basic forward grad components

Codegen-generated examples follow. For each of them, the only part that changes is the if statement, just before the return, that checks whether a forward grad is defined.

For a manual entry:
```yaml
- name: max(Tensor self) -> Tensor
  self: evenly_distribute_backward(grad, self, result)
  result: max_forward(self_fw_grad, self, result)
```
```cpp
Tensor max(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<MaxBackward1> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<MaxBackward1>(new MaxBackward1(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
    self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::max(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "max");
  if (isFwGradDefined(self)) {
    auto self_fw_grad = toLegacyFwGrad(self);
    auto self_primal = toLegacyPrimal(self);
    auto result_new_fw_grad = max_forward(self_fw_grad, self_primal, result);
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  if (grad_fn) {
    grad_fn->result_ = SavedVariable(result, true);
  }
  return result;
}
```

For an element-wise entry:
```yaml
- name: abs(Tensor self) -> Tensor
  self: grad * self.sgn()
  result: auto_element_wise
```
```cpp
Tensor abs(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<AbsBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<AbsBackward>(new AbsBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
    self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::abs(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "abs");
  if (isFwGradDefined(self)) {
    auto self_fw_grad = toLegacyFwGrad(self);
    auto self_primal = toLegacyPrimal(self);
    auto result_new_fw_grad = self_fw_grad * self_primal.sgn();
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  return result;
}
```

For a linear entry:
```yaml
- name: clone(Tensor self, *, MemoryFormat? memory_format=None) -> Tensor
  self: grad
  result: auto_linear
```
```cpp
Tensor clone(const Tensor & self, c10::optional<MemoryFormat> memory_format) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<CloneBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<CloneBackward>(new CloneBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
    self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::clone(self_, memory_format);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  if (isFwGradDefined(self)) {
    auto self_fw_grad = toLegacyFwGrad(self);
    auto result_new_fw_grad = at::clone(self_fw_grad, memory_format);
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  return result;
}
```

For no entry:
```yaml
- name: angle(Tensor self) -> Tensor
  self: angle_backward(grad, self)
```
```cpp
Tensor angle(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<AngleBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<AngleBackward>(new AngleBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
    self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::angle(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "angle");
  TORCH_CHECK(!(isFwGradDefined(self)), "Trying to use forward prop with angle that does not support it.");
  return result;
}
```

[ghstack-poisoned]
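As a quick sanity check of what the generated element-wise forward formula computes (`self_fw_grad * self_primal.sgn()` in the abs example above), the snippet below compares the forward-mode JVP of `abs` against that expression directly. It assumes the python forward-AD API of later PyTorch releases (`torch.autograd.forward_ad`), so it is a sketch of the intended behaviour rather than code from this PR.

```python
import torch
import torch.autograd.forward_ad as fwAD

x = torch.randn(4)
t = torch.randn(4)  # tangent

with fwAD.dual_level():
    dual = fwAD.make_dual(x, t)
    _, jvp = fwAD.unpack_dual(torch.abs(dual))

# auto_element_wise reuses the backward formula (grad * self.sgn()),
# substituting the input tangent for grad and the primal for self.
print(torch.allclose(jvp, t * x.sgn()))
```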
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 14, 2020:
RFC: pytorch/rfcs#11

This PR adds the option to check forward grad using gradcheck. The current logic is:
- Forward grad is always checked
- If the forward evaluation fails because an op is not implemented, the test silently passes

The goal is to make sure that all formulas that are added are properly tested without having to add a new test for each op. The final logic, after the next PR that adds the remaining formulas, is going to be:
- Forward grad is always checked
- Failure due to a not-implemented op is an actual failure
- Users should set `check_forward=False` if they explicitly don't want to test forward grads (which should not be the case internally)

[ghstack-poisoned]
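For reference, this is roughly how the forward-grad check is exercised from the user side. The PR text calls the opt-out flag `check_forward`; in released PyTorch the corresponding `torch.autograd.gradcheck` argument is named `check_forward_ad`, which is what this sketch assumes.

```python
import torch
from torch.autograd import gradcheck

def fn(x):
    return (x.exp() + x).sum()

# gradcheck requires double precision inputs with requires_grad set.
x = torch.randn(5, dtype=torch.double, requires_grad=True)

# Checks backward-mode gradients and, with check_forward_ad=True, also
# compares forward-mode (JVP) results against numerical derivatives.
assert gradcheck(fn, (x,), check_forward_ad=True)
```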
albanD
added a commit
to pytorch/pytorch
that referenced
this pull request
Dec 14, 2020
RFC: pytorch/rfcs#11 This PR add the basic logic to handle forward grad as dual Tensors. It contains the following: - Mechanism to save dual state on a Tensor and clear it up when the dual level ends - C++ and python user facing API - Updated view system that is able to track both forward and backward views The current PR has the following limitations: - Extensive tests are in the next PR in the stack as formulas are needed to write full tests. - Only the manual formulas have been audited and no other formula is actually implemented here (they are in the next PR in the stack) - Only level 0 is allowed for now. This was discussed and agreed that it is not needed for the first version of this PR. - We can save one ViewInfo creation when both the forward and backward views have the same base. This can be done by adding a boolean flag to the DifferentiableViewMeta and extra logic in the `as_view` method. This is left out to keep this PR concise. - We can skip tracking forward views if the base has a forward grad. This can be done by adding extra logic in the `as_view` method. This is left out to keep this PR concise. Reading guide: - Updated view handling in [gen_variable_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-f6553cec68caeaea36f6c8b14ff76a6d39dfd774e0ea9ef2f76e8d81fd9af5df), [VariableTypeUtils.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-ec71cfa45954dece1236c661d170e6341879c5be637f4abf52e826d61b40695a), [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285) (skip code below "[Forward Grad View]" for now), [variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-1604bcd0e4350ed99ec45e437cee7ac9ebe337392c9ea16a236247aeeb35b02bR266-R542) and [custom_function.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-dd85f452082b5bb6612bbc12adb496f8827defa228509f7b493de1d517522d5d). This introduces the new ViewInfo to hold view informations shared for forward and backward. It also updates the differentiable view meta to use this. And it updates the as_view function to handle both forward and backward view. - New forward grad class that handle storing gradients and tracking at each level [forward_grad.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c6c5b9ab2d7e5dde4102495faa1b6bbbfc23aa3e47deb7359c0bfe1eb004c0cb), [forward_grad.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-de2ab54ade7312701850d71a119a4f4ee4b9fc5a9c42a467cdd4e73c033531dd) and [build_variables.bzl](https://github.com/pytorch/pytorch/pull/49097/files#diff-dfdfa2efb17beddfd9094524f95351fd197db6c8857e96b436fb599870359325). 
- Lowest level API and binding between Tensor and AutogradMeta in [TensorBody.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-7554853205392fa743357bf845ecc350a974ec049383248c12daaf2f4de04911), [TensorImpl.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-052bd9150ef8e09289ddf644b5a6830ede49207201cd41728f6d7cc6d9cead94), [TensorImpl.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-a15aae4cf23da44970db7cece62ff981265575c798c62f7b52d87c8809dfe2e1) and the rest of [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285R557-R677) - API to access the forward primal that needs to be a differentiable function (and so in native_functions.yaml) [native_functions.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991) [NamedRegistrations.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-69bd3bea510c9b64e1633fa18c3ea63d4b8348dbad3a78ad9de844ab3e43dc1d), [VariableMethodsStub.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-23f5fcb737a2b289811fe0f4b65aef775e7c824b2e629ecd343df51405cd434f), [derivatives.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_python_functions.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_trace_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-54e0b976027bf8debefb959ff360b89ae93466970c843365b1b3a03806d868ce), [TraceTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-f34636741ad4a23d018e0c289bc750c3bad887b45660e1d6eaf440d234a78fbf) and [part of VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R198-R243) - c++ API [autograd.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-349028fbe8291a965a7a263c323b208fe071c35c66179ee997ef84fa81aa4b1e), [autograd.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-a3fe908d67dfec16a1fcde300de68b0701bf68b88db7451f29f2bee255cf30c9) - python binding [init.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-c58a67c85191c22c9b3bb439117d8053edfd9dea839fa010cf967d404c3c630d) - python API [forward_ad.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a4efad4ba18fffdfb264c21e5475997a24a743089a899f8ec1a5ff962c6738d9), [autograd/__init__.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-743abcafd32ad0e69f39ac5a91df4197b7e1921c135cacee7ef6dc829a8a7af8) - c++ and python printing [Formatting.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-881dba501e71662e2e4818b4b016f739b344c8aed2f5edc6b871eda47a2aced0), [_tensor_str.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a7911f8d5e73adbff914d99fd7818ace2a7030b6a3748abe06ec6fc6e3df9cc3) - Utility for formulas and updated manual functions to respect new view system as well as forward grad [FunctionsManual.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-6378bb6dc81a64dab676d61731341fa5d1088418f32a1473a33a0ccfc2357dc1), [FunctionsManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-4adbd88239afcd60e8198aab65d4f5e43b62314e34b80551e997a1ea503adea5) [rest of 
VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R264-R433) - Ensure SavedVariable save forward grad properly [saved_variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c1b8039d776241abe177d5aa99b79dd9489a9b3e529da8ab24c2e386c1238ae2), [saved_variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-cc9fba479b5beae06b2eea2e390d17796e0341c5b037a20b5bcaccbb0c341030) [ghstack-poisoned]
albanD
added a commit
to pytorch/pytorch
that referenced
this pull request
Dec 14, 2020
RFC: pytorch/rfcs#11 This PR adds: - Codegen support to define forward grad formulas and few manual formulas - Codegen support to automatically generate formulas as well as few usage - Tests for basic forward grad components Codegen generated examples. For each of them, the only part that is changed is the if statement before the return checking for fw grad defined. - For manual entry: ```yaml - name: max(Tensor self) -> Tensor self: evenly_distribute_backward(grad, self, result) result: max_forward(self_fw_grad, self, result) ``` ```cpp Tensor max(const Tensor & self) { auto& self_ = unpack(self, "self", 0); auto _any_requires_grad = compute_requires_grad( self ); std::shared_ptr<MaxBackward1> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<MaxBackward1>(new MaxBackward1(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->self_ = SavedVariable(self, false); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto tmp = ([&]() { at::AutoNonVariableTypeMode non_var_type_mode(true); return at::max(self_); })(); auto result = std::move(tmp); #ifndef NDEBUG if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr()); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } throw_error_for_complex_autograd(result, "max"); if (isFwGradDefined(self)) { auto self_fw_grad = toLegacyFwGrad(self); auto self_primal = toLegacyPrimal(self); auto result_new_fw_grad = max_forward(self_fw_grad, self_primal, result); if (result_new_fw_grad.defined()) { result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false); } } if (grad_fn) { grad_fn->result_ = SavedVariable(result, true); } return result; } ``` - For element wise entry: ```yaml - name: abs(Tensor self) -> Tensor self: grad * self.sgn() result: auto_element_wise ``` ```cpp Tensor abs(const Tensor & self) { auto& self_ = unpack(self, "self", 0); auto _any_requires_grad = compute_requires_grad( self ); std::shared_ptr<AbsBackward> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<AbsBackward>(new AbsBackward(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->self_ = SavedVariable(self, false); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ? 
c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto tmp = ([&]() { at::AutoNonVariableTypeMode non_var_type_mode(true); return at::abs(self_); })(); auto result = std::move(tmp); #ifndef NDEBUG if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr()); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } throw_error_for_complex_autograd(result, "abs"); if (isFwGradDefined(self)) { auto self_fw_grad = toLegacyFwGrad(self); auto self_primal = toLegacyPrimal(self); auto result_new_fw_grad = self_fw_grad * self_primal.sgn(); if (result_new_fw_grad.defined()) { result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false); } } return result; } ``` - For linear entry: ```yaml - name: clone(Tensor self, *, MemoryFormat? memory_format=None) -> Tensor self: grad result: auto_linear ``` ```cpp Tensor clone(const Tensor & self, c10::optional<MemoryFormat> memory_format) { auto& self_ = unpack(self, "self", 0); auto _any_requires_grad = compute_requires_grad( self ); std::shared_ptr<CloneBackward> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<CloneBackward>(new CloneBackward(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto tmp = ([&]() { at::AutoNonVariableTypeMode non_var_type_mode(true); return at::clone(self_, memory_format); })(); auto result = std::move(tmp); #ifndef NDEBUG if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr()); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } if (isFwGradDefined(self)) { auto self_fw_grad = toLegacyFwGrad(self); auto result_new_fw_grad = at::clone(self_fw_grad, memory_format); if (result_new_fw_grad.defined()) { result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false); } } return result; } ``` - For no entry: ```yaml - name: angle(Tensor self) -> Tensor self: angle_backward(grad, self) ``` ```cpp Tensor angle(const Tensor & self) { auto& self_ = unpack(self, "self", 0); auto _any_requires_grad = compute_requires_grad( self ); std::shared_ptr<AngleBackward> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<AngleBackward>(new AngleBackward(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->self_ = SavedVariable(self, false); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ? 
c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto tmp = ([&]() { at::AutoNonVariableTypeMode non_var_type_mode(true); return at::angle(self_); })(); auto result = std::move(tmp); #ifndef NDEBUG if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr()); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } throw_error_for_complex_autograd(result, "angle"); TORCH_CHECK(!(isFwGradDefined(self)), "Trying to use forward prop with angle that does not support it."); return result; } ``` [ghstack-poisoned]
albanD
added a commit
to pytorch/pytorch
that referenced
this pull request
Dec 14, 2020
RFC: pytorch/rfcs#11 This PR adds the option to check forward grad using gradcheck. The current logic is: - Forward grad is always checked - If the forward evaluation fails because an op is not implemented, the test is silently passing The goal is to make sure that all formulas that are added are properly tested without having to add a new test for each op. The final logic after the next PR that adds the remaining formulas is going to be: - Forward grad is always checked - Failure with not implemented op is an actual failure - Users should set `check_forward=False` if they explicitly don't want to test forward grads (which should not be the case internally). [ghstack-poisoned]
albanD
added a commit
to pytorch/pytorch
that referenced
this pull request
Dec 14, 2020
RFC: pytorch/rfcs#11 This PR adds the option to check forward grad using gradcheck. The current logic is: - Forward grad is always checked - If the forward evaluation fails because an op is not implemented, the test is silently passing The goal is to make sure that all formulas that are added are properly tested without having to add a new test for each op. The final logic after the next PR that adds the remaining formulas is going to be: - Forward grad is always checked - Failure with not implemented op is an actual failure - Users should set `check_forward=False` if they explicitly don't want to test forward grads (which should not be the case internally). [ghstack-poisoned]
albanD
added a commit
to pytorch/pytorch
that referenced
this pull request
Dec 15, 2020
RFC: pytorch/rfcs#11 This PR add the basic logic to handle forward grad as dual Tensors. It contains the following: - Mechanism to save dual state on a Tensor and clear it up when the dual level ends - C++ and python user facing API - Updated view system that is able to track both forward and backward views The current PR has the following limitations: - Extensive tests are in the next PR in the stack as formulas are needed to write full tests. - Only the manual formulas have been audited and no other formula is actually implemented here (they are in the next PR in the stack) - Only level 0 is allowed for now. This was discussed and agreed that it is not needed for the first version of this PR. - We can save one ViewInfo creation when both the forward and backward views have the same base. This can be done by adding a boolean flag to the DifferentiableViewMeta and extra logic in the `as_view` method. This is left out to keep this PR concise. - We can skip tracking forward views if the base has a forward grad. This can be done by adding extra logic in the `as_view` method. This is left out to keep this PR concise. Reading guide: - Updated view handling in [gen_variable_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-f6553cec68caeaea36f6c8b14ff76a6d39dfd774e0ea9ef2f76e8d81fd9af5df), [VariableTypeUtils.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-ec71cfa45954dece1236c661d170e6341879c5be637f4abf52e826d61b40695a), [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285) (skip code below "[Forward Grad View]" for now), [variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-1604bcd0e4350ed99ec45e437cee7ac9ebe337392c9ea16a236247aeeb35b02bR266-R542) and [custom_function.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-dd85f452082b5bb6612bbc12adb496f8827defa228509f7b493de1d517522d5d). This introduces the new ViewInfo to hold view informations shared for forward and backward. It also updates the differentiable view meta to use this. And it updates the as_view function to handle both forward and backward view. - New forward grad class that handle storing gradients and tracking at each level [forward_grad.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c6c5b9ab2d7e5dde4102495faa1b6bbbfc23aa3e47deb7359c0bfe1eb004c0cb), [forward_grad.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-de2ab54ade7312701850d71a119a4f4ee4b9fc5a9c42a467cdd4e73c033531dd) and [build_variables.bzl](https://github.com/pytorch/pytorch/pull/49097/files#diff-dfdfa2efb17beddfd9094524f95351fd197db6c8857e96b436fb599870359325). EDIT: These files also contain the new flag to globally disable forward AD that allows us to reduce performance issues while this is in development. 
- Lowest level API and binding between Tensor and AutogradMeta in [TensorBody.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-7554853205392fa743357bf845ecc350a974ec049383248c12daaf2f4de04911), [TensorImpl.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-052bd9150ef8e09289ddf644b5a6830ede49207201cd41728f6d7cc6d9cead94), [TensorImpl.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-a15aae4cf23da44970db7cece62ff981265575c798c62f7b52d87c8809dfe2e1) and the rest of [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285R557-R677) - API to access the forward primal that needs to be a differentiable function (and so in native_functions.yaml) [native_functions.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991) [NamedRegistrations.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-69bd3bea510c9b64e1633fa18c3ea63d4b8348dbad3a78ad9de844ab3e43dc1d), [VariableMethodsStub.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-23f5fcb737a2b289811fe0f4b65aef775e7c824b2e629ecd343df51405cd434f), [derivatives.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_python_functions.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_trace_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-54e0b976027bf8debefb959ff360b89ae93466970c843365b1b3a03806d868ce), [TraceTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-f34636741ad4a23d018e0c289bc750c3bad887b45660e1d6eaf440d234a78fbf) and [part of VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R198-R243) - c++ API [autograd.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-349028fbe8291a965a7a263c323b208fe071c35c66179ee997ef84fa81aa4b1e), [autograd.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-a3fe908d67dfec16a1fcde300de68b0701bf68b88db7451f29f2bee255cf30c9) - python binding [init.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-c58a67c85191c22c9b3bb439117d8053edfd9dea839fa010cf967d404c3c630d) - python API [forward_ad.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a4efad4ba18fffdfb264c21e5475997a24a743089a899f8ec1a5ff962c6738d9), [autograd/__init__.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-743abcafd32ad0e69f39ac5a91df4197b7e1921c135cacee7ef6dc829a8a7af8) - c++ and python printing [Formatting.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-881dba501e71662e2e4818b4b016f739b344c8aed2f5edc6b871eda47a2aced0), [_tensor_str.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a7911f8d5e73adbff914d99fd7818ace2a7030b6a3748abe06ec6fc6e3df9cc3) - Utility for formulas and updated manual functions to respect new view system as well as forward grad [FunctionsManual.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-6378bb6dc81a64dab676d61731341fa5d1088418f32a1473a33a0ccfc2357dc1), [FunctionsManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-4adbd88239afcd60e8198aab65d4f5e43b62314e34b80551e997a1ea503adea5) [rest of 
VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R264-R433) - Ensure SavedVariable save forward grad properly [saved_variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c1b8039d776241abe177d5aa99b79dd9489a9b3e529da8ab24c2e386c1238ae2), [saved_variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-cc9fba479b5beae06b2eea2e390d17796e0341c5b037a20b5bcaccbb0c341030) [ghstack-poisoned]
albanD
added a commit
to pytorch/pytorch
that referenced
this pull request
Dec 15, 2020
RFC: pytorch/rfcs#11 This PR adds: - Codegen support to define forward grad formulas and few manual formulas - Codegen support to automatically generate formulas as well as few usage - Tests for basic forward grad components Codegen generated examples. For each of them, the only part that is changed is the if statement before the return checking for fw grad defined. - For manual entry: ```yaml - name: max(Tensor self) -> Tensor self: evenly_distribute_backward(grad, self, result) result: max_forward(self_fw_grad, self, result) ``` ```cpp Tensor max(const Tensor & self) { auto& self_ = unpack(self, "self", 0); auto _any_requires_grad = compute_requires_grad( self ); std::shared_ptr<MaxBackward1> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<MaxBackward1>(new MaxBackward1(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->self_ = SavedVariable(self, false); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto tmp = ([&]() { at::AutoNonVariableTypeMode non_var_type_mode(true); return at::max(self_); })(); auto result = std::move(tmp); #ifndef NDEBUG if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr()); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } throw_error_for_complex_autograd(result, "max"); if (isFwGradDefined(self)) { auto self_fw_grad = toLegacyFwGrad(self); auto self_primal = toLegacyPrimal(self); auto result_new_fw_grad = max_forward(self_fw_grad, self_primal, result); if (result_new_fw_grad.defined()) { result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false); } } if (grad_fn) { grad_fn->result_ = SavedVariable(result, true); } return result; } ``` - For element wise entry: ```yaml - name: abs(Tensor self) -> Tensor self: grad * self.sgn() result: auto_element_wise ``` ```cpp Tensor abs(const Tensor & self) { auto& self_ = unpack(self, "self", 0); auto _any_requires_grad = compute_requires_grad( self ); std::shared_ptr<AbsBackward> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<AbsBackward>(new AbsBackward(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->self_ = SavedVariable(self, false); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ? 
c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto tmp = ([&]() { at::AutoNonVariableTypeMode non_var_type_mode(true); return at::abs(self_); })(); auto result = std::move(tmp); #ifndef NDEBUG if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr()); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } throw_error_for_complex_autograd(result, "abs"); if (isFwGradDefined(self)) { auto self_fw_grad = toLegacyFwGrad(self); auto self_primal = toLegacyPrimal(self); auto result_new_fw_grad = self_fw_grad * self_primal.sgn(); if (result_new_fw_grad.defined()) { result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false); } } return result; } ``` - For linear entry: ```yaml - name: clone(Tensor self, *, MemoryFormat? memory_format=None) -> Tensor self: grad result: auto_linear ``` ```cpp Tensor clone(const Tensor & self, c10::optional<MemoryFormat> memory_format) { auto& self_ = unpack(self, "self", 0); auto _any_requires_grad = compute_requires_grad( self ); std::shared_ptr<CloneBackward> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<CloneBackward>(new CloneBackward(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto tmp = ([&]() { at::AutoNonVariableTypeMode non_var_type_mode(true); return at::clone(self_, memory_format); })(); auto result = std::move(tmp); #ifndef NDEBUG if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr()); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } if (isFwGradDefined(self)) { auto self_fw_grad = toLegacyFwGrad(self); auto result_new_fw_grad = at::clone(self_fw_grad, memory_format); if (result_new_fw_grad.defined()) { result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false); } } return result; } ``` - For no entry: ```yaml - name: angle(Tensor self) -> Tensor self: angle_backward(grad, self) ``` ```cpp Tensor angle(const Tensor & self) { auto& self_ = unpack(self, "self", 0); auto _any_requires_grad = compute_requires_grad( self ); std::shared_ptr<AngleBackward> grad_fn; if (_any_requires_grad) { grad_fn = std::shared_ptr<AngleBackward>(new AngleBackward(), deleteNode); grad_fn->set_next_edges(collect_next_edges( self )); grad_fn->self_ = SavedVariable(self, false); } #ifndef NDEBUG c10::optional<Storage> self__storage_saved = self_.has_storage() ? 
c10::optional<Storage>(self_.storage()) : c10::nullopt; c10::intrusive_ptr<TensorImpl> self__impl_saved; if (self_.defined()) self__impl_saved = self_.getIntrusivePtr(); #endif auto tmp = ([&]() { at::AutoNonVariableTypeMode non_var_type_mode(true); return at::angle(self_); })(); auto result = std::move(tmp); #ifndef NDEBUG if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage())); if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr()); #endif if (grad_fn) { set_history(flatten_tensor_args( result ), grad_fn); } throw_error_for_complex_autograd(result, "angle"); TORCH_CHECK(!(isFwGradDefined(self)), "Trying to use forward prop with angle that does not support it."); return result; } ``` [ghstack-poisoned]
albanD
added a commit
to pytorch/pytorch
that referenced
this pull request
Dec 15, 2020
RFC: pytorch/rfcs#11 This PR adds the option to check forward grad using gradcheck. The current logic is: - Forward grad is always checked - If the forward evaluation fails because an op is not implemented, the test is silently passing The goal is to make sure that all formulas that are added are properly tested without having to add a new test for each op. The final logic after the next PR that adds the remaining formulas is going to be: - Forward grad is always checked - Failure with not implemented op is an actual failure - Users should set `check_forward=False` if they explicitly don't want to test forward grads (which should not be the case internally). [ghstack-poisoned]
albanD
added a commit
to pytorch/pytorch
that referenced
this pull request
Dec 15, 2020
RFC: pytorch/rfcs#11 This PR add the basic logic to handle forward grad as dual Tensors. It contains the following: - Mechanism to save dual state on a Tensor and clear it up when the dual level ends - C++ and python user facing API - Updated view system that is able to track both forward and backward views The current PR has the following limitations: - Extensive tests are in the next PR in the stack as formulas are needed to write full tests. - Only the manual formulas have been audited and no other formula is actually implemented here (they are in the next PR in the stack) - Only level 0 is allowed for now. This was discussed and agreed that it is not needed for the first version of this PR. - We can save one ViewInfo creation when both the forward and backward views have the same base. This can be done by adding a boolean flag to the DifferentiableViewMeta and extra logic in the `as_view` method. This is left out to keep this PR concise. - We can skip tracking forward views if the base has a forward grad. This can be done by adding extra logic in the `as_view` method. This is left out to keep this PR concise. Reading guide: - Updated view handling in [gen_variable_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-f6553cec68caeaea36f6c8b14ff76a6d39dfd774e0ea9ef2f76e8d81fd9af5df), [VariableTypeUtils.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-ec71cfa45954dece1236c661d170e6341879c5be637f4abf52e826d61b40695a), [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285) (skip code below "[Forward Grad View]" for now), [variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-1604bcd0e4350ed99ec45e437cee7ac9ebe337392c9ea16a236247aeeb35b02bR266-R542) and [custom_function.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-dd85f452082b5bb6612bbc12adb496f8827defa228509f7b493de1d517522d5d). This introduces the new ViewInfo to hold view informations shared for forward and backward. It also updates the differentiable view meta to use this. And it updates the as_view function to handle both forward and backward view. - New forward grad class that handle storing gradients and tracking at each level [forward_grad.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c6c5b9ab2d7e5dde4102495faa1b6bbbfc23aa3e47deb7359c0bfe1eb004c0cb), [forward_grad.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-de2ab54ade7312701850d71a119a4f4ee4b9fc5a9c42a467cdd4e73c033531dd) and [build_variables.bzl](https://github.com/pytorch/pytorch/pull/49097/files#diff-dfdfa2efb17beddfd9094524f95351fd197db6c8857e96b436fb599870359325). EDIT: These files also contain the new flag to globally disable forward AD that allows us to reduce performance issues while this is in development. 
- Lowest level API and binding between Tensor and AutogradMeta in [TensorBody.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-7554853205392fa743357bf845ecc350a974ec049383248c12daaf2f4de04911), [TensorImpl.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-052bd9150ef8e09289ddf644b5a6830ede49207201cd41728f6d7cc6d9cead94), [TensorImpl.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-a15aae4cf23da44970db7cece62ff981265575c798c62f7b52d87c8809dfe2e1) and the rest of [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285R557-R677) - API to access the forward primal that needs to be a differentiable function (and so in native_functions.yaml) [native_functions.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991) [NamedRegistrations.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-69bd3bea510c9b64e1633fa18c3ea63d4b8348dbad3a78ad9de844ab3e43dc1d), [VariableMethodsStub.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-23f5fcb737a2b289811fe0f4b65aef775e7c824b2e629ecd343df51405cd434f), [derivatives.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_python_functions.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_trace_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-54e0b976027bf8debefb959ff360b89ae93466970c843365b1b3a03806d868ce), [TraceTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-f34636741ad4a23d018e0c289bc750c3bad887b45660e1d6eaf440d234a78fbf) and [part of VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R198-R243) - c++ API [autograd.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-349028fbe8291a965a7a263c323b208fe071c35c66179ee997ef84fa81aa4b1e), [autograd.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-a3fe908d67dfec16a1fcde300de68b0701bf68b88db7451f29f2bee255cf30c9) - python binding [init.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-c58a67c85191c22c9b3bb439117d8053edfd9dea839fa010cf967d404c3c630d) - python API [forward_ad.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a4efad4ba18fffdfb264c21e5475997a24a743089a899f8ec1a5ff962c6738d9), [autograd/__init__.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-743abcafd32ad0e69f39ac5a91df4197b7e1921c135cacee7ef6dc829a8a7af8) - c++ and python printing [Formatting.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-881dba501e71662e2e4818b4b016f739b344c8aed2f5edc6b871eda47a2aced0), [_tensor_str.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a7911f8d5e73adbff914d99fd7818ace2a7030b6a3748abe06ec6fc6e3df9cc3) - Utility for formulas and updated manual functions to respect new view system as well as forward grad [FunctionsManual.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-6378bb6dc81a64dab676d61731341fa5d1088418f32a1473a33a0ccfc2357dc1), [FunctionsManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-4adbd88239afcd60e8198aab65d4f5e43b62314e34b80551e997a1ea503adea5) [rest of 
VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R264-R433) - Ensure SavedVariable save forward grad properly [saved_variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c1b8039d776241abe177d5aa99b79dd9489a9b3e529da8ab24c2e386c1238ae2), [saved_variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-cc9fba479b5beae06b2eea2e390d17796e0341c5b037a20b5bcaccbb0c341030) [ghstack-poisoned]
albanD
added a commit
to pytorch/pytorch
that referenced
this pull request
Dec 15, 2020
RFC: pytorch/rfcs#11

This PR adds:
- Codegen support to define forward grad formulas, and a few manual formulas
- Codegen support to automatically generate formulas, as well as a few usages
- Tests for the basic forward grad components

Codegen-generated examples. For each of them, the only part that changes is the `if` statement before the return that checks whether a forward grad is defined.

- For a manual entry:

```yaml
- name: max(Tensor self) -> Tensor
  self: evenly_distribute_backward(grad, self, result)
  result: max_forward(self_fw_grad, self, result)
```

```cpp
Tensor max(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<MaxBackward1> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<MaxBackward1>(new MaxBackward1(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::max(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "max");
  if (isFwGradDefined(self)) {
    auto self_fw_grad = toLegacyFwGrad(self);
    auto self_primal = toLegacyPrimal(self);
    auto result_new_fw_grad = max_forward(self_fw_grad, self_primal, result);
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  if (grad_fn) {
    grad_fn->result_ = SavedVariable(result, true);
  }
  return result;
}
```

- For an element-wise entry:

```yaml
- name: abs(Tensor self) -> Tensor
  self: grad * self.sgn()
  result: auto_element_wise
```

```cpp
Tensor abs(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<AbsBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<AbsBackward>(new AbsBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::abs(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "abs");
  if (isFwGradDefined(self)) {
    auto self_fw_grad = toLegacyFwGrad(self);
    auto self_primal = toLegacyPrimal(self);
    auto result_new_fw_grad = self_fw_grad * self_primal.sgn();
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  return result;
}
```

- For a linear entry:

```yaml
- name: clone(Tensor self, *, MemoryFormat? memory_format=None) -> Tensor
  self: grad
  result: auto_linear
```

```cpp
Tensor clone(const Tensor & self, c10::optional<MemoryFormat> memory_format) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<CloneBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<CloneBackward>(new CloneBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::clone(self_, memory_format);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  if (isFwGradDefined(self)) {
    auto self_fw_grad = toLegacyFwGrad(self);
    auto result_new_fw_grad = at::clone(self_fw_grad, memory_format);
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  return result;
}
```

- For no entry:

```yaml
- name: angle(Tensor self) -> Tensor
  self: angle_backward(grad, self)
```

```cpp
Tensor angle(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<AngleBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<AngleBackward>(new AngleBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::angle(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "angle");
  TORCH_CHECK(!(isFwGradDefined(self)), "Trying to use forward prop with angle that does not support it.");
  return result;
}
```

[ghstack-poisoned]
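For intuition, the element-wise rule above means the tangent of `abs` is the input tangent multiplied by `self.sgn()`, i.e. the backward formula with `grad` replaced by the tangent. Below is a minimal sketch of how that formula can be sanity-checked from Python against the existing double-backward-based JVP; it only uses `torch.autograd.functional.jvp`, which predates this PR, and is illustrative rather than part of the generated code.

```python
import torch

# Reference JVP computed with the existing double-backward trick, used to
# sanity-check the element-wise forward formula "grad * self.sgn()" for abs.
x = torch.randn(5, dtype=torch.double)
v = torch.randn(5, dtype=torch.double)  # tangent direction

out, jvp = torch.autograd.functional.jvp(torch.abs, (x,), (v,))

# For abs, the expected tangent is v * x.sgn().
assert torch.allclose(jvp, v * x.sgn())
```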
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 15, 2020
RFC: pytorch/rfcs#11

This PR adds the option to check forward grads using gradcheck. The current logic is:
- Forward grads are always checked.
- If the forward evaluation fails because an op is not implemented, the test silently passes.

The goal is to make sure that all formulas that are added are properly tested without having to add a new test for each op. The final logic, after the next PR that adds the remaining formulas, is going to be:
- Forward grads are always checked.
- Failure on a not-implemented op is an actual failure.
- Users should set `check_forward=False` if they explicitly don't want to test forward grads (which should not be the case internally).

[ghstack-poisoned]
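For reference, a minimal sketch of how such a check would be invoked from user code. The `check_forward` flag name is the one used in this PR description and should be treated as hypothetical here; only the plain `gradcheck` call is guaranteed to exist as written.

```python
import torch
from torch.autograd import gradcheck

def fn(x):
    return (x * x).sum()

x = torch.randn(4, dtype=torch.double, requires_grad=True)

# Backward-mode check, as today. Per this PR, forward grads would be checked
# by default as well; opting out would use the flag spelled as in the PR text.
gradcheck(fn, (x,))                         # checks backward (and, per the PR, forward) grads
# gradcheck(fn, (x,), check_forward=False)  # hypothetical opt-out, name taken from the PR text
```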
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 15, 2020
RFC: pytorch/rfcs#11 This PR add the basic logic to handle forward grad as dual Tensors. It contains the following: - Mechanism to save dual state on a Tensor and clear it up when the dual level ends - C++ and python user facing API - Updated view system that is able to track both forward and backward views The current PR has the following limitations: - Extensive tests are in the next PR in the stack as formulas are needed to write full tests. - Only the manual formulas have been audited and no other formula is actually implemented here (they are in the next PR in the stack) - Only level 0 is allowed for now. This was discussed and agreed that it is not needed for the first version of this PR. - We can save one ViewInfo creation when both the forward and backward views have the same base. This can be done by adding a boolean flag to the DifferentiableViewMeta and extra logic in the `as_view` method. This is left out to keep this PR concise. - We can skip tracking forward views if the base has a forward grad. This can be done by adding extra logic in the `as_view` method. This is left out to keep this PR concise. Reading guide: - Updated view handling in [gen_variable_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-f6553cec68caeaea36f6c8b14ff76a6d39dfd774e0ea9ef2f76e8d81fd9af5df), [VariableTypeUtils.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-ec71cfa45954dece1236c661d170e6341879c5be637f4abf52e826d61b40695a), [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285) (skip code below "[Forward Grad View]" for now), [variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-1604bcd0e4350ed99ec45e437cee7ac9ebe337392c9ea16a236247aeeb35b02bR266-R542) and [custom_function.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-dd85f452082b5bb6612bbc12adb496f8827defa228509f7b493de1d517522d5d). This introduces the new ViewInfo to hold view informations shared for forward and backward. It also updates the differentiable view meta to use this. And it updates the as_view function to handle both forward and backward view. - New forward grad class that handle storing gradients and tracking at each level [forward_grad.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c6c5b9ab2d7e5dde4102495faa1b6bbbfc23aa3e47deb7359c0bfe1eb004c0cb), [forward_grad.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-de2ab54ade7312701850d71a119a4f4ee4b9fc5a9c42a467cdd4e73c033531dd) and [build_variables.bzl](https://github.com/pytorch/pytorch/pull/49097/files#diff-dfdfa2efb17beddfd9094524f95351fd197db6c8857e96b436fb599870359325). EDIT: These files also contain the new flag to globally disable forward AD that allows us to reduce performance issues while this is in development. 
- Lowest level API and binding between Tensor and AutogradMeta in [TensorBody.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-7554853205392fa743357bf845ecc350a974ec049383248c12daaf2f4de04911), [TensorImpl.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-052bd9150ef8e09289ddf644b5a6830ede49207201cd41728f6d7cc6d9cead94), [TensorImpl.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-a15aae4cf23da44970db7cece62ff981265575c798c62f7b52d87c8809dfe2e1) and the rest of [variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-60e3bfe444e89efc7149f25b38e472710525984789934ab83f1bd5671b8ff285R557-R677) - API to access the forward primal that needs to be a differentiable function (and so in native_functions.yaml) [native_functions.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-2f3dbd85efb9b5172f2264eedd3be47dd765e6ab7cc8bf3ade5e62c28ae35991) [NamedRegistrations.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-69bd3bea510c9b64e1633fa18c3ea63d4b8348dbad3a78ad9de844ab3e43dc1d), [VariableMethodsStub.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-23f5fcb737a2b289811fe0f4b65aef775e7c824b2e629ecd343df51405cd434f), [derivatives.yaml](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_python_functions.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-e4c2f99a2404e98c3586e07425da73008f36b1bada790648a7297af141d37f8c), [gen_trace_type.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-54e0b976027bf8debefb959ff360b89ae93466970c843365b1b3a03806d868ce), [TraceTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-f34636741ad4a23d018e0c289bc750c3bad887b45660e1d6eaf440d234a78fbf) and [part of VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R198-R243) - c++ API [autograd.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-349028fbe8291a965a7a263c323b208fe071c35c66179ee997ef84fa81aa4b1e), [autograd.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-a3fe908d67dfec16a1fcde300de68b0701bf68b88db7451f29f2bee255cf30c9) - python binding [init.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-c58a67c85191c22c9b3bb439117d8053edfd9dea839fa010cf967d404c3c630d) - python API [forward_ad.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a4efad4ba18fffdfb264c21e5475997a24a743089a899f8ec1a5ff962c6738d9), [autograd/__init__.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-743abcafd32ad0e69f39ac5a91df4197b7e1921c135cacee7ef6dc829a8a7af8) - c++ and python printing [Formatting.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-881dba501e71662e2e4818b4b016f739b344c8aed2f5edc6b871eda47a2aced0), [_tensor_str.py](https://github.com/pytorch/pytorch/pull/49097/files#diff-a7911f8d5e73adbff914d99fd7818ace2a7030b6a3748abe06ec6fc6e3df9cc3) - Utility for formulas and updated manual functions to respect new view system as well as forward grad [FunctionsManual.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-6378bb6dc81a64dab676d61731341fa5d1088418f32a1473a33a0ccfc2357dc1), [FunctionsManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-4adbd88239afcd60e8198aab65d4f5e43b62314e34b80551e997a1ea503adea5) [rest of 
VariableTypeManual.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-6e19a1bce8cbdba8714b6e2c794a76bc0864b64a49cfa757cb0b5afdc937d1a4R264-R433) - Ensure SavedVariable save forward grad properly [saved_variable.h](https://github.com/pytorch/pytorch/pull/49097/files#diff-c1b8039d776241abe177d5aa99b79dd9489a9b3e529da8ab24c2e386c1238ae2), [saved_variable.cpp](https://github.com/pytorch/pytorch/pull/49097/files#diff-cc9fba479b5beae06b2eea2e390d17796e0341c5b037a20b5bcaccbb0c341030) [ghstack-poisoned]
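To make the user-facing API described above concrete, here is a minimal Python sketch of the dual-level workflow: entering a dual level, attaching a tangent to a primal, reading the forward grad back, and having the dual state cleared when the level exits. It assumes the module name `torch.autograd.forward_ad` from the `forward_ad.py` file listed in the reading guide, and an op (`sin`) with a forward formula; exact names may differ from what eventually ships.

```python
import torch
import torch.autograd.forward_ad as fwAD  # module name taken from forward_ad.py above

primal = torch.randn(3)
tangent = torch.randn(3)

with fwAD.dual_level():                      # open dual level 0
    dual = fwAD.make_dual(primal, tangent)   # attach the tangent to the primal
    out = dual.sin()                         # ops propagate the tangent alongside the value
    p, t = fwAD.unpack_dual(out)             # read back primal and forward grad
    # JVP of sin is cos(x) applied to the tangent:
    assert torch.allclose(t, tangent * primal.cos())
# leaving the context clears the dual state saved on the tensors
```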
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 15, 2020
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 15, 2020
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 15, 2020
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 16, 2020
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 16, 2020
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 16, 2020
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 16, 2020
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 16, 2020
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 16, 2020
albanD added a commit to pytorch/pytorch that referenced this pull request on Dec 16, 2020
albanD added a commit to pytorch/pytorch that referenced this pull request on Mar 26, 2021
RFC: pytorch/rfcs#11

This PR adds:
- Codegen support to define forward grad formulas, and a few manual formulas
- Codegen support to automatically generate formulas, as well as a few usages
- Codegen support to materialize undefined tangents when no value is provided
- Tests for the basic forward grad components

Codegen-generated examples. For each of them, the only part that changes is the `if` statement before the return that checks whether a forward grad is defined.

- For a manual entry:

```yaml
- name: max(Tensor self) -> Tensor
  self: evenly_distribute_backward(grad, self, result)
  result: max_forward(self_t, self_p, result)
```

```cpp
Tensor max(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  (void)_any_requires_grad;
  std::shared_ptr<MaxBackward1> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<MaxBackward1>(new MaxBackward1(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::max(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "max");
  if (isFwGradDefined(self)) {
    auto self_t_raw = toLegacyFwGrad(self);
    auto self_t = self_t_raw.defined() ? self_t_raw : at::zeros_like(toLegacyTensor(self));
    auto self_p = toLegacyPrimal(self);
    auto result_new_fw_grad = max_forward(self_t, self_p, result);
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  if (grad_fn) {
    grad_fn->result_ = SavedVariable(result, true);
  }
  return result;
}
```

- For an element-wise entry:

```yaml
- name: abs(Tensor self) -> Tensor
  self: grad * self.sgn()
  result: auto_element_wise
```

```cpp
Tensor abs(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  (void)_any_requires_grad;
  std::shared_ptr<AbsBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<AbsBackward>(new AbsBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::abs(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "abs");
  if (isFwGradDefined(self)) {
    auto self_t_raw = toLegacyFwGrad(self);
    auto self_t = self_t_raw.defined() ? self_t_raw : at::zeros_like(toLegacyTensor(self));
    auto self_p = toLegacyPrimal(self);
    auto result_new_fw_grad = self_t * self_p.sgn();
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  return result;
}
```

- For a linear entry:

```yaml
- name: clone(Tensor self, *, MemoryFormat? memory_format=None) -> Tensor
  self: grad
  result: auto_linear
```

```cpp
Tensor clone(const Tensor & self, c10::optional<MemoryFormat> memory_format) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  (void)_any_requires_grad;
  std::shared_ptr<CloneBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<CloneBackward>(new CloneBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::clone(self_, memory_format);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  if (isFwGradDefined(self)) {
    auto self_t_raw = toLegacyFwGrad(self);
    auto self_t = self_t_raw.defined() ? self_t_raw : at::zeros_like(toLegacyTensor(self));
    auto result_new_fw_grad = at::clone(self_t, memory_format);
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  return result;
}
```

- For no entry:

```yaml
- name: angle(Tensor self) -> Tensor
  self: angle_backward(grad, self)
```

```cpp
Tensor angle(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  (void)_any_requires_grad;
  std::shared_ptr<AngleBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<AngleBackward>(new AngleBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved = self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::angle(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value()) AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "angle");
  TORCH_CHECK(!(isFwGradDefined(self)), "Trying to use forward AD with angle that does not support it.");
  return result;
}
```

Differential Revision: [D25607505](https://our.internmc.facebook.com/intern/diff/D25607505)

[ghstack-poisoned]
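The new "materialize undefined tangents" behavior means an input that carries no tangent is treated as if its tangent were `zeros_like(primal)`, which is what the `self_t_raw.defined() ? self_t_raw : at::zeros_like(...)` lines above implement. A small Python sketch of the observable effect, again assuming the `torch.autograd.forward_ad` user API from the earlier PRs in the stack and an op with a forward formula (here `mul`); illustrative only.

```python
import torch
import torch.autograd.forward_ad as fwAD  # assumed user-facing module, see the earlier PRs

a = torch.randn(3)
b = torch.randn(3)  # no tangent will be attached to b

with fwAD.dual_level():
    a_dual = fwAD.make_dual(a, torch.ones(3))
    out = a_dual * b                 # b has no tangent: it is materialized as zeros
    _, tangent = fwAD.unpack_dual(out)

# d(a*b) = da*b + a*db with db == 0, so the tangent is just b (times the ones tangent).
assert torch.allclose(tangent, b)
```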
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 26, 2021
…d functional API"

RFC: pytorch/rfcs#11

This PR adds the option to check forward grads using gradcheck. The current logic is:
- Forward grad is always checked.
- If the forward evaluation fails because an op is not implemented, the test silently passes.

The goal is to make sure that all formulas that are added are properly tested without having to add a new test for each op. The final logic, after the next PR that adds the remaining formulas, is going to be:
- Forward grad is always checked.
- Failure due to a not-implemented op is an actual failure.
- Users should set `check_forward=False` if they explicitly don't want to test forward grads (which should not be the case internally).

Differential Revision: [D25607502](https://our.internmc.facebook.com/intern/diff/D25607502)

[ghstack-poisoned]
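As a rough illustration of the testing flow described above, the snippet below runs gradcheck with forward-grad checking enabled for a single op. The PR text names the switch `check_forward`; the argument spelled `check_forward_ad` below is an assumption based on how `torch.autograd.gradcheck` later exposed it, so adjust the name to whichever version of the API is in use.

```python
# Minimal sketch: gradcheck verifying forward-mode gradients in addition to
# backward mode. The flag name `check_forward_ad` is an assumption here; the
# PR text refers to the switch as `check_forward`.
import torch
from torch.autograd import gradcheck

x = torch.randn(4, dtype=torch.double, requires_grad=True)

# Compares analytical Jacobian-vector products (forward mode) and
# vector-Jacobian products (backward mode) against finite differences.
ok = gradcheck(torch.abs, (x,), check_forward_ad=True)
print(ok)
```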
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 26, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 26, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 26, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 29, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 29, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 29, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 29, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 29, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 29, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 29, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Mar 29, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Apr 13, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Apr 13, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request Apr 13, 2021
albanD added a commit to pytorch/pytorch that referenced this pull request on Apr 13, 2021

RFC: pytorch/rfcs#11

This PR adds the option to check forward grad using gradcheck. The current logic is:
- Forward grad is always checked
- If the forward evaluation fails because an op is not implemented, the test silently passes

The goal is to make sure that all formulas that are added are properly tested without having to add a new test for each op.

The final logic, after the next PR that adds the remaining formulas, is going to be:
- Forward grad is always checked
- Failure with a not-implemented op is an actual failure
- Users should set `check_forward=False` if they explicitly don't want to test forward grads (which should not be the case internally)

Differential Revision: [D25607502](https://our.internmc.facebook.com/intern/diff/D25607502)
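From the test side, the behaviour described in this commit would look roughly like the sketch below. This is a minimal illustration only: the `check_forward` keyword is taken from the commit message above and may not match the signature that finally ships.

```python
import torch
from torch.autograd import gradcheck

# Double-precision inputs keep the finite-difference comparison tight.
x = torch.randn(3, dtype=torch.double, requires_grad=True)

# Backward-mode gradients are always checked; per the logic described above,
# forward-mode gradients are exercised as well whenever the op implements them.
assert gradcheck(torch.abs, (x,))

# Hypothetical opt-out flag from the commit message (not necessarily the final name):
# gradcheck(torch.abs, (x,), check_forward=False)
```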
facebook-github-bot pushed a commit to pytorch/pytorch that referenced this pull request on Apr 14, 2021

Summary: Pull Request resolved: #49098

RFC: pytorch/rfcs#11

This PR adds:
- Codegen support to define forward grad formulas and a few manual formulas
- Codegen support to automatically generate formulas, as well as a few usages
- Tests for basic forward grad components

Codegen-generated examples are shown below. For each of them, the only part that changes is the `if` statement before the return that checks whether a forward grad is defined.

- For a manual entry:

```yaml
- name: max(Tensor self) -> Tensor
  self: evenly_distribute_backward(grad, self, result)
  result: max_forward(self_fw_grad, self, result)
```

```cpp
Tensor max(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<MaxBackward1> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<MaxBackward1>(new MaxBackward1(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
      self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::max(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "max");
  if (isFwGradDefined(self)) {
    auto self_fw_grad = toLegacyFwGrad(self);
    auto self_primal = toLegacyPrimal(self);
    auto result_new_fw_grad = max_forward(self_fw_grad, self_primal, result);
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  if (grad_fn) {
    grad_fn->result_ = SavedVariable(result, true);
  }
  return result;
}
```

- For an element-wise entry:

```yaml
- name: abs(Tensor self) -> Tensor
  self: grad * self.sgn()
  result: auto_element_wise
```

```cpp
Tensor abs(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<AbsBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<AbsBackward>(new AbsBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
      self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::abs(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "abs");
  if (isFwGradDefined(self)) {
    auto self_fw_grad = toLegacyFwGrad(self);
    auto self_primal = toLegacyPrimal(self);
    auto result_new_fw_grad = self_fw_grad * self_primal.sgn();
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  return result;
}
```

- For a linear entry:

```yaml
- name: clone(Tensor self, *, MemoryFormat? memory_format=None) -> Tensor
  self: grad
  result: auto_linear
```

```cpp
Tensor clone(const Tensor & self, c10::optional<MemoryFormat> memory_format) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<CloneBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<CloneBackward>(new CloneBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
      self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::clone(self_, memory_format);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  if (isFwGradDefined(self)) {
    auto self_fw_grad = toLegacyFwGrad(self);
    auto result_new_fw_grad = at::clone(self_fw_grad, memory_format);
    if (result_new_fw_grad.defined()) {
      result.set_fw_grad(result_new_fw_grad, /* level */ 0, /* is_inplace_op */ false);
    }
  }
  return result;
}
```

- For no entry:

```yaml
- name: angle(Tensor self) -> Tensor
  self: angle_backward(grad, self)
```

```cpp
Tensor angle(const Tensor & self) {
  auto& self_ = unpack(self, "self", 0);
  auto _any_requires_grad = compute_requires_grad( self );
  std::shared_ptr<AngleBackward> grad_fn;
  if (_any_requires_grad) {
    grad_fn = std::shared_ptr<AngleBackward>(new AngleBackward(), deleteNode);
    grad_fn->set_next_edges(collect_next_edges( self ));
    grad_fn->self_ = SavedVariable(self, false);
  }
  #ifndef NDEBUG
  c10::optional<Storage> self__storage_saved =
      self_.has_storage() ? c10::optional<Storage>(self_.storage()) : c10::nullopt;
  c10::intrusive_ptr<TensorImpl> self__impl_saved;
  if (self_.defined()) self__impl_saved = self_.getIntrusivePtr();
  #endif
  auto tmp = ([&]() {
    at::AutoNonVariableTypeMode non_var_type_mode(true);
    return at::angle(self_);
  })();
  auto result = std::move(tmp);
  #ifndef NDEBUG
  if (self__storage_saved.has_value())
    AT_ASSERT(self__storage_saved.value().is_alias_of(self_.storage()));
  if (self__impl_saved) AT_ASSERT(self__impl_saved == self_.getIntrusivePtr());
  #endif
  if (grad_fn) {
    set_history(flatten_tensor_args( result ), grad_fn);
  }
  throw_error_for_complex_autograd(result, "angle");
  TORCH_CHECK(!(isFwGradDefined(self)), "Trying to use forward prop with angle that does not support it.");
  return result;
}
```

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D25607505

Pulled By: albanD

fbshipit-source-id: fe2315d587689af1cd5968536fa26c680b8b8829
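To connect the generated `auto_element_wise` code above to user-visible behaviour, here is a small sketch that pushes a tangent through `abs` and compares it against the `self_fw_grad * self_primal.sgn()` expression from the generated kernel. It assumes the `torch.autograd.forward_ad` dual-level API (`make_dual`/`unpack_dual`) that the RFC proposes; those exact names are an assumption here, not something this commit message specifies.

```python
import torch
import torch.autograd.forward_ad as fwAD  # assumed module name for the proposed API

x = torch.tensor([-2.0, 0.5, 3.0])   # primal input
t = torch.tensor([1.0, 1.0, 1.0])    # tangent, i.e. the "v" in the JVP J(x) @ v

with fwAD.dual_level():
    dual_x = fwAD.make_dual(x, t)     # attach the tangent at level 0
    dual_y = torch.abs(dual_x)        # dispatches to the generated kernel above
    y, y_t = fwAD.unpack_dual(dual_y)
    # The propagated tangent should match the element-wise formula t * sgn(x).
    assert torch.allclose(y_t, t * x.sgn())
```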
This RFC captures the discussions that happened in the main issue pytorch/pytorch#10223, as well as private conversations with different people.
It presents the design idea and implementation plan for forward AD within PyTorch.
cc @ezyang @zou3519
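As background for the proposal, the dual-number idea that forward AD is built on can be illustrated with a few lines of plain Python: every value carries a tangent, every op propagates both, and one forward pass yields a Jacobian-vector product. This toy class is only an illustration of the concept, not the PyTorch implementation described in the RFC.

```python
import math

class Dual:
    """Toy dual number: a primal value paired with a tangent (directional derivative)."""
    def __init__(self, primal, tangent=0.0):
        self.primal, self.tangent = primal, tangent

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.primal + other.primal, self.tangent + other.tangent)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.primal * other.primal,
                    self.tangent * other.primal + self.primal * other.tangent)

def sin(d):
    # Chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(d.primal), math.cos(d.primal) * d.tangent)

# Forward AD of f(x) = x * sin(x) at x = 2 with tangent 1 gives f'(2) in one pass.
x = Dual(2.0, 1.0)
y = x * sin(x)
print(y.primal)   # f(2)  = 2 * sin(2)
print(y.tangent)  # f'(2) = sin(2) + 2 * cos(2)
```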