Conversation

NarineK
Contributor

@NarineK NarineK commented Nov 18, 2019

This PR adds Gradient SHAP for layer and neuron, with documentation and test cases, including tests for data parallel.
It also moves the _compute_conv_delta_and_format_attrs function to common.py so that it can be used by both DeepLift and Gradient SHAP.
Some of the auxiliary test functions are now also shared between Gradient SHAP and DeepLift.

@@ -454,6 +508,7 @@ def _data_parallel_test_assert(
)
else:
attributions_orig = attr_orig.attribute(**kwargs)
self.setUp()
Contributor Author


I added this for determinism, in order to reinitialize the seeds. I'll pass a seed to attribute in a separate PR.
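A minimal sketch of what re-seeding buys here (the helper name `reset_seeds` is hypothetical, not a Captum utility): resetting all RNGs before each attribution call makes sampling-based methods draw identical random baselines and noise on repeated runs.

```python
import random

import torch


def reset_seeds(seed: int = 1234) -> None:
    # Reset all RNGs so that sampling-based methods such as
    # GradientShap draw identical random baselines and noise
    # on repeated runs.
    random.seed(seed)
    torch.manual_seed(seed)


reset_seeds()
first = torch.rand(3)
reset_seeds()
second = torch.rand(3)
assert torch.equal(first, second)
```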

@NarineK NarineK requested review from vivekmig and orionr November 18, 2019 18:11
Contributor

@vivekmig vivekmig left a comment


Looks good 👍 ! Sorry for the delay in reviewing. Just some nits on documentation and suggestions for tests.

It adds white noise to each input sample `n_samples` times, selects a
random baseline from baselines' distribution and a random point along the
path between the baseline and the input, and computes the gradient of outputs
with respect to those selected random points. The final SHAP values represent
Contributor


nit: Can update this to explain the logic for layer, e.g. gradient of output with respect to layer evaluation at selected random point

random baseline from baselines' distribution and a random point along the
path between the baseline and the input, and computes the gradient of outputs
with respect to those selected random points. The final SHAP values represent
the expected values of gradients * (inputs - baselines).
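The procedure described in this docstring can be sketched as follows. `gradient_shap_1d` is a hypothetical single-tensor illustration of the estimator, not Captum's implementation:

```python
import torch


def gradient_shap_1d(forward_fn, inp, baselines, n_samples=20, stdev=0.01, seed=0):
    # Sketch of the GradientShap estimate described above:
    # E[ grad f(baseline + alpha * (noisy_input - baseline)) * (input - baseline) ]
    torch.manual_seed(seed)
    total = torch.zeros_like(inp)
    for _ in range(n_samples):
        noisy = inp + stdev * torch.randn_like(inp)       # add white noise
        idx = torch.randint(len(baselines), (1,)).item()  # random baseline
        base = baselines[idx]
        alpha = torch.rand(1)                             # random point on the path
        point = (base + alpha * (noisy - base)).detach().requires_grad_(True)
        (grad,) = torch.autograd.grad(forward_fn(point).sum(), point)
        total = total + grad * (inp - base)               # gradients * (inputs - baselines)
    return total / n_samples
```

For a linear model the gradient is constant, so with a single zero baseline the estimate reduces exactly to the weights times the input, which is a convenient sanity check.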
Contributor


nit: Same here, layer evaluation at inputs and baselines.

tensor must correspond to the batch size. It will be
repeated for each `n_steps` for each randomly generated
input sample.
Note that the gradients are not computed with respect
Contributor


nit: attributions are not computed

**attributions** or 2-element tuple of **attributions**, **delta**:
- **attributions** (*tensor* or tuple of *tensors*):
Attribution score computed based on GradientSHAP with
respect to each input feature. Attributions will always be
Contributor


nit: with respect to each neuron in input / output of layer


input_baseline_scaled = tuple(
self._scale_input(input, baseline, rand_coefficient)
for input, baseline in zip(inputs, baselines)
Contributor


Do we want to expose this class for direct usage as well? If so, we need to also format baselines (and probably inputs as well) here, otherwise calling this with the default baselines of None will fail. It seems like the same is true for InputBaselineXGradient

Contributor Author


Currently, this class is not exposed. I was thinking of exposing it, but that can also be done in a separate PR.
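If the class is later exposed, the `None` default would need to be normalized before use. A hypothetical sketch of such a formatting step (the function name and behavior are illustrative, not Captum's internal API):

```python
import torch


def format_inputs_and_baselines(inputs, baselines=None):
    # Hypothetical normalization step: wrap a bare tensor into a tuple
    # and replace baselines=None with zero tensors matching each input,
    # so a direct call with the default baselines does not fail.
    if isinstance(inputs, torch.Tensor):
        inputs = (inputs,)
    if baselines is None:
        baselines = tuple(torch.zeros_like(inp) for inp in inputs)
    elif isinstance(baselines, torch.Tensor):
        baselines = (baselines,)
    return inputs, baselines
```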

return_convergence_delta=True,
attribute_to_layer_input=attribute_to_layer_input,
)
assertTensorAlmostEqual(self, attrs[0], expected, 0.005)
Contributor


nit: Can change this to attrs and just obtain the tensor to be consistent with other layer methods?

n_samples = 10

# 10-class classification model
model = SoftmaxModel(num_in, 20, 10)
Contributor


nit: Same here, weights in SoftmaxModel depend on the random initialization, which may not be consistent between versions, so expected attributions could change? Could possibly switch to a model with deterministic weights, e.g. BasicModel_MultiLayer or BasicModel_ConvNet_One_Conv?

Contributor Author


This class is used in many places. We can set fixed weights for SoftmaxModel in a separate PR.
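One way to make such a test independent of PyTorch's default initialization, sketched with a hypothetical helper (not part of Captum's test utilities):

```python
import torch
import torch.nn as nn


def deterministic_linear(num_in: int, num_out: int) -> nn.Linear:
    # Fill parameters with fixed values so expected attributions in
    # tests do not depend on PyTorch's default random initialization,
    # which may change between versions.
    layer = nn.Linear(num_in, num_out)
    with torch.no_grad():
        layer.weight.copy_(
            torch.arange(num_out * num_in, dtype=torch.float32).reshape(num_out, num_in)
        )
        layer.bias.zero_()
    return layer
```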

if callable(baselines):
baselines = baselines(inputs)

baselines = torch.mean(baselines[0], axis=0, keepdim=True)
Contributor


I think it isn't necessarily true that averaging the baseline values and computing Neuron IG with respect to that should match NeuronGradientShap for any non-linear model. Could we instead possibly compute neuron IG attributions with respect to each baseline and average those? I think that should theoretically match NeuronGradientShap with sufficient samples (and small stdev), so hopefully the delta can be reduced with that test as well.
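The distinction can be checked in closed form. For f(x) = x**2, Integrated Gradients against baseline b is exactly x**2 - b**2, so averaging attributions over baselines generally differs from attributing against the averaged baseline (numbers below are for illustration only):

```python
def ig_square(x: float, b: float) -> float:
    # Closed-form Integrated Gradients for f(x) = x**2:
    # (x - b) * integral over a in [0, 1] of 2 * (b + a * (x - b)) da = x**2 - b**2
    return x * x - b * b


x, baselines = 3.0, [0.0, 2.0]
mean_of_attrs = sum(ig_square(x, b) for b in baselines) / len(baselines)  # (9 + 5) / 2 = 7
attr_of_mean = ig_square(x, sum(baselines) / len(baselines))              # 9 - 1 = 8
assert mean_of_attrs != attr_of_mean
```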

from captum.attr._core.neuron.neuron_integrated_gradients import (
NeuronIntegratedGradients,
)

Contributor


nit: For both neuron and layer, could potentially add a test with multiple input tensors / additional args to confirm these work fine?

baselines=baselines,
additional_forward_args=None,
test_batches=False,
)
Contributor


nit: Could potentially set alt_device_ids=True on some of these tests / add tests with it, since this verifies that things work appropriately with different device id orderings (essentially that device_ids are being passed appropriately).

@NarineK NarineK force-pushed the add_layer_neuron_gradient_shap branch from 6f866b8 to 8297a0f Compare November 23, 2019 04:35
Contributor

@facebook-github-bot facebook-github-bot left a comment


@NarineK has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

miguelmartin75 pushed a commit to miguelmartin75/captum that referenced this pull request Dec 20, 2019
Summary:
This PR adds Gradient SHAP for layer and neuron, with documentation and test cases, including tests for data parallel.
It also moves the `_compute_conv_delta_and_format_attrs` function to `common.py` so that it can be used by both DeepLift and Gradient SHAP.
Some of the auxiliary test functions are now also shared between Gradient SHAP and DeepLift.
Pull Request resolved: pytorch#175

Differential Revision: D18680948

Pulled By: NarineK

fbshipit-source-id: 578c756db09e4069c422dca1d0fb2c360b19d950
miguelmartin75 pushed a commit to miguelmartin75/captum that referenced this pull request Dec 20, 2019
NarineK added a commit to NarineK/captum-1 that referenced this pull request Nov 19, 2020