Optim-wip: Add Activation Atlas tutorial & functions #579
Conversation
@NarineK Do you have any ideas on how to generate irregular grids of (x, y) coordinates for tests? I'm not sure how to generate the test inputs I need for the atlas-related functions. The test inputs need to let us test the minimum point density parameter as well. Edit: I think that I can just use…
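One possible approach, sketched below: cluster random points around a few centers so that some regions of the grid end up sparse enough to fall under a minimum-density threshold. The shapes, cluster counts, and spread here are made up purely for illustration, not taken from the PR's tests.

```python
import torch

# Illustrative only: fake an irregular (x, y) layout by clustering random
# points around a few centers, so some atlas grid cells end up sparse
# enough to be filtered out by a minimum point density parameter.
torch.manual_seed(0)
centers = torch.rand(4, 2)                      # 4 cluster centers in [0, 1]^2
offsets = 0.05 * torch.randn(4, 64, 2)          # 64 jittered points per center
points = (centers.unsqueeze(1) + offsets).reshape(-1, 2).clamp(0.0, 1.0)
print(points.shape)  # torch.Size([256, 2])
```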
* Update test_atlas.py
@NarineK So, Lucid calculates the activations' spatial attributions for a model's labels / classes by using this function:
When looking up how to recreate the same basic algorithm, I found that the above may not currently be possible in PyTorch, as I think it requires forward-mode AD: https://pytorch.org/docs/stable/autograd.html#torch.autograd.functional.jvp
Looking at PyTorch's development of the required feature makes it seem like it isn't coming anytime soon: pytorch/pytorch#10223, pytorch/rfcs#11. So, is there a way I can use Captum to calculate activations' spatial attributions for the different labels / classes of a model?
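For what it's worth, the linked `torch.autograd.functional.jvp` is usable in stable PyTorch today; it just computes the Jacobian-vector product via the double-backward trick rather than true forward-mode AD, so it is slower than a native forward-mode implementation. A minimal sketch with a hypothetical toy model standing in for the real network slice:

```python
import torch
from torch.autograd.functional import jvp

# Toy stand-in for a model slice mapping images to class scores; the real
# model and layer are whatever the attribution code would target.
model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(3 * 8 * 8, 10),
)

x = torch.randn(1, 3, 8, 8)   # input batch
v = torch.randn_like(x)       # tangent direction to push forward

# jvp() internally runs backward mode twice (the double-vjp trick), which
# is why it works despite forward-mode AD support being incomplete.
out, directional_grad = jvp(model, (x,), (v,))
print(out.shape, directional_grad.shape)  # both torch.Size([1, 10])
```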
* Vectorize heatmap function.
* Add sample count to vec coords.
* Add labels for inception v1 model.
I've made progress on getting the attribution stuff working:
Thank you @ProGamerGov! This week has been a bit busy. I will look into your PRs next week or on the weekend.
@NarineK No worries! Atlas attributions may have to be added in a future PR. I also don't really have a way to display the atlas attribution information right now (Lucid leaves that to Distill.pub's interactive HTML). I also used the heatmap function to help visualize the sample counts for each atlas visualization. I'm not sure if there's a better way to display this information, as it's also left to Distill.pub in Lucid.
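As a stopgap, per-cell sample counts can at least be shown as a simple heatmap; a minimal sketch, assuming a `counts` grid has already been computed (the shape and values below are placeholders):

```python
import matplotlib.pyplot as plt
import torch

# Hypothetical (grid_h, grid_w) tensor of how many activation samples
# landed in each atlas cell.
counts = torch.randint(0, 500, (20, 20))

plt.imshow(counts, cmap="viridis")
plt.colorbar(label="samples per atlas cell")
plt.title("Atlas cell sample density")
plt.show()
```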
Another potential issue is that downloading the ImageNet 2012 training dataset and collecting samples from it for the tutorial notebook could be difficult and time consuming for users. So, it may be wise to host a few pre-collected layer samples somewhere for users to download and use, as the samples are only around 100 MB for about 100k samples. Though if we did that, someone might want to collect the full 1 million samples for the sake of atlas visualization quality. Edit: I just tested the size of the samples file for Mixed5a & Mixed5b using… I'm also not sure if my…
After some testing, I think that I can make sample collection a lot faster and more memory efficient. Saving every sample batch using…
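For context, a minimal sketch of the general sample-collection pattern: grab one random spatial activation vector per image via a forward hook, then save the concatenated batches to disk. The model, layer, and loader below are stand-ins (torchvision's GoogLeNet and random tensors), not the tutorial's actual InceptionV1 helpers.

```python
import torch
import torchvision

# Stand-in model and data: GoogLeNet in place of the tutorial's InceptionV1,
# random tensors in place of resized ImageNet batches.
model = torchvision.models.googlenet(weights=None).eval()
loader = [(torch.randn(8, 3, 224, 224), None) for _ in range(4)]

samples = []

def collect(module, inp, out):
    # out: (N, C, H, W); keep one random spatial position per image -> (N, C)
    n, c, h, w = out.shape
    ys = torch.randint(h, (n,))
    xs = torch.randint(w, (n,))
    samples.append(out[torch.arange(n), :, ys, xs].detach().cpu())

handle = model.inception4c.register_forward_hook(collect)
with torch.no_grad():
    for images, _ in loader:
        model(images)
handle.remove()

# Persist all collected batches as a single (total_samples, C) tensor.
torch.save(torch.cat(samples), "inception4c_samples.pt")
```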
So, quick update! I resized every image in the ImageNet 2012 dataset to 256 and replaced the Pillow library with pillow-simd to speed things up, so sample collection memory usage and speed is no longer an issue. According to tqdm, collecting samples for the entire dataset of 1,281,167 images for all of the main layers at once took just under four and a half hours, averaging about 80 images a second.

The issue now is that I am running out of memory when running UMAP on the resulting sample tensors. I also tested the visualization steps with the sample tensors, and there were no memory issues for those. So, the problem seems limited to the UMAP calculations.

Edit: I am able to run UMAP on the 1.2 million samples from Mixed4c using an AWS instance with 64 GB of RAM. I still ran out of memory with the Mixed5b samples, though. Switching to an instance with 128 GB of RAM let me run UMAP successfully on the 1.2 million samples for Mixed5b.
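For reference, the UMAP step in question looks roughly like the sketch below; the sample shape and the `n_neighbors` / `min_dist` / `metric` values are illustrative guesses, not the tutorial's exact settings. Memory usage is dominated by the nearest-neighbor graph built over all samples, which is why the full 1.2M-sample runs needed 64-128 GB of RAM.

```python
import numpy as np
import umap  # pip install umap-learn

# Stand-in for collected activation samples: (n_samples, n_channels).
samples = np.random.rand(10_000, 528).astype(np.float32)

# Project each activation vector down to 2D (x, y) atlas coordinates.
reducer = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.01, metric="cosine")
coords = reducer.fit_transform(samples)
print(coords.shape)  # (10000, 2)
```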
* Remove old code and text for slicing incomplete atlases.
* Use more precise language.
* Improve the flow of language used between steps.
* Hopefully sample collection is easier to understand this way, as it was previously a commented-out section in the main activation atlas tutorial.
* Improved the description of activation and attribution samples in both atlas visualization notebooks.
* Also improved the activation atlas sample collection tutorial.
@NarineK There's a small section at the end of the Class Activation Atlas tutorial that demonstrates one of the ways you can use the information obtained from an activation atlas. For this section I was using two images from the Lucid servers, but they stopped working yesterday, and Chris has no idea when they'll be fixed (he doesn't have access to the server anymore). Luckily, I had both images saved on my PC in case something like this happened, so I uploaded them to a temporary GitHub link (via https://user-images.githubusercontent.com) so that the tutorial will continue to work as normal.
* Move activation atlas tutorials to their own folder. * Move activation atlas sample collection functions to the sample collection tutorial.
I may close this PR and the class atlas one in favor of newer PRs where I can squash all 95 commits into a single commit, or a small number of commits. Unless that seems like a bad idea?
@NarineK I have created a cleaned-up version of this PR, and GitHub lists it as changing 953 lines (including empty space and formatting, so in practice fewer lines were added). Do you want me to include the main tutorial in the same PR, or should I leave it for the PR with the second tutorial? I think the tutorial adds around 444 lines (including empty / formatting lines, and excluding .ipynb formatting-related code).
@ProGamerGov, thank you for the update. I think if the main PR is 953 lines and the tutorial adds 444, it's fine to have the tutorial in that same PR. So the total LOC will be 953 + 444 for that PR? What's the ID of the new PR?
This PR implements support for Activation Atlases and Class Activation Atlases, based on the Activation Atlas research paper: https://distill.pub/2019/activation-atlas/
Added atlas functions and tests for them.
Added full documentation for all the new atlas functions.
Added new Activation Atlas tutorial. The corresponding Lucid tutorial notebook can be found here.
Added new Class Activation Atlas tutorial. The corresponding Lucid tutorial notebook can be found here.
Both atlas tutorials share some of their text cells and code cells, so keep that in mind when reviewing them.
I re-added the `RandomRotation` transform, as the Torchvision one does not accept lists of degree values. Also added tests for the new `RandomRotation` transform.
Vectorized the `weights_to_heatmap_2d` heatmap function so that it's faster and more efficient. Also improved the tests for the function.
Fixed `nchannels_to_rgb` so that it works properly with CUDA inputs, and also improved the tests for the function. This function isn't used by the atlas-related code and tutorials, so it's a bit out of scope for this PR.
Added the `umap-learn` package to the tutorial requirements in `setup.py`. See https://umap-learn.readthedocs.io/en/latest/ & https://arxiv.org/abs/1802.03426 for more information about Uniform Manifold Approximation and Projection for Dimension Reduction (UMAP). UMAP is used for calculating the atlas structure.
Added a relaxed version of the MaxPool2d class for calculating neuron class attributions (a sketch of the idea follows this list).
Added new Activation Atlas sample collection tutorial. The corresponding Lucid tutorial notebook can be found here. It might be better if we had an easy-to-download dataset for demonstration purposes. Also, should the sample collection functions be inside Captum, or should they only be in the sample collection tutorial?
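On the relaxed MaxPool2d mentioned above: one common way to relax max pooling for attribution is to keep the exact max-pool values in the forward pass while routing gradients through a smoother pooling path. The sketch below shows that general trick, not necessarily the exact class added in this PR:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelaxedMaxPool2d(nn.Module):
    """Forward pass returns exact max-pool values, but gradients flow
    through the smoother avg-pool path (straight-through style). This is
    an illustrative relaxation; the PR's actual implementation may differ."""

    def __init__(self, kernel_size, stride=None, padding=0):
        super().__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding

    def forward(self, x):
        maxed = F.max_pool2d(x, self.kernel_size, self.stride, self.padding)
        avged = F.avg_pool2d(x, self.kernel_size, self.stride, self.padding)
        # Value of max-pool, gradient of avg-pool:
        return avged + (maxed - avged).detach()

# Usage: swap in place of a model's nn.MaxPool2d before computing
# class attributions.
pool = RelaxedMaxPool2d(kernel_size=3, stride=2, padding=1)
out = pool(torch.randn(1, 16, 14, 14, requires_grad=True))
```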
Atlases might also make a good banner image for Captum.