🐛 Bug
During finetuning with complex models, a call to BaseFinetuning.unfreeze_and_add_param_group can raise the following warning:
/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py:2882: UserWarning: optimizer contains a parameter group with duplicate parameters; in future, this will cause an error; see github.com/pytorch/pytorch/issues/40967 for more information
exec(code_obj, self.user_global_ns, self.user_ns)
What happens is that, due to the way BaseFinetuning flattens the model before collecting the parameters, it's possible to list the same parameters twice. It iterates over all of the .modules(), but fails to filter them so that only the leaf modules are returned when the model is nested and has custom blocks.
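For context, a minimal callback along these lines triggers the warning. This is only a sketch, not the code from the Colab: the name MyFinetuning, the milestone epoch, and the assumption that the LightningModule keeps the nested model under pl_module.encoder are made up for illustration.

```python
from pytorch_lightning.callbacks import BaseFinetuning


class MyFinetuning(BaseFinetuning):
    # Hypothetical callback; only the BaseFinetuning hooks and helpers are real.
    def __init__(self, unfreeze_at_epoch: int = 1):
        super().__init__()
        self._unfreeze_at_epoch = unfreeze_at_epoch

    def freeze_before_training(self, pl_module):
        # Freeze the nested encoder before training starts.
        self.freeze(pl_module.encoder)

    def finetune_function(self, pl_module, epoch, optimizer, opt_idx):
        if epoch == self._unfreeze_at_epoch:
            # With a nested encoder, the flattened module list contains the same
            # layers twice, so the optimizer receives duplicate parameters and
            # PyTorch emits the UserWarning shown above.
            self.unfreeze_and_add_param_group(modules=pl_module.encoder, optimizer=optimizer)
```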
One example of a model where the problem happens is:
Sequential(
(encoder): Sequential(
(0): ConvBlock(
(conv): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1))
(act): ReLU()
(bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): ConvBlock(
(conv): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1))
(act): ReLU()
(bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(decoder): ConvBlock(
(conv): Conv2d(128, 10, kernel_size=(3, 3), stride=(1, 1))
(act): ReLU()
(bn): BatchNorm2d(10, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
A call to BaseFinetuning.flatten_modules(model) using the model above returns both the leaf modules (Conv2d, ReLU, BatchNorm2d) and the ConvBlocks, so every layer is listed twice.
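For reference, here is a minimal sketch of that repro; the ConvBlock definition is inferred from the printed repr and may differ in detail from the one in the Colab:

```python
from collections import OrderedDict

import torch.nn as nn
from pytorch_lightning.callbacks import BaseFinetuning


class ConvBlock(nn.Module):
    """Custom block reconstructed from the repr above; the Colab version may differ."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3)
        self.act = nn.ReLU()
        self.bn = nn.BatchNorm2d(out_channels)

    def forward(self, x):
        return self.bn(self.act(self.conv(x)))


model = nn.Sequential(OrderedDict([
    ("encoder", nn.Sequential(ConvBlock(3, 64), ConvBlock(64, 128))),
    ("decoder", ConvBlock(128, 10)),
]))

flat = BaseFinetuning.flatten_modules(model)
params = [p for m in flat for p in m.parameters()]
unique = {id(p) for p in params}
# Because the ConvBlocks and their Conv2d/BatchNorm2d children both end up in the
# flattened list, len(params) is larger than len(unique): the same parameters are
# collected more than once before being handed to the optimizer.
print(len(params), len(unique))
```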
Please reproduce using the BoringModel
The BoringModel is simple enough that the issue doesn't appear, so I added the simplest model I could make that reproduces the issue.
https://colab.research.google.com/drive/1-YR26kK41kCCNmaL8MYVCL8831FvbHNa?usp=sharing
Expected behavior
BaseFinetuning should not try to add the same parameter twice when unfreezing complex models.
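One possible direction, sketched below purely to illustrate the expected behaviour (this is a hypothetical helper, not the actual Lightning implementation), would be to keep only the modules without children when flattening, so that each parameter is collected exactly once:

```python
import torch.nn as nn


def flatten_leaf_modules(modules):
    """Hypothetical helper: keep only leaf modules (those without children).

    Illustrates the expected behaviour; not the actual Lightning code.
    """
    if isinstance(modules, nn.Module):
        modules = [modules]
    leaves = []
    for module in modules:
        leaves.extend(m for m in module.modules() if not list(m.children()))
    return leaves
```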
Environment
Bug reproduced using Colab, with a CPU-only runtime
- CUDA:
- GPU:
- available: False
- version: 10.1
- Packages:
- numpy: 1.19.5
- pyTorch_debug: False
- pyTorch_version: 1.8.1+cu101
- pytorch-lightning: 1.2.7
- tqdm: 4.41.1
- System:
- OS: Linux
- architecture:
- 64bit
- processor: x86_64
- python: 3.7.10
- version: #1 SMP Thu Jul 23 08:00:38 PDT 2020