Description
🐛 Bug
The sampling behavior of the validation dataloader changes when using a CombinedLoader with ddp compared to using a single dataloader. The CombinedLoader does not split and distribute the validation dataset across the GPUs; instead, every GPU receives the full validation set. The problem is resolved by explicitly passing a DistributedSampler to the dataloader (see the workaround sketch after the expected output below).
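For context, a minimal sketch of the two setups (not the gist verbatim; the `RandomDataset` that returns its own indices, dataset size 64, and batch size 8 are illustrative assumptions; in PL 1.3, `CombinedLoader` lives in `pytorch_lightning.trainer.supporters`):

```python
from torch.utils.data import DataLoader, Dataset
from pytorch_lightning.trainer.supporters import CombinedLoader


class RandomDataset(Dataset):
    """Returns its own indices so per-GPU sharding is visible in the output."""

    def __init__(self, length=64):
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        return idx


# Single dataloader: under ddp, Lightning injects a DistributedSampler,
# so each GPU receives a disjoint shard of the validation set.
def val_dataloader(self):
    return DataLoader(RandomDataset(), batch_size=8)


# CombinedLoader: no DistributedSampler is injected, so every GPU
# iterates over the full validation set.
def val_dataloader_combined(self):
    return CombinedLoader({"a": DataLoader(RandomDataset(), batch_size=8)})
```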
Please reproduce using the BoringModel
https://gist.github.com/lukashermann/b19964ba32c9bde241be3e54deea01ad
To Reproduce
To reproduce, run the file and check the command-line output.
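In case the gist is unavailable, the per-device index lists shown below can be collected with something like the following (a hypothetical sketch, not the gist's exact code; `self.seen` is assumed to be initialized to an empty list in `__init__`, and the dict key `"a"` matches the CombinedLoader sketch above):

```python
def validation_step(self, batch, batch_idx):
    # With CombinedLoader({"a": loader}) the batch arrives as a dict of tensors.
    indices = batch["a"] if isinstance(batch, dict) else batch
    self.seen.extend(indices.tolist())

def on_validation_epoch_end(self):
    # Each rank prints the indices it actually iterated over.
    print("device", self.device, sorted(self.seen))
    self.seen.clear()
```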
Expected behavior
Single dataloader:
device cuda:1 [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63]
device cuda:0 [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62]
Combined dataloader:
device cuda:0 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]
device cuda:1 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63]
Is this behavior intended?
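The workaround mentioned above looks roughly like this (a sketch, assuming the ddp process group is already initialized so `DistributedSampler` can infer rank and world size, and reusing the hypothetical `RandomDataset` from the sketch above):

```python
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler
from pytorch_lightning.trainer.supporters import CombinedLoader


def val_dataloader(self):
    dataset = RandomDataset()
    # Passing the sampler explicitly restores the single-dataloader behavior:
    # each rank gets a disjoint shard of the validation set.
    sampler = DistributedSampler(dataset, shuffle=False)
    loader = DataLoader(dataset, batch_size=8, sampler=sampler)
    return CombinedLoader({"a": loader})
```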
Environment
- CUDA:
- GPU:
- GeForce RTX 2080 Ti
- GeForce RTX 2080 Ti
- available: True
- version: 11.1
- Packages:
- numpy: 1.19.2
- pyTorch_debug: False
- pyTorch_version: 1.8.0
- pytorch-lightning: 1.3.0rc1
- tqdm: 4.53.0
- System:
- OS: Linux
- architecture:
- 64bit
- ELF
- processor: x86_64
- python: 3.8.5
- version: #78-Ubuntu SMP Fri Mar 19 13:29:52 UTC 2021