By default, we have set ``find_unused_parameters=True`` for compatibility reasons observed in the past (refer to the `discussion <https://github.com/PyTorchLightning/pytorch-lightning/discussions/6219>`_ for more details). When enabled, it can result in a performance hit and can be disabled in most cases. Read more about it `here <https://pytorch.org/docs/stable/notes/ddp.html#internal-design>`_.

.. tip::
    It applies to all DDP strategies that support ``find_unused_parameters`` as input.

.. code-block:: python

    from pytorch_lightning import Trainer
    from pytorch_lightning.strategies import DDPStrategy

    trainer = Trainer(strategy=DDPStrategy(find_unused_parameters=False))
`NCCL <https://developer.nvidia.com/nccl>`__ is the NVIDIA Collective Communications Library, which PyTorch uses to handle communication across nodes and GPUs. There are reported speedups from adjusting NCCL parameters, as seen in this `issue <https://github.com/PyTorchLightning/pytorch-lightning/issues/7179>`__: a 30% speed improvement when training the Transformer XLM-RoBERTa and a 15% improvement when training with Detectron2.

NCCL parameters can be adjusted via environment variables.

.. note::
    AWS and GCP already set default values for these on their clusters. This is typically useful for custom cluster setups.
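As a sketch, NCCL tuning is done by exporting environment variables before launching training. The variables below are real NCCL settings, but the specific values are illustrative assumptions, not recommendations; the right values depend on your cluster and network, so benchmark before adopting any of them:

.. code-block:: shell

    # Illustrative NCCL tuning (values are examples, not recommendations)
    export NCCL_NSOCKS_PERTHREAD=4   # sockets handled by each communication thread
    export NCCL_SOCKET_NTHREADS=2    # CPU threads per network connection
    export NCCL_DEBUG=INFO           # log NCCL version and transport selection at init

``NCCL_DEBUG=INFO`` is useful while experimenting, since it prints which transports and interfaces NCCL actually selected; remove it for production runs to avoid log noise.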