
Commit 58ff133

Merge branch 'master' into bugfix/no_ret_warn
2 parents 1a72853 + 484dce1 commit 58ff133

137 files changed: +1645, -1673 lines


.github/workflows/events-recurent.yml renamed to .github/workflows/events-recurrent.yml

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-name: Recurent events
+name: Recurrent events
 
 # https://jasonet.co/posts/scheduled-actions/
 # https://github.community/t/distinct-job-for-each-schedule/17811/2

CHANGELOG.md

Lines changed: 38 additions & 0 deletions
@@ -18,8 +18,19 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Added warning when predict returns no predictions ([#6139](https://github.com/PyTorchLightning/pytorch-lightning/pull/6139))
 
 
+- Added arg to `self.log` that enables users to give custom names when dealing with multiple dataloaders ([#6274](https://github.com/PyTorchLightning/pytorch-lightning/pull/6274))
+
+
 ### Changed
 
+- Changed the order of `backward`, `step`, `zero_grad` to `zero_grad`, `backward`, `step` ([#6147](https://github.com/PyTorchLightning/pytorch-lightning/pull/6147))
+
+
+- Changed default for DeepSpeed CPU Offload to False, due to prohibitively slow speeds at smaller scale ([#6262](https://github.com/PyTorchLightning/pytorch-lightning/pull/6262))
+
+
+- Renamed `pytorch_lightning.callbacks.swa` to `pytorch_lightning.callbacks.stochastic_weight_avg` ([#6259](https://github.com/PyTorchLightning/pytorch-lightning/pull/6259))
+
 
 ### Deprecated
 
@@ -49,6 +60,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Removed `mode='auto'` from `EarlyStopping` ([#6167](https://github.com/PyTorchLightning/pytorch-lightning/pull/6167))
 
 
+- Removed deprecated `LightningModule` `hparams` setter ([#6207](https://github.com/PyTorchLightning/pytorch-lightning/pull/6207))
+
+
 ### Fixed
 
 - Made the `Plugin.reduce` method more consistent across all Plugins to reflect a mean-reduction by default ([#6011](https://github.com/PyTorchLightning/pytorch-lightning/pull/6011))
@@ -69,6 +83,30 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Fixed multiple early stopping callbacks ([#6197](https://github.com/PyTorchLightning/pytorch-lightning/pull/6197))
 
 
+- Fixed `ModelPruning(make_pruning_permanent=True)` pruning buffers getting removed when saved during training ([#6073](https://github.com/PyTorchLightning/pytorch-lightning/pull/6073))
+
+
+- Fixed incorrect usage of `detach()`, `cpu()`, `to()` ([#6216](https://github.com/PyTorchLightning/pytorch-lightning/pull/6216))
+
+
+- Fixed LBFGS optimizer support which didn't converge in automatic optimization ([#6147](https://github.com/PyTorchLightning/pytorch-lightning/pull/6147))
+
+
+- Prevent `WandbLogger` from dropping values ([#5931](https://github.com/PyTorchLightning/pytorch-lightning/pull/5931))
+
+
+- Fixed `trainer.test` from `best_path` hangs after calling `trainer.fit` ([#6272](https://github.com/PyTorchLightning/pytorch-lightning/pull/6272))
+
+
+- Fixed duplicate logs appearing in console when using the python logging module ([#5509](https://github.com/PyTorchLightning/pytorch-lightning/pull/5509), [#6275](https://github.com/PyTorchLightning/pytorch-lightning/pull/6275))
+
+
+- Fixed `SingleTPU` calling `all_gather` ([#6296](https://github.com/PyTorchLightning/pytorch-lightning/pull/6296))
+
+
+- Fixed error thrown when using valid distributed mode in multi node ([#6297](https://github.com/PyTorchLightning/pytorch-lightning/pull/6297))
+
+
 ## [1.2.1] - 2021-02-23
 
 ### Fixed
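For readers skimming the changelog, the ordering change in #6147 is easiest to see in a plain training loop. Below is a minimal sketch of the new default order; `model`, `optimizer`, `dataloader` and `loss_fn` are illustrative placeholders, not names from this commit.

    def run_epoch(model, optimizer, dataloader, loss_fn):
        # new default order in Lightning >= 1.2.2: zero_grad -> backward -> step
        for batch, target in dataloader:
            loss = loss_fn(model(batch), target)
            optimizer.zero_grad()   # clear stale gradients before the backward pass
            loss.backward()         # compute fresh gradients
            optimizer.step()        # apply the update
            # the previous order was backward -> step -> zero_grad; the same PR
            # (#6147) also fixed LBFGS convergence in automatic optimization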

azure-pipelines.yml

Lines changed: 2 additions & 2 deletions
@@ -23,7 +23,7 @@ jobs:
   # how much time to give 'run always even if cancelled tasks' before stopping them
   cancelTimeoutInMinutes: 2
 
-  pool: dsvm-spot-pool
+  pool: gridai-spot-pool
 
   #strategy:
   #  matrix:
@@ -58,7 +58,7 @@ jobs:
     export GIT_TERMINAL_PROMPT=1
     #sudo apt-get install -y cmake
     # python -m pip install "pip==20.1"
-    pip install --requirement requirements.txt --find-links https://download.pytorch.org/whl/cpu/torch_stable.html
+    pip install --requirement requirements.txt
     python -c "fname = 'requirements/extra.txt' ; lines = [line for line in open(fname).readlines() if 'fairscale' not in line] ; open(fname, 'w').writelines(lines)"
    python -c "fname = 'requirements/extra.txt' ; lines = [line for line in open(fname).readlines() if 'horovod' not in line] ; open(fname, 'w').writelines(lines)"
    pip install --requirement ./requirements/devel.txt --upgrade-strategy only-if-needed
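The two `python -c` one-liners above strip fairscale and horovod from `requirements/extra.txt` before the extras are installed. Here is a more readable sketch of the same filter; the file name and package names come from the pipeline above, everything else is illustrative.

    # filter_extras.py - readable equivalent of the inline python -c filters above
    fname = "requirements/extra.txt"
    excluded = ("fairscale", "horovod")  # packages the CI job removes before installing

    with open(fname) as fp:
        lines = [line for line in fp if not any(pkg in line for pkg in excluded)]

    with open(fname, "w") as fp:
        fp.writelines(lines)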

docs/source/common/lightning_module.rst

Lines changed: 4 additions & 4 deletions
@@ -946,7 +946,7 @@ When set to ``False``, Lightning does not automate the optimization process. Thi
         opt = self.optimizers(use_pl_optimizer=True)
 
         loss = ...
-        self.manual_backward(loss, opt)
+        self.manual_backward(loss)
         opt.step()
         opt.zero_grad()
 
@@ -961,16 +961,16 @@ In the multi-optimizer case, ignore the ``optimizer_idx`` argument and use the o
 
     def training_step(self, batch, batch_idx, optimizer_idx):
         # access your optimizers with use_pl_optimizer=False. Default is True
-        (opt_a, opt_b) = self.optimizers(use_pl_optimizer=True)
+        opt_a, opt_b = self.optimizers(use_pl_optimizer=True)
 
         gen_loss = ...
         opt_a.zero_grad()
-        self.manual_backward(gen_loss, opt_a)
+        self.manual_backward(gen_loss)
         opt_a.step()
 
         disc_loss = ...
         opt_b.zero_grad()
-        self.manual_backward(disc_loss, opt_b)
+        self.manual_backward(disc_loss)
         opt_b.step()
 
 --------------
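The change above drops the optimizer argument from `manual_backward`. Below is a minimal sketch of a single-optimizer LightningModule using the updated call, assuming the 1.2-era `automatic_optimization` property; the layer, loss and learning rate are placeholders, not part of this commit.

    import torch
    import torch.nn.functional as F
    import pytorch_lightning as pl


    class ManualOptimModel(pl.LightningModule):
        def __init__(self):
            super().__init__()
            self.layer = torch.nn.Linear(32, 2)

        @property
        def automatic_optimization(self) -> bool:
            # hand the optimization loop over to training_step
            return False

        def training_step(self, batch, batch_idx):
            opt = self.optimizers(use_pl_optimizer=True)
            x, y = batch
            loss = F.cross_entropy(self.layer(x), y)

            opt.zero_grad()
            self.manual_backward(loss)  # no optimizer argument anymore
            opt.step()

        def configure_optimizers(self):
            return torch.optim.SGD(self.parameters(), lr=0.1)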

docs/source/common/optimizers.rst

Lines changed: 39 additions & 30 deletions
@@ -23,32 +23,31 @@ to manually manage the optimization process. To do so, do the following:
 
 * Override your LightningModule ``automatic_optimization`` property to return ``False``
 * Drop or ignore the optimizer_idx argument
-* Use `self.manual_backward(loss)` instead of `loss.backward()`.
+* Use ``self.manual_backward(loss)`` instead of ``loss.backward()``.
 
-.. note:: This is only recommended for experts who need ultimate flexibility. Lightning will handle only precision and accelerators logic. The users are left with zero_grad, accumulated_grad_batches, model toggling, etc..
+.. note:: This is only recommended for experts who need ultimate flexibility. Lightning will handle only precision and accelerators logic. The users are left with ``optimizer.zero_grad()``, gradient accumulation, model toggling, etc..
 
-.. warning:: Before 1.2, ``optimzer.step`` was calling ``zero_grad`` internally. From 1.2, it is left to the users expertize.
+.. warning:: Before 1.2, ``optimzer.step`` was calling ``optimizer.zero_grad()`` internally. From 1.2, it is left to the users expertize.
 
 .. tip:: To perform ``accumulate_grad_batches`` with one optimizer, you can do as such.
 
 .. tip:: ``self.optimizers()`` will return ``LightningOptimizer`` objects. You can access your own optimizer with ``optimizer.optimizer``. However, if you use your own optimizer to perform a step, Lightning won't be able to support accelerators and precision for you.
 
-
 .. code-block:: python
 
     def training_step(batch, batch_idx, optimizer_idx):
         opt = self.optimizers()
 
         loss = self.compute_loss(batch)
         self.manual_backward(loss)
-        opt.step()
 
         # accumulate gradient batches
         if batch_idx % 2 == 0:
+            opt.step()
             opt.zero_grad()
 
 
-.. tip:: It is a good practice to provide the optimizer with a ``closure`` function that performs a ``forward`` and ``backward`` pass of your model. It is optional for most optimizers, but makes your code compatible if you switch to an optimizer which requires a closure.
+.. tip:: It is a good practice to provide the optimizer with a ``closure`` function that performs a ``forward`` and ``backward`` pass of your model. It is optional for most optimizers, but makes your code compatible if you switch to an optimizer which requires a closure. See also `the PyTorch docs <https://pytorch.org/docs/stable/optim.html#optimizer-step-closure>`_.
 
 Here is the same example as above using a ``closure``.
 
@@ -71,7 +70,6 @@ Here is the same example as above using a ``closure``.
 .. code-block:: python
 
     # Scenario for a GAN.
-
     def training_step(...):
         opt_gen, opt_dis = self.optimizers()
 
@@ -137,8 +135,12 @@ Here is an example on how to use it:
 
 Automatic optimization
 ======================
-With Lightning most users don't have to think about when to call .backward(), .step(), .zero_grad(), since
-Lightning automates that for you.
+With Lightning most users don't have to think about when to call ``.zero_grad()``, ``.backward()`` and ``.step()``
+since Lightning automates that for you.
+
+.. warning::
+   Before 1.2.2, ``.zero_grad()`` was called after ``.backward()`` and ``.step()`` internally.
+   From 1.2.2, Lightning calls ``.zero_grad()`` before ``.backward()``.
 
 Under the hood Lightning does the following:
 
@@ -147,33 +149,33 @@ Under the hood Lightning does the following:
     for epoch in epochs:
         for batch in data:
             loss = model.training_step(batch, batch_idx, ...)
+            optimizer.zero_grad()
             loss.backward()
             optimizer.step()
-            optimizer.zero_grad()
 
-        for scheduler in schedulers:
-            scheduler.step()
+        for lr_scheduler in lr_schedulers:
+            lr_scheduler.step()
 
 In the case of multiple optimizers, Lightning does the following:
 
 .. code-block:: python
 
     for epoch in epochs:
-        for batch in data:
-            for opt in optimizers:
-                disable_grads_for_other_optimizers()
-                train_step(opt)
-                opt.step()
+        for batch in data:
+            for opt in optimizers:
+                loss = model.training_step(batch, batch_idx, optimizer_idx)
+                opt.zero_grad()
+                loss.backward()
+                opt.step()
 
-        for scheduler in schedulers:
-            scheduler.step()
+        for lr_scheduler in lr_schedulers:
+            lr_scheduler.step()
 
 
 Learning rate scheduling
 ------------------------
-Every optimizer you use can be paired with any `LearningRateScheduler <https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate>`_.
-In the basic use-case, the scheduler (or multiple schedulers) should be returned as the second output from the ``.configure_optimizers``
-method:
+Every optimizer you use can be paired with any `Learning Rate Scheduler <https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate>`_.
+In the basic use-case, the scheduler (or multiple schedulers) should be returned as the second output from the ``.configure_optimizers`` method:
 
 .. testcode::
 
@@ -262,7 +264,7 @@ returned as a dict which can contain the following keywords:
 
 Use multiple optimizers (like GANs)
 -----------------------------------
-To use multiple optimizers return > 1 optimizers from :meth:`pytorch_lightning.core.LightningModule.configure_optimizers`
+To use multiple optimizers return two or more optimizers from :meth:`pytorch_lightning.core.LightningModule.configure_optimizers`
 
 .. testcode::
 
@@ -283,13 +285,15 @@ Lightning will call each optimizer sequentially:
 .. code-block:: python
 
     for epoch in epochs:
-        for batch in data:
-            for opt in optimizers:
-                train_step(opt)
-                opt.step()
+        for batch in data:
+            for opt in optimizers:
+                loss = train_step(batch, batch_idx, optimizer_idx)
+                opt.zero_grad()
+                loss.backward()
+                opt.step()
 
-        for scheduler in schedulers:
-            scheduler.step()
+        for lr_scheduler in lr_schedulers:
+            lr_scheduler.step()
 
 ----------
 
@@ -332,7 +336,7 @@ Here we add a learning-rate warm up
         # update params
         optimizer.step(closure=closure)
 
-.. note:: The default ``optimizer_step`` is relying on the internal ``LightningOptimizer`` to properly perform a step. It handles TPUs, AMP, accumulate_grad_batches, zero_grad, and much more ...
+.. note:: The default ``optimizer_step`` is relying on the internal ``LightningOptimizer`` to properly perform a step. It handles TPUs, AMP, accumulate_grad_batches and much more ...
 
 .. testcode::
 
@@ -362,6 +366,11 @@ Using the closure functions for optimization
 
 When using optimization schemes such as LBFGS, the `second_order_closure` needs to be enabled. By default, this function is defined by wrapping the `training_step` and the backward steps as follows
 
+.. warning::
+   Before 1.2.2, ``.zero_grad()`` was called outside the closure internally.
+   From 1.2.2, the closure calls ``.zero_grad()`` inside, so there is no need to define your own closure
+   when using similar optimizers to :class:`torch.optim.LBFGS` which requires reevaluation of the loss with the closure in ``optimizer.step()``.
+
 .. testcode::
 
     def second_order_closure(pl_module, split_batch, batch_idx, opt_idx, optimizer, hidden):
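The closure-related warning added above is easier to see in plain PyTorch: `torch.optim.LBFGS.step()` re-evaluates the closure, so gradient clearing belongs inside the closure rather than after the step. A small sketch with a placeholder model and random data follows.

    import torch
    import torch.nn.functional as F

    model = torch.nn.Linear(4, 1)
    optimizer = torch.optim.LBFGS(model.parameters(), lr=0.1)
    x, y = torch.randn(16, 4), torch.randn(16, 1)

    def closure():
        # LBFGS may call this several times per step, so zero_grad() lives here
        optimizer.zero_grad()
        loss = F.mse_loss(model(x), y)
        loss.backward()
        return loss

    for _ in range(5):
        optimizer.step(closure)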

docs/source/common/trainer.rst

Lines changed: 2 additions & 2 deletions
@@ -254,10 +254,10 @@ You can also modify hardware behavior by subclassing an existing accelerator to
 
 Example::
 
-    class MyOwnDDP(DDPAccelerator):
+    class MyOwnAcc(Accelerator):
         ...
 
-    Trainer(accelerator=MyOwnDDP())
+    Trainer(accelerator=MyOwnAcc())
 
 .. warning:: Passing in custom accelerators is experimental but work is in progress to enable full compatibility.
 

docs/source/extensions/logging.rst

Lines changed: 10 additions & 4 deletions
@@ -259,13 +259,19 @@ Configure console logging
 *************************
 
 Lightning logs useful information about the training process and user warnings to the console.
-You can retrieve the Lightning logger and change it to your liking. For example, increase the logging level
-to see fewer messages like so:
+You can retrieve the Lightning logger and change it to your liking. For example, adjust the logging level
+or redirect output for certain modules to log files:
 
-.. code-block:: python
+.. testcode::
 
     import logging
-    logging.getLogger("lightning").setLevel(logging.ERROR)
+
+    # configure logging at the root level of lightning
+    logging.getLogger("pytorch_lightning").setLevel(logging.ERROR)
+
+    # configure logging on module level, redirect to file
+    logger = logging.getLogger("pytorch_lightning.core")
+    logger.addHandler(logging.FileHandler("core.log"))
 
 Read more about custom Python logging `here <https://docs.python.org/3/library/logging.html>`_.
 
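If the redirected file should also carry timestamps, a formatter can be attached to the handler. This is a small sketch building on the snippet above; the format string is an assumption, not part of this commit.

    import logging

    # quiet the console at the package root, as in the docs snippet above
    logging.getLogger("pytorch_lightning").setLevel(logging.ERROR)

    # send pytorch_lightning.core records to a file with timestamps
    file_handler = logging.FileHandler("core.log")
    file_handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
    logging.getLogger("pytorch_lightning.core").addHandler(file_handler)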

docs/source/starter/introduction_guide.rst

Lines changed: 2 additions & 2 deletions
@@ -361,9 +361,9 @@ The training step is what happens inside the training loop.
     # TRAINING STEP
     # ....
     # TRAINING STEP
+    optimizer.zero_grad()
     loss.backward()
     optimizer.step()
-    optimizer.zero_grad()
 
 In the case of MNIST, we do the following
 
@@ -377,9 +377,9 @@ In the case of MNIST, we do the following
     loss = F.nll_loss(logits, y)
     # ------ TRAINING STEP END ------
 
+    optimizer.zero_grad()
     loss.backward()
     optimizer.step()
-    optimizer.zero_grad()
 
 In Lightning, everything that is in the training step gets organized under the
 :func:`~pytorch_lightning.core.LightningModule.training_step` function in the LightningModule.

docs/source/starter/new-project.rst

Lines changed: 10 additions & 11 deletions
@@ -248,7 +248,7 @@ as long as you return a loss with an attached graph from the `training_step`, Li
 .. code-block:: python
 
     def training_step(self, batch, batch_idx):
-        loss = self.encoder(batch[0])
+        loss = self.encoder(batch)
         return loss
 
 .. _manual_opt:
@@ -267,19 +267,18 @@ Turn off automatic optimization and you control the train loop!
 
     def training_step(self, batch, batch_idx, optimizer_idx):
         # access your optimizers with use_pl_optimizer=False. Default is True
-        (opt_a, opt_b, opt_c) = self.optimizers(use_pl_optimizer=True)
+        opt_a, opt_b = self.optimizers(use_pl_optimizer=True)
 
-        loss_a = self.generator(batch[0])
-
-        # use this instead of loss.backward so we can automate half precision, etc...
-        self.manual_backward(loss_a, opt_a, retain_graph=True)
-        self.manual_backward(loss_a, opt_a)
-        opt_a.step()
+        loss_a = self.generator(batch)
         opt_a.zero_grad()
+        # use `manual_backward()` instead of `loss.backward` to automate half precision, etc...
+        self.manual_backward(loss_a)
+        opt_a.step()
 
-        loss_b = self.discriminator(batch[0])
-        self.manual_backward(loss_b, opt_b)
-        ...
+        loss_b = self.discriminator(batch)
+        opt_b.zero_grad()
+        self.manual_backward(loss_b)
+        opt_b.step()
 
 
 Predict or Deploy
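The training_step above assumes automatic optimization is off and two optimizers exist; those pieces are not shown in the diff. Here is a hedged sketch of how they might look, with the generator/discriminator modules and learning rates as placeholders.

    import torch
    import pytorch_lightning as pl


    class GAN(pl.LightningModule):
        def __init__(self, generator: torch.nn.Module, discriminator: torch.nn.Module):
            super().__init__()
            self.generator = generator
            self.discriminator = discriminator

        @property
        def automatic_optimization(self) -> bool:
            # required so the manual training_step above controls the loop
            return False

        def configure_optimizers(self):
            opt_a = torch.optim.Adam(self.generator.parameters(), lr=2e-4)
            opt_b = torch.optim.Adam(self.discriminator.parameters(), lr=2e-4)
            return opt_a, opt_b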

notebooks/05-trainer-flags-overview.ipynb

Lines changed: 1 addition & 1 deletion
@@ -2545,7 +2545,7 @@
        "id": "7TAIerPYe_Q1"
       },
       "source": [
-       "The EarlyStopping callback runs at the end of every validation epoch, which, under the default configuration, happens after every training epoch. However, the frequency of validation can be modified by setting various parameters on the Trainer, for example check_val_every_n_epoch and val_check_interval. It must be noted that the patience parameter counts the number of validation epochs with no improvement, and not the number of training epochs. Therefore, with parameters check_val_every_n_epoch=10 and patience=3, the trainer will perform at least 40 training epochs before being stopped."
+       "The EarlyStopping callback runs at the end of every validation check, which, under the default configuration, happens after every training epoch. However, the frequency of validation can be modified by setting various parameters on the Trainer, for example check_val_every_n_epoch and val_check_interval. It must be noted that the patience parameter counts the number of validation checks with no improvement, and not the number of training epochs. Therefore, with parameters check_val_every_n_epoch=10 and patience=3, the trainer will perform at least 40 training epochs before being stopped."
       ]
      },
      {
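To make the arithmetic in that notebook cell concrete, here is a hedged sketch of the Trainer configuration it describes; the monitored metric name is an assumption.

    from pytorch_lightning import Trainer
    from pytorch_lightning.callbacks import EarlyStopping

    # patience counts validation checks, not training epochs:
    # with a check every 10 epochs, 1 initial check + 3 checks without improvement
    # means at least 40 training epochs before stopping.
    early_stop = EarlyStopping(monitor="val_loss", patience=3)
    trainer = Trainer(callbacks=[early_stop], check_val_every_n_epoch=10)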
