Commit cdec83b

Merge branch 'master' into feature/trainer-validate-2
2 parents: e423b98 + 615b2f7

24 files changed: +209 -167 lines

CHANGELOG.md

Lines changed: 19 additions & 11 deletions
@@ -104,34 +104,42 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Expose DeepSpeed loss parameters to allow users to fix loss instability ([#6115](https://github.com/PyTorchLightning/pytorch-lightning/pull/6115))
 
 
-- Fixed `AttributeError` when `logger=None` on TPU ([#6221](https://github.com/PyTorchLightning/pytorch-lightning/pull/6221))
+- Fixed duplicate logs appearing in console when using the python logging module ([#5509](https://github.com/PyTorchLightning/pytorch-lightning/pull/5509), [#6275](https://github.com/PyTorchLightning/pytorch-lightning/pull/6275))
 
 
-- Fixed `ModelPruning(make_pruning_permanent=True)` pruning buffers getting removed when saved during training ([#6073](https://github.com/PyTorchLightning/pytorch-lightning/pull/6073))
+- Fixed DP reduction with collection ([#6324](https://github.com/PyTorchLightning/pytorch-lightning/pull/6324))
 
 
-- Fixed `trainer.test` from `best_path` hangs after calling `trainer.fit` ([#6272](https://github.com/PyTorchLightning/pytorch-lightning/pull/6272))
+- Fixed `.teardown(stage='fit')` getting called during `trainer.test` ([#6386](https://github.com/PyTorchLightning/pytorch-lightning/pull/6386))
 
 
-- Fixed duplicate logs appearing in console when using the python logging module ([#5509](https://github.com/PyTorchLightning/pytorch-lightning/pull/5509), [#6275](https://github.com/PyTorchLightning/pytorch-lightning/pull/6275))
+- Fixed `.on_fit_{start,end}()` getting called during `trainer.test` ([#6386](https://github.com/PyTorchLightning/pytorch-lightning/pull/6386))
 
 
-- Fixed `SingleTPU` calling `all_gather` ([#6296](https://github.com/PyTorchLightning/pytorch-lightning/pull/6296))
+- Fixed an issue where the tuner would not tune the learning rate if also tuning the batch size ([#4688](https://github.com/PyTorchLightning/pytorch-lightning/pull/4688))
 
 
-- Fixed DP reduction with collection ([#6324](https://github.com/PyTorchLightning/pytorch-lightning/pull/6324))
+- Fixed logger creating directory structure too early in DDP ([#6380](https://github.com/PyTorchLightning/pytorch-lightning/pull/6380))
 
 
-- Fixed `.teardown(stage='fit')` getting called during `trainer.test` ([#6386](https://github.com/PyTorchLightning/pytorch-lightning/pull/6386))
-
-
-- Fixed `.on_fit_{start,end}()` getting called during `trainer.test` ([#6386](https://github.com/PyTorchLightning/pytorch-lightning/pull/6386))
+## [1.2.3] - 2021-03-09
 
+### Fixed
 
+- Fixed `ModelPruning(make_pruning_permanent=True)` pruning buffers getting removed when saved during training ([#6073](https://github.com/PyTorchLightning/pytorch-lightning/pull/6073))
+- Fixed when `_stable_1d_sort` to work when `n >= N` ([#6177](https://github.com/PyTorchLightning/pytorch-lightning/pull/6177))
+- Fixed `AttributeError` when `logger=None` on TPU ([#6221](https://github.com/PyTorchLightning/pytorch-lightning/pull/6221))
 - Fixed PyTorch Profiler with `emit_nvtx` ([#6260](https://github.com/PyTorchLightning/pytorch-lightning/pull/6260))
+- Fixed `trainer.test` from `best_path` hangs after calling `trainer.fit` ([#6272](https://github.com/PyTorchLightning/pytorch-lightning/pull/6272))
+- Fixed `SingleTPU` calling `all_gather` ([#6296](https://github.com/PyTorchLightning/pytorch-lightning/pull/6296))
+- Ensure we check deepspeed/sharded in multinode DDP ([#6297](https://github.com/PyTorchLightning/pytorch-lightning/pull/6297)
+- Check `LightningOptimizer` doesn't delete optimizer hooks ([#6305](https://github.com/PyTorchLightning/pytorch-lightning/pull/6305)
+- Resolve memory leak for evaluation ([#6326](https://github.com/PyTorchLightning/pytorch-lightning/pull/6326)
+- Ensure that clip gradients is only called if the value is greater than 0 ([#6330](https://github.com/PyTorchLightning/pytorch-lightning/pull/6330)
+- Fixed `Trainer` not resetting `lightning_optimizers` when calling `Trainer.fit()` multiple times ([#6372](https://github.com/PyTorchLightning/pytorch-lightning/pull/6372))
 
 
-- Fixed `Trainer` not resetting `lightning_optimizers` when calling `Trainer.fit()` multiple times ([#6372](https://github.com/PyTorchLightning/pytorch-lightning/pull/6372))
+- Fixed `DummyLogger.log_hyperparams` raising a `TypeError` when running with `fast_dev_run=True` ([#6398](https://github.com/PyTorchLightning/pytorch-lightning/pull/6398))
 
 
 ## [1.2.2] - 2021-03-02

docs/source/advanced/multi_gpu.rst

Lines changed: 0 additions & 1 deletion
@@ -332,7 +332,6 @@ There are cases in which it is NOT possible to use DDP. Examples are:
 
 - Jupyter Notebook, Google COLAB, Kaggle, etc.
 - You have a nested script without a root package
-- Your script needs to invoke both `.fit` and `.test`, or one of them multiple times
 
 In these situations you should use `dp` or `ddp_spawn` instead.
 
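
With the bullet above removed, calling both `.fit` and `.test` (or calling either more than once) is no longer listed as a DDP limitation. As a rough illustration only (not part of this commit; `ToyModel` and the data loader below are made-up stand-ins, and the flags follow the Lightning ~1.2 API), such a script could look like:

# Illustrative sketch only; ToyModel is a made-up minimal LightningModule.
import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import LightningModule, Trainer

class ToyModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.cross_entropy(self.layer(x), y)

    def test_step(self, batch, batch_idx):
        x, y = batch
        self.log("test_loss", torch.nn.functional.cross_entropy(self.layer(x), y))

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

def loader():
    return DataLoader(TensorDataset(torch.randn(64, 32), torch.randint(0, 2, (64,))), batch_size=16)

if __name__ == "__main__":
    model = ToyModel()
    trainer = Trainer(gpus=2, accelerator="ddp", max_epochs=1)
    trainer.fit(model, loader())                     # fit first...
    trainer.test(model, test_dataloaders=loader())   # ...then test in the same DDP run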

pytorch_lightning/loggers/base.py

Lines changed: 18 additions & 16 deletions
@@ -279,12 +279,14 @@ def _sanitize_params(params: Dict[str, Any]) -> Dict[str, Any]:
         return params
 
     @abstractmethod
-    def log_hyperparams(self, params: argparse.Namespace):
+    def log_hyperparams(self, params: argparse.Namespace, *args, **kwargs):
         """
         Record hyperparameters.
 
         Args:
             params: :class:`~argparse.Namespace` containing the hyperparameters
+            args: Optional positional arguments, depends on the specific logger being used
+            kwargs: Optional keywoard arguments, depends on the specific logger being used
         """
 
     def log_graph(self, model: LightningModule, input_array=None) -> None:
@@ -418,41 +420,41 @@ def nop(*args, **kw):
     def __getattr__(self, _):
         return self.nop
 
-    def __getitem__(self, idx):
-        # enables self.logger[0].experiment.add_image
-        # and self.logger.experiment[0].add_image(...)
+    def __getitem__(self, idx) -> "DummyExperiment":
+        # enables self.logger.experiment[0].add_image(...)
         return self
 
 
 class DummyLogger(LightningLoggerBase):
-    """ Dummy logger for internal use. Is usefull if we want to disable users
-    logger for a feature, but still secure that users code can run """
+    """
+    Dummy logger for internal use. It is useful if we want to disable user's
+    logger for a feature, but still ensure that user code can run
+    """
 
     def __init__(self):
         super().__init__()
         self._experiment = DummyExperiment()
 
     @property
-    def experiment(self):
+    def experiment(self) -> DummyExperiment:
         return self._experiment
 
-    @rank_zero_only
-    def log_metrics(self, metrics, step):
+    def log_metrics(self, *args, **kwargs) -> None:
         pass
 
-    @rank_zero_only
-    def log_hyperparams(self, params):
+    def log_hyperparams(self, *args, **kwargs) -> None:
        pass
 
     @property
-    def name(self):
-        pass
+    def name(self) -> str:
+        return ""
 
     @property
-    def version(self):
-        pass
+    def version(self) -> str:
+        return ""
 
-    def __getitem__(self, idx):
+    def __getitem__(self, idx) -> "DummyLogger":
+        # enables self.logger[0].experiment.add_image(...)
         return self
 
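
A quick sanity sketch of the reworked `DummyLogger` (the class and import path are the ones touched above; the call sites are illustrative assumptions). Accepting arbitrary arguments and returning empty strings is what lets the trainer swap in this no-op logger, e.g. under `fast_dev_run=True`, without raising `TypeError`:

from pytorch_lightning.loggers.base import DummyLogger

logger = DummyLogger()
logger.log_hyperparams({"lr": 1e-3}, "extra-positional", some_kwarg="ignored")  # no-op, accepts anything
logger.log_metrics({"loss": 0.25}, step=0)                                      # also a no-op
print(repr(logger.name), repr(logger.version))    # '' '' instead of None, so path building keeps working
print(logger.experiment[0] is logger.experiment)  # True: DummyExperiment __getitem__ returns itself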

pytorch_lightning/trainer/trainer.py

Lines changed: 7 additions & 16 deletions
@@ -381,21 +381,6 @@ def __init__(
         # Callback system
         self.on_init_end()
 
-    def setup_trainer(self, model: LightningModule):
-        """
-        Sanity check a few things before starting actual training or testing.
-
-        Args:
-            model: The model to run sanity test on.
-        """
-
-        # log hyper-parameters
-        if self.logger is not None:
-            # save exp to get started (this is where the first experiment logs are written)
-            self.logger.log_hyperparams(model.hparams_initial)
-            self.logger.log_graph(model)
-            self.logger.save()
-
     def fit(
         self,
         model: LightningModule,
@@ -444,7 +429,6 @@ def fit(
         self.call_setup_hook(model)
         self.call_hook("on_before_accelerator_backend_setup", model)
         self.accelerator.setup(self, model)  # note: this sets up self.lightning_module
-        self.setup_trainer(model)
 
         # ----------------------------
         # INSPECT THE CORE LOOPS
@@ -509,6 +493,13 @@ def fit(
     def pre_dispatch(self):
         self.accelerator.pre_dispatch()
 
+        # log hyper-parameters
+        if self.logger is not None:
+            # save exp to get started (this is where the first experiment logs are written)
+            self.logger.log_hyperparams(self.lightning_module.hparams_initial)
+            self.logger.log_graph(self.lightning_module)
+            self.logger.save()
+
     def post_dispatch(self):
         self.accelerator.post_dispatch()
         self.accelerator.teardown()
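
The effect of the move is that hyperparameters and the model graph are now logged from `pre_dispatch`, on the already attached `self.lightning_module`, instead of during `fit` setup. What gets logged is `hparams_initial`, i.e. whatever the module recorded via `save_hyperparameters()`. A minimal sketch (the toy module is an assumption, not taken from this diff):

import torch
from pytorch_lightning import LightningModule

class ToyModel(LightningModule):
    def __init__(self, lr: float = 1e-3, hidden: int = 32):
        super().__init__()
        self.save_hyperparameters()              # records the init args as hparams_initial
        self.layer = torch.nn.Linear(hidden, 2)

model = ToyModel(lr=3e-4)
print(model.hparams_initial)  # roughly: "hidden": 32, "lr": 0.0003 -- this is what pre_dispatch hands to the logger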

pytorch_lightning/tuner/lr_finder.py

Lines changed: 2 additions & 2 deletions
@@ -418,11 +418,11 @@ def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx, data
             self.progress_bar.update()
 
         current_loss = trainer.train_loop.running_loss.last().item()
-        current_step = trainer.global_step + 1  # remove the +1 in 1.0
+        current_step = trainer.global_step
 
         # Avg loss (loss with momentum) + smoothing
         self.avg_loss = self.beta * self.avg_loss + (1 - self.beta) * current_loss
-        smoothed_loss = self.avg_loss / (1 - self.beta**current_step)
+        smoothed_loss = self.avg_loss / (1 - self.beta**(current_step + 1))
 
         # Check if we diverging
         if self.early_stop_threshold is not None:
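
The update above is a bias-corrected exponential moving average: dividing by `1 - beta**(step + 1)` removes the bias toward zero in the first steps, and the change keeps that correction while `current_step` now equals the real `trainer.global_step`. A standalone sketch with illustrative numbers (not taken from the diff):

# Bias-corrected EMA of a loss curve, mirroring the update used above.
beta = 0.98
avg_loss = 0.0
for step, loss in enumerate([2.0, 1.5, 1.2, 1.1]):  # `step` plays the role of trainer.global_step
    avg_loss = beta * avg_loss + (1 - beta) * loss   # running average, biased toward 0 at the start
    smoothed = avg_loss / (1 - beta ** (step + 1))   # divide out the bias
    print(step, round(smoothed, 4))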

tests/accelerators/test_accelerator_connector.py

Lines changed: 7 additions & 5 deletions
@@ -13,6 +13,7 @@
 # limitations under the License
 
 import os
+from typing import Optional
 from unittest import mock
 
 import pytest
@@ -30,6 +31,7 @@
     DDPSpawnPlugin,
     DDPSpawnShardedPlugin,
     DeepSpeedPlugin,
+    ParallelPlugin,
     PrecisionPlugin,
     SingleDevicePlugin,
 )
@@ -408,10 +410,8 @@ def test_ipython_incompatible_backend_error(*_):
     ["accelerator", "plugin"],
     [('ddp_spawn', 'ddp_sharded'), (None, 'ddp_sharded')],
 )
-def test_plugin_accelerator_choice(accelerator, plugin):
-    """
-    Ensure that when a plugin and accelerator is passed in, that the plugin takes precedent.
-    """
+def test_plugin_accelerator_choice(accelerator: Optional[str], plugin: str):
+    """Ensure that when a plugin and accelerator is passed in, that the plugin takes precedent."""
     trainer = Trainer(accelerator=accelerator, plugins=plugin, num_processes=2)
     assert isinstance(trainer.accelerator.training_type_plugin, DDPShardedPlugin)
 
@@ -428,7 +428,9 @@ def test_plugin_accelerator_choice(accelerator, plugin):
 ])
 @mock.patch('torch.cuda.is_available', return_value=True)
 @mock.patch('torch.cuda.device_count', return_value=2)
-def test_accelerator_choice_multi_node_gpu(mock_is_available, mock_device_count, accelerator, plugin, tmpdir):
+def test_accelerator_choice_multi_node_gpu(
+    mock_is_available, mock_device_count, tmpdir, accelerator: str, plugin: ParallelPlugin
+):
     trainer = Trainer(
         accelerator=accelerator,
         default_root_dir=tmpdir,

tests/callbacks/test_callback_hook_outputs.py

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@
 
 
 @pytest.mark.parametrize("single_cb", [False, True])
-def test_train_step_no_return(tmpdir, single_cb):
+def test_train_step_no_return(tmpdir, single_cb: bool):
     """
     Tests that only training_step can be used
     """

tests/callbacks/test_early_stopping.py

Lines changed: 12 additions & 9 deletions
@@ -14,6 +14,7 @@
 import logging
 import os
 import pickle
+from typing import List, Optional
 from unittest import mock
 
 import cloudpickle
@@ -119,7 +120,7 @@ def test_early_stopping_no_extraneous_invocations(tmpdir):
         ([6, 5, 6, 5, 5, 5], 3, 4),
     ],
 )
-def test_early_stopping_patience(tmpdir, loss_values, patience, expected_stop_epoch):
+def test_early_stopping_patience(tmpdir, loss_values: list, patience: int, expected_stop_epoch: int):
     """Test to ensure that early stopping is not triggered before patience is exhausted."""
 
     class ModelOverrideValidationReturn(BoringModel):
@@ -142,7 +143,7 @@ def validation_epoch_end(self, outputs):
     assert trainer.current_epoch == expected_stop_epoch
 
 
-@pytest.mark.parametrize('validation_step', ['base', None])
+@pytest.mark.parametrize('validation_step_none', [True, False])
 @pytest.mark.parametrize(
     "loss_values, patience, expected_stop_epoch",
     [
@@ -151,7 +152,9 @@ def validation_epoch_end(self, outputs):
         ([6, 5, 6, 5, 5, 5], 3, 4),
     ],
 )
-def test_early_stopping_patience_train(tmpdir, validation_step, loss_values, patience, expected_stop_epoch):
+def test_early_stopping_patience_train(
+    tmpdir, validation_step_none: bool, loss_values: list, patience: int, expected_stop_epoch: int
+):
     """Test to ensure that early stopping is not triggered before patience is exhausted."""
 
     class ModelOverrideTrainReturn(BoringModel):
@@ -163,7 +166,7 @@ def training_epoch_end(self, outputs):
 
     model = ModelOverrideTrainReturn()
 
-    if validation_step is None:
+    if validation_step_none:
         model.validation_step = None
 
     early_stop_callback = EarlyStopping(monitor="train_loss", patience=patience, verbose=True)
@@ -254,7 +257,7 @@ def validation_epoch_end(self, outputs):
 
 
 @pytest.mark.parametrize('step_freeze, min_steps, min_epochs', [(5, 1, 1), (5, 1, 3), (3, 15, 1)])
-def test_min_steps_override_early_stopping_functionality(tmpdir, step_freeze, min_steps, min_epochs):
+def test_min_steps_override_early_stopping_functionality(tmpdir, step_freeze: int, min_steps: int, min_epochs: int):
     """Excepted Behaviour:
     IF `min_steps` was set to a higher value than the `trainer.global_step` when `early_stopping` is being triggered,
     THEN the trainer should continue until reaching `trainer.global_step` == `min_steps`, and stop.
@@ -386,10 +389,10 @@ def on_train_end(self) -> None:
             marks=RunIf(skip_windows=True)),
     ],
 )
-def test_multiple_early_stopping_callbacks(callbacks, expected_stop_epoch, accelerator, num_processes, tmpdir):
-    """
-    Ensure when using multiple early stopping callbacks we stop if any signals we should stop.
-    """
+def test_multiple_early_stopping_callbacks(
+    tmpdir, callbacks: List[EarlyStopping], expected_stop_epoch: int, accelerator: Optional[str], num_processes: int
+):
+    """Ensure when using multiple early stopping callbacks we stop if any signals we should stop."""
 
     model = EarlyStoppingModel(expected_stop_epoch)
 

tests/callbacks/test_lr_monitor.py

Lines changed: 3 additions & 5 deletions
@@ -51,10 +51,8 @@ def test_lr_monitor_single_lr(tmpdir):
 
 
 @pytest.mark.parametrize('opt', ['SGD', 'Adam'])
-def test_lr_monitor_single_lr_with_momentum(tmpdir, opt):
-    """
-    Test that learning rates and momentum are extracted and logged for single lr scheduler.
-    """
+def test_lr_monitor_single_lr_with_momentum(tmpdir, opt: str):
+    """Test that learning rates and momentum are extracted and logged for single lr scheduler."""
 
     class LogMomentumModel(BoringModel):
 
@@ -170,7 +168,7 @@ def test_lr_monitor_no_logger(tmpdir):
 
 
 @pytest.mark.parametrize("logging_interval", ['step', 'epoch'])
-def test_lr_monitor_multi_lrs(tmpdir, logging_interval):
+def test_lr_monitor_multi_lrs(tmpdir, logging_interval: str):
     """ Test that learning rates are extracted and logged for multi lr schedulers. """
     tutils.reset_seed()
 