
Commit 88ca10d

Merge branch 'master' into bugfix/batch-device
2 parents: 5ddeaec + d51b0ae

File tree: 96 files changed (+1813, -407 lines)


.circleci/config.yml
Lines changed: 1 addition & 1 deletion

@@ -91,7 +91,7 @@ jobs:
   docker:
     - image: circleci/python:3.7
   environment:
-    - XLA_VER: 1.7
+    - XLA_VER: 1.8
     - MAX_CHECKS: 240
     - CHECK_SPEEP: 5
   steps:

.github/CODEOWNERS
Lines changed: 1 addition & 0 deletions

@@ -36,6 +36,7 @@

 # Specifics
 /pytorch_lightning/trainer/connectors/logger_connector @tchaton @carmocca
+/pytorch_lightning/trainer/progress.py @tchaton @awaelchli @carmocca

 # Metrics
 /pytorch_lightning/metrics/ @SkafteNicki @ananyahjha93 @justusschock

.github/CONTRIBUTING.md
Lines changed: 9 additions & 7 deletions

@@ -2,6 +2,8 @@

 Welcome to the PyTorch Lightning community! We're building the most advanced research platform on the planet to implement the latest, best practices that the amazing PyTorch team rolls out!

+If you are new to open source, check out [this blog to get started with your first Open Source contribution](https://devblog.pytorchlightning.ai/quick-contribution-guide-86d977171b3a).
+
 ## Main Core Value: One less thing to remember

 Simplify the API as much as possible from the user perspective.

@@ -58,13 +60,13 @@ Have a favorite feature from other libraries like fast.ai or transformers? Those

 ## Contribution Types

-We are always looking for help implementing new features or fixing bugs.
+We are always open to contributions of new features or bug fixes.

 A lot of good work has already been done in project mechanics (requirements.txt, setup.py, pep8, badges, ci, etc...) so we're in a good state there thanks to all the early contributors (even pre-beta release)!

 ### Bug Fixes:

-1. If you find a bug please submit a github issue.
+1. If you find a bug please submit a GitHub issue.

    - Make sure the title explains the issue.
    - Describe your setup, what you are trying to do, expected vs. actual behaviour. Please add configs and code samples.

@@ -79,12 +81,12 @@ A lot of good work has already been done in project mechanics (requirements.txt,

 3. Submit a PR!

-_**Note**, even if you do not find the solution, sending a PR with a test covering the issue is a valid contribution and we can help you or finish it with you :]_
+_**Note**, even if you do not find the solution, sending a PR with a test covering the issue is a valid contribution, and we can help you or finish it with you :]_

 ### New Features:

-1. Submit a github issue - describe what is the motivation of such feature (adding the use case or an example is helpful).
-2. Let's discuss to determine the feature scope.
+1. Submit a GitHub issue - describe what is the motivation of such feature (adding the use case, or an example is helpful).
+2. Determine the feature scope with us.
 3. Submit a PR! We recommend test driven approach to adding new features as well:

    - Write a test for the functionality you want to add.

@@ -199,7 +201,7 @@ Note: if your computer does not have multi-GPU nor TPU these tests are skipped.
 **GitHub Actions:** For convenience, you can also use your own GHActions building which will be triggered with each commit.
 This is useful if you do not test against all required dependency versions.

-**Docker:** Another option is utilize the [pytorch lightning cuda base docker image](https://hub.docker.com/repository/docker/pytorchlightning/pytorch_lightning/tags?page=1&name=cuda). You can then run:
+**Docker:** Another option is to utilize the [pytorch lightning cuda base docker image](https://hub.docker.com/repository/docker/pytorchlightning/pytorch_lightning/tags?page=1&name=cuda). You can then run:

 ```bash
 python -m pytest pytorch_lightning tests pl_examples -v

@@ -230,7 +232,7 @@ We welcome any useful contribution! For your convenience here's a recommended wo
   - Make sure all tests are passing.
   - Make sure you add a GitHub issue to your PR.
 5. Use tags in PR name for following cases:
-  - **[blocked by #<number>]** if you work is depending on others changes.
+  - **[blocked by #<number>]** if your work is dependent on other PRs.
   - **[wip]** when you start to re-edit your work, mark it so no one will accidentally merge it in meantime.

 ### Question & Answer
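The bug-fix workflow above asks contributors to open a PR with a test that covers the issue. As a rough illustration only (the model, the test name, and the final assertion below are hypothetical placeholders, not part of this commit), such a test usually boils down to a small pytest function driving a minimal `LightningModule` through `Trainer`:

```python
# Hypothetical regression-test sketch; replace the final assert with the real check for the bug.
import torch
from torch.utils.data import DataLoader, TensorDataset

from pytorch_lightning import LightningModule, Trainer


class TinyModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


def test_reported_issue_is_fixed(tmp_path):
    """Drive one training batch end to end with fast_dev_run."""
    dataset = TensorDataset(torch.randn(8, 4), torch.randn(8, 1))
    trainer = Trainer(default_root_dir=str(tmp_path), fast_dev_run=True)
    trainer.fit(TinyModel(), DataLoader(dataset, batch_size=4))
    assert trainer.global_step == 1  # placeholder assertion
```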

.github/ISSUE_TEMPLATE/bug_report.md
Lines changed: 5 additions & 4 deletions

@@ -41,13 +41,14 @@ wget https://raw.githubusercontent.com/PyTorchLightning/pytorch-lightning/master
 python collect_env_details.py
 ```

-- PyTorch Version (e.g., 1.0):
-- OS (e.g., Linux):
-- How you installed PyTorch (`conda`, `pip`, source):
-- Build command you used (if compiling from source):
+- PyTorch Lightning Version (e.g., 1.3.0):
+- PyTorch Version (e.g., 1.8)
 - Python version:
+- OS (e.g., Linux):
 - CUDA/cuDNN version:
 - GPU models and configuration:
+- How you installed PyTorch (`conda`, `pip`, source):
+- If compiling from source, the output of `torch.__config__.show()`:
 - Any other relevant information:

 ### Additional context
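The updated template asks for the output of `torch.__config__.show()` when PyTorch was compiled from source. That is a standard PyTorch call (not something added by this commit) and can be captured with:

```python
import torch

# Prints the build configuration: compiler, CUDA/cuDNN versions, BLAS backend, build flags, etc.
print(torch.__config__.show())
```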

CHANGELOG.md
Lines changed: 46 additions & 6 deletions

@@ -30,9 +30,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Added support for checkpointing based on a provided time interval during training ([#7515](https://github.com/PyTorchLightning/pytorch-lightning/pull/7515))


-- Added dataclasses for progress tracking (
-  [#6603](https://github.com/PyTorchLightning/pytorch-lightning/pull/6603),
-  [#7574](https://github.com/PyTorchLightning/pytorch-lightning/pull/7574))
+- Progress tracking
+  * Added dataclasses for progress tracking ([#6603](https://github.com/PyTorchLightning/pytorch-lightning/pull/6603), [#7574](https://github.com/PyTorchLightning/pytorch-lightning/pull/7574), [#8140](https://github.com/PyTorchLightning/pytorch-lightning/pull/8140))
+  * Add `{,load_}state_dict` to the progress tracking dataclasses ([#8140](https://github.com/PyTorchLightning/pytorch-lightning/pull/8140))


 - Added support for passing a `LightningDataModule` positionally as the second argument to `trainer.{validate,test,predict}` ([#7431](https://github.com/PyTorchLightning/pytorch-lightning/pull/7431))

@@ -84,11 +84,14 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).


 - Fault-tolerant training
-  * Add `{,load_}state_dict` to `ResultCollection` ([#7948](https://github.com/PyTorchLightning/pytorch-lightning/pull/7948))
-  * Checkpoint the loop results ([#7966](https://github.com/PyTorchLightning/pytorch-lightning/pull/7966))
+  * Added `{,load_}state_dict` to `ResultCollection` ([#7948](https://github.com/PyTorchLightning/pytorch-lightning/pull/7948))
+  * Added `{,load_}state_dict` to `Loops` ([#8197](https://github.com/PyTorchLightning/pytorch-lightning/pull/8197))


-- Add `rank_zero_only` to `LightningModule.log` function ([#7966](https://github.com/PyTorchLightning/pytorch-lightning/pull/7966))
+- Added `rank_zero_only` to `LightningModule.log` function ([#7966](https://github.com/PyTorchLightning/pytorch-lightning/pull/7966))
+
+
+- Added `metric_attribute` to `LightningModule.log` function ([#7966](https://github.com/PyTorchLightning/pytorch-lightning/pull/7966))


 - Added a warning if `Trainer(log_every_n_steps)` is a value too high for the training dataloader ([#7734](https://github.com/PyTorchLightning/pytorch-lightning/pull/7734))

@@ -115,9 +118,18 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Add support for calling scripts using the module syntax (`python -m package.script`) ([#8073](https://github.com/PyTorchLightning/pytorch-lightning/pull/8073))


+- Add support for optimizers and learning rate schedulers to `LightningCLI` ([#8093](https://github.com/PyTorchLightning/pytorch-lightning/pull/8093))
+
+
 - Add torchelastic check when sanitizing GPUs ([#8095](https://github.com/PyTorchLightning/pytorch-lightning/pull/8095))


+- Added XLA Profiler ([#8014](https://github.com/PyTorchLightning/pytorch-lightning/pull/8014))
+
+
+- Added `max_depth` parameter in `ModelSummary` ([#8062](https://github.com/PyTorchLightning/pytorch-lightning/pull/8062))
+
+
 ### Changed


@@ -220,6 +232,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - `Trainer(resume_from_checkpoint=...)` now restores the model directly after `LightningModule.setup()`, which is before `LightningModule.configure_sharded_model()` ([#7652](https://github.com/PyTorchLightning/pytorch-lightning/pull/7652))


+- Added a mechanism to detect `deadlock` for `DDP` when only 1 process trigger an `Exception`. The mechanism will `kill the processes` when it happens ([#8167](https://github.com/PyTorchLightning/pytorch-lightning/pull/8167))
+
+
 ### Deprecated


@@ -253,9 +268,15 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Deprecated the use of `CheckpointConnector.hpc_load()` in favor of `CheckpointConnector.restore()` ([#7652](https://github.com/PyTorchLightning/pytorch-lightning/pull/7652))


+- Deprecated `DDPPlugin.task_idx` in favor of `DDPPlugin.local_rank` ([#8203](https://github.com/PyTorchLightning/pytorch-lightning/pull/8203))
+
+
 - Deprecated the `Trainer.train_loop` property in favor of `Trainer.fit_loop` ([#8025](https://github.com/PyTorchLightning/pytorch-lightning/pull/8025))


+- Deprecated `mode` parameter in `ModelSummary` in favor of `max_depth` ([#8062](https://github.com/PyTorchLightning/pytorch-lightning/pull/8062))
+
+
 ### Removed

 - Removed `ProfilerConnector` ([#7654](https://github.com/PyTorchLightning/pytorch-lightning/pull/7654))

@@ -285,6 +306,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 ### Fixed


+- Fixed SWA to also work with `IterableDataset` ([#8172](https://github.com/PyTorchLightning/pytorch-lightning/pull/8172))
+
 - Fixed `lr_scheduler` checkpointed state by calling `update_lr_schedulers` before saving checkpoints ([#7877](https://github.com/PyTorchLightning/pytorch-lightning/pull/7877))


@@ -315,6 +338,23 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Fixed a DDP info message that was never shown ([#8111](https://github.com/PyTorchLightning/pytorch-lightning/pull/8111))


+- Fixed metrics generated during `validation sanity checking` are cleaned on end ([#8171](https://github.com/PyTorchLightning/pytorch-lightning/pull/8171))
+
+
+- Fixed a bug where an infinite recursion would be triggered when using the `BaseFinetuning` callback on a model that contains a `ModuleDict` ([#8170](https://github.com/PyTorchLightning/pytorch-lightning/pull/8170))
+
+
+- Fixed NCCL error when selecting non-consecutive device ids ([#8165](https://github.com/PyTorchLightning/pytorch-lightning/pull/8165))
+
+
+- Fixed `log_gpu_memory` metrics not being added to `logging` when nothing else is logged ([#8174](https://github.com/PyTorchLightning/pytorch-lightning/pull/8174))
+
+
+- Fixed a bug where calling `log` with a `Metric` instance would raise an error if it was a nested attribute of the model ([#8181](https://github.com/PyTorchLightning/pytorch-lightning/pull/8181))
+
+
+- Fixed a bug where using `precision=64` would cause buffers with complex dtype to be cast to real ([#8208](https://github.com/PyTorchLightning/pytorch-lightning/pull/8208))
+
 - Fixes access to `callback_metrics` in ddp_spawn ([#7916](https://github.com/PyTorchLightning/pytorch-lightning/pull/7916))
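Two of the additions above introduce user-facing arguments. The sketch below shows how they are meant to be used; the argument names come from the changelog entries, while the `ModelSummary` import path and exact signatures are assumptions that may differ in the released version:

```python
import torch
from pytorch_lightning import LightningModule
from pytorch_lightning.core.memory import ModelSummary  # assumed location of ModelSummary at this point in time


class MyModule(LightningModule):
    def __init__(self):
        super().__init__()
        self.backbone = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU())
        self.head = torch.nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        loss = self.head(self.backbone(batch)).mean()
        # `rank_zero_only=True` (#7966) logs the value from the rank-0 process only.
        self.log("train_loss", loss, rank_zero_only=True)
        return loss


# `max_depth` (#8062) controls how many levels of nested modules the summary expands;
# it replaces the now-deprecated `mode` parameter.
print(ModelSummary(MyModule(), max_depth=2))
```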

README.md
Lines changed: 3 additions & 1 deletion

@@ -369,7 +369,9 @@ class LitAutoEncoder(pl.LightningModule):

 The lightning community is maintained by
 - [10+ core contributors](https://pytorch-lightning.readthedocs.io/en/latest/governance.html) who are all a mix of professional engineers, Research Scientists, and Ph.D. students from top AI labs.
-- 400+ community contributors.
+- 480+ active community contributors.
+
+Want to help us build Lightning and reduce boilerplate for thousands of researchers? [Learn how to make your first contribution here](https://devblog.pytorchlightning.ai/quick-contribution-guide-86d977171b3a)

 Lightning is also part of the [PyTorch ecosystem](https://pytorch.org/ecosystem/) which requires projects to have solid testing, documentation and support.

dockers/tpu-tests/tpu_test_cases.jsonnet
Lines changed: 1 addition & 0 deletions

@@ -22,6 +22,7 @@ local tputests = base.BaseTest {
   |||
     cd pytorch-lightning
     coverage run --source=pytorch_lightning -m pytest -v --capture=no \
+      tests/profiler/test_xla_profiler.py \
       pytorch_lightning/utilities/xla_device.py \
       tests/accelerators/test_tpu_backend.py \
       tests/models/test_tpu.py
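The newly listed `tests/profiler/test_xla_profiler.py` exercises the XLA Profiler from the CHANGELOG entry above (#8014). A minimal usage sketch, assuming the profiler is exposed as `pytorch_lightning.profiler.XLAProfiler`, takes no required arguments, and that a TPU host is available (all assumptions, not taken from this commit):

```python
from pytorch_lightning import Trainer
from pytorch_lightning.profiler import XLAProfiler  # assumed import path for the profiler added in #8014

# Records TPU/XLA execution traces while fitting; `tpu_cores=8` assumes a full TPU host.
trainer = Trainer(tpu_cores=8, profiler=XLAProfiler())
```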

docs/source/_templates/layout.html
Lines changed: 10 additions & 0 deletions

@@ -0,0 +1,10 @@
+{% extends "!layout.html" %}
+<link rel="canonical" href="{{ theme_canonical_url }}{{ pagename }}.html" />
+
+{% block footer %}
+{{ super() }}
+<script script type="text/javascript">
+    var collapsedSections = ['Best practices', 'Lightning API', 'Optional extensions', 'Tutorials', 'API References', 'Bolts', 'Examples', 'Common Use Cases', 'Partner Domain Frameworks', 'Community'];
+</script>
+
+{% endblock %}

docs/source/_templates/theme_variables.jinja
Lines changed: 2 additions & 0 deletions

@@ -14,5 +14,7 @@
     'blog': 'https://www.pytorchlightning.ai/blog',
     'resources': 'https://pytorch-lightning.readthedocs.io/en/latest/#community-examples',
     'support': 'https://pytorch-lightning.rtfd.io/en/latest/',
+    'community': 'https://pytorch-lightning.slack.com',
+    'forums': 'https://pytorch-lightning.slack.com',
 }
 -%}

docs/source/common/lightning_cli.rst
Lines changed: 117 additions & 2 deletions

@@ -1,6 +1,7 @@
 .. testsetup:: *
     :skipif: not _JSONARGPARSE_AVAILABLE

+    import torch
     from unittest import mock
     from typing import List
     from pytorch_lightning.core.lightning import LightningModule

@@ -385,7 +386,7 @@ instantiating the trainer class can be found in :code:`self.config['trainer']`.


 Configurable callbacks
-~~~~~~~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^^^^^^^

 As explained previously, any callback can be added by including it in the config via :code:`class_path` and
 :code:`init_args` entries. However, there are other cases in which a callback should always be present and be

@@ -417,7 +418,7 @@ To change the configuration of the :code:`EarlyStopping` in the config it would


 Argument linking
-~~~~~~~~~~~~~~~~
+^^^^^^^^^^^^^^^^

 Another case in which it might be desired to extend :class:`~pytorch_lightning.utilities.cli.LightningCLI` is that the
 model and data module depend on a common parameter. For example in some cases both classes require to know the

@@ -470,3 +471,117 @@ Instantiation links are used to automatically determine the order of instantiati
 The linking of arguments can be used for more complex cases. For example to derive a value via a function that takes
 multiple settings as input. For more details have a look at the API of `link_arguments
 <https://jsonargparse.readthedocs.io/en/stable/#jsonargparse.core.ArgumentParser.link_arguments>`_.
+
+
+Optimizers and learning rate schedulers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Optimizers and learning rate schedulers can also be made configurable. The most common case is when a model only has a
+single optimizer and optionally a single learning rate scheduler. In this case the model's
+:class:`~pytorch_lightning.core.lightning.LightningModule` could be left without implementing the
+:code:`configure_optimizers` method since it is normally always the same and just adds boilerplate. The following code
+snippet shows how to implement it:
+
+.. testcode::
+
+    import torch
+    from pytorch_lightning.utilities.cli import LightningCLI
+
+    class MyLightningCLI(LightningCLI):
+
+        def add_arguments_to_parser(self, parser):
+            parser.add_optimizer_args(torch.optim.Adam)
+            parser.add_lr_scheduler_args(torch.optim.lr_scheduler.ExponentialLR)
+
+    cli = MyLightningCLI(MyModel)
+
+With this the :code:`configure_optimizers` method is automatically implemented and in the config the :code:`optimizer`
+and :code:`lr_scheduler` groups would accept all of the options for the given classes, in this example :code:`Adam` and
+:code:`ExponentialLR`. Therefore, the config file would be structured like:
+
+.. code-block:: yaml
+
+    optimizer:
+      lr: 0.01
+    lr_scheduler:
+      gamma: 0.2
+    model:
+      ...
+    trainer:
+      ...
+
+And any of these arguments could be passed directly through command line. For example:
+
+.. code-block:: bash
+
+    $ python train.py --optimizer.lr=0.01 --lr_scheduler.gamma=0.2
+
+There is also the possibility of selecting among multiple classes by giving them as a tuple. For example:
+
+.. testcode::
+
+    class MyLightningCLI(LightningCLI):
+
+        def add_arguments_to_parser(self, parser):
+            parser.add_optimizer_args((torch.optim.SGD, torch.optim.Adam))
+
+In this case in the config the :code:`optimizer` group instead of having directly init settings, it should specify
+:code:`class_path` and optionally :code:`init_args`. Sub-classes of the classes in the tuple would also be accepted.
+A corresponding example of the config file would be:
+
+.. code-block:: yaml
+
+    optimizer:
+      class_path: torch.optim.Adam
+      init_args:
+        lr: 0.01
+    model:
+      ...
+    trainer:
+      ...
+
+And the same through command line:
+
+.. code-block:: bash
+
+    $ python train.py --optimizer='{class_path: torch.optim.Adam, init_args: {lr: 0.01}}'
+
+The automatic implementation of :code:`configure_optimizers` can be disabled by linking the configuration group. An
+example can be :code:`ReduceLROnPlateau` which requires to specify a monitor. This would be:
+
+.. testcode::
+
+    from pytorch_lightning.utilities.cli import instantiate_class, LightningCLI
+
+    class MyModel(LightningModule):
+
+        def __init__(self, optimizer_init: dict, lr_scheduler_init: dict):
+            super().__init__()
+            self.optimizer_init = optimizer_init
+            self.lr_scheduler_init = lr_scheduler_init
+
+        def configure_optimizers(self):
+            optimizer = instantiate_class(self.parameters(), self.optimizer_init)
+            scheduler = instantiate_class(optimizer, self.lr_scheduler_init)
+            return {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"}
+
+    class MyLightningCLI(LightningCLI):
+
+        def add_arguments_to_parser(self, parser):
+            parser.add_optimizer_args(
+                torch.optim.Adam,
+                link_to='model.optimizer_init',
+            )
+            parser.add_lr_scheduler_args(
+                torch.optim.lr_scheduler.ReduceLROnPlateau,
+                link_to='model.lr_scheduler_init',
+            )
+
+    cli = MyLightningCLI(MyModel)
+
+For both possibilities of using :meth:`pytorch_lightning.utilities.cli.LightningArgumentParser.add_optimizer_args` with
+a single class or a tuple of classes, the value given to :code:`optimizer_init` will always be a dictionary including
+:code:`class_path` and :code:`init_args` entries. The function
+:func:`~pytorch_lightning.utilities.cli.instantiate_class` takes care of importing the class defined in
+:code:`class_path` and instantiating it using some positional arguments, in this case :code:`self.parameters()`, and the
+:code:`init_args`. Any number of optimizers and learning rate schedulers can be added when using :code:`link_to`.
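To make the new documentation section above concrete: in the single-optimizer, single-scheduler case, the `configure_optimizers` that `LightningCLI` generates behaves roughly like the hand-written method below. This is a sketch, not the actual generated code; the hard-coded values stand in for whatever the `optimizer:` and `lr_scheduler:` config groups contain.

```python
import torch
from pytorch_lightning import LightningModule


class MyModel(LightningModule):
    # Roughly what `parser.add_optimizer_args(torch.optim.Adam)` together with
    # `parser.add_lr_scheduler_args(torch.optim.lr_scheduler.ExponentialLR)` automates away.
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=0.01)  # values come from the `optimizer:` group
        scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.2)  # from `lr_scheduler:`
        return [optimizer], [scheduler]
```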
