Commit ee47060

Merge branch 'master' into bugfix/3827_test_ddp_error
2 parents a8e5cc9 + 4bb3a08 commit ee47060

116 files changed, +3228 -1704 lines changed


.drone.yml

Lines changed: 3 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -25,13 +25,16 @@ steps:
2525
environment:
2626
CODECOV_TOKEN:
2727
from_secret: codecov_token
28+
AUTH_TOKEN:
29+
from_secret: gh_auth_token
2830
MKL_THREADING_LAYER: GNU
2931

3032
commands:
3133
- python --version
3234
- pip --version
3335
- nvidia-smi
3436
- pip install -r ./requirements/devel.txt --upgrade-strategy only-if-needed -v --no-cache-dir
37+
- pip install git+https://${AUTH_TOKEN}@github.com/PyTorchLightning/[email protected] -v --no-cache-dir
3538
# when the image has a defined CUDA version we can switch to this package spec "nvidia-dali-cuda${CUDA_VERSION%%.*}0"
3639
# todo: temporary fix till https://github.com/PyTorchLightning/pytorch-lightning/pull/4922 is resolved
3740
- pip install --extra-index-url https://developer.download.nvidia.com/compute/redist "nvidia-dali-cuda100<0.27" --upgrade-strategy only-if-needed

.github/CODEOWNERS

Lines changed: 35 additions & 10 deletions
@@ -5,23 +5,48 @@
55
# the repo. Unless a later match takes precedence,
66
# @global-owner1 and @global-owner2 will be requested for
77
# review when someone opens a pull request.
8-
* @williamfalcon @borda @teddykoker @awaelchli @nateraw @justusschock @tchaton @SeanNaren @ananyahjha93
8+
* @williamfalcon @borda @tchaton @SeanNaren @awaelchli @justusschock
99

1010
# Metrics
11-
/pytorch_lightning/metrics/* @teddykoker @ananyahjha93 @justusschock
12-
/tests/metrics/* @teddykoker @ananyahjha93 @justusschock
11+
/pytorch_lightning/metrics/ @teddykoker @ananyahjha93 @justusschock
12+
/tests/metrics/ @teddykoker @ananyahjha93 @justusschock
1313
/docs/source/metrics.rst @teddykoker @ananyahjha93 @justusschock
1414

1515
# API
16-
/pytorch_lightning/callbacks/base.py @williamfalcon
17-
/pytorch_lightning/core/datamodule.py @williamfalcon
18-
/pytorch_lightning/trainer/trainer.py @williamfalcon
19-
/pytorch_lightning/core/hooks.py @williamfalcon
20-
/pytorch_lightning/core/lightning.py @williamfalcon
16+
/pytorch_lightning/callbacks/base.py @williamfalcon
17+
/pytorch_lightning/core/datamodule.py @williamfalcon
18+
/pytorch_lightning/trainer/trainer.py @williamfalcon @tchaton
19+
/pytorch_lightning/core/hooks.py @williamfalcon
20+
/pytorch_lightning/core/lightning.py @williamfalcon @tchaton
21+
/pytorch_lightning/core/optimizer.py @tchaton
22+
/pytorch_lightning/trainer/training_loop.py @tchaton @SeanNaren
23+
/pytorch_lightning/trainer/evaluation_loop.py @tchaton @SeanNaren
2124

25+
# Connectors
26+
/pytorch_lightning/trainer/connectors/ @tchaton @SeanNaren
2227

2328
# accelerators
24-
/pytorch_lightning/accelerators/* @williamfalcon
29+
/pytorch_lightning/accelerators/ @williamfalcon @tchaton @SeanNaren @awaelchli @justusschock
2530

2631
# owners
27-
/pytorch_lightning/.github/CODEOWNERS @williamfalcon
32+
/.github/CODEOWNERS @williamfalcon
33+
# main
34+
/README.md @williamfalcon @edenlightning
35+
# installation
36+
/setup.py @borda @williamfalcon
37+
38+
# CI/CD
39+
/.github/workflows/ @borda @tchaton
40+
/.github/*.py @borda @tchaton
41+
/dockers/ @borda @tchaton
42+
# configs in root
43+
/*.yml @borda @tchaton
44+
45+
# Docs
46+
/docs/ @edenlightning @tchaton @borda @awaelchli
47+
/.github/*.md @edenlightning @williamfalcon @borda
48+
/.github/ISSUE_TEMPLATE/*.md @edenlightning @borda @tchaton
49+
/docs/source/conf.py @borda @awaelchli
50+
51+
# Testing
52+
/tests/base/boring_model.py @williamfalcon
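
Several entries in this hunk change from `/dir/*` to `/dir/`. In GitHub's CODEOWNERS syntax, a trailing `/*` matches only files directly inside the directory, while a trailing `/` matches everything beneath it at any depth. The sketch below is a deliberately simplified model of just those two pattern shapes, not GitHub's matcher; the function name `owns` and the example file paths are hypothetical.

```python
def owns(pattern: str, path: str) -> bool:
    """Simplified model of two root-anchored CODEOWNERS pattern shapes.

    Assumption: only '/dir/', '/dir/*' and exact file patterns are handled;
    real CODEOWNERS matching supports more gitignore-style syntax.
    """
    pattern = pattern.lstrip("/")
    if pattern.endswith("/"):
        # Directory pattern: matches every file below it, at any depth.
        return path.startswith(pattern)
    if pattern.endswith("/*"):
        # Single-level wildcard: matches direct children only.
        parent = pattern[:-1]  # keep the trailing slash, drop the '*'
        rest = path[len(parent):]
        return path.startswith(parent) and rest != "" and "/" not in rest
    # Anything else is treated as an exact file path.
    return path == pattern


# Hypothetical paths used only for illustration.
flat = "pytorch_lightning/metrics/metric.py"
nested = "pytorch_lightning/metrics/functional/f_beta.py"

print(owns("/pytorch_lightning/metrics/*", flat))
print(owns("/pytorch_lightning/metrics/*", nested))
print(owns("/pytorch_lightning/metrics/", nested))
```

Under this model the old `/*` rule would miss files in subdirectories, which is presumably why the patterns were switched to the directory form.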

.github/workflows/ci_test-full.yml

Lines changed: 9 additions & 8 deletions
@@ -18,12 +18,9 @@ jobs:
1818
python-version: [3.6, 3.7, 3.8]
1919
requires: ['minimal', 'latest']
2020
exclude:
21-
# excludes PT 1.3 as it is missing on pypi
21+
# # todo: segmentation fault for minimal and hanging for latest
2222
- python-version: 3.8
23-
requires: 'minimal'
24-
# TODO: temporary fix till hanging jobs on macOS for py38 is resolved
25-
- python-version: 3.8
26-
os: macOS-10.15
23+
os: ubuntu-18.04
2724

2825
# Timeout: https://stackoverflow.com/a/59076067/4521646
2926
timeout-minutes: 35 # TODO: the macOS is taking too long, probably caching did not work...
@@ -36,7 +33,8 @@ jobs:
3633

3734
- name: Update Pip
3835
run: |
39-
pip install --quiet "pip>=20.1" --upgrade --user # needed to get the pip cache folder
36+
# todo: unfreeze PIP after resolving minimal dependencies
37+
pip install --quiet "pip==20.1" --upgrade --user # needed to get the pip cache folder
4038
4139
# Github Actions: Run step on specific OS: https://stackoverflow.com/a/57948488/4521646
4240
- name: Setup macOS
@@ -52,16 +50,19 @@ jobs:
5250
python -c "fname = 'requirements/extra.txt' ; lines = [line for line in open(fname).readlines() if not line.startswith('horovod')] ; open(fname, 'w').writelines(lines)"
5351
5452
# versions <= 1.3 may have issues on mac with some BLAS ops due to missing mkl (https://github.com/pytorch/pytorch/issues/18996)
55-
- name: Setup MacOS Minimal
56-
if: runner.os == 'macOS' && matrix.requires == 'minimal'
53+
- name: Adjust minimal for Python 3.8 and MacOS
54+
if: matrix.requires == 'minimal' && (runner.os == 'macOS' || matrix.python-version == 3.8)
5755
run : |
5856
python -c "fname = 'requirements.txt' ; req = open(fname).read().replace('torch>=1.3', 'torch>=1.4') ; open(fname, 'w').write(req)"
57+
python -c "fname = 'requirements/examples.txt' ; req = open(fname).read().replace('torchvision>=0.4.1', 'torchvision>=0.5.0') ; open(fname, 'w').write(req)"
58+
python -c "fname = 'requirements/extra.txt' ; req = open(fname).read().replace('torchtext>=0.3.1', 'torchtext>=0.5.0') ; open(fname, 'w').write(req)"
5959
6060
- name: Set min. dependencies
6161
if: matrix.requires == 'minimal'
6262
run: |
6363
python -c "fname = 'requirements.txt' ; req = open(fname).read().replace('>=', '==') ; open(fname, 'w').write(req)"
6464
python -c "fname = 'requirements/extra.txt' ; req = open(fname).read().replace('>=', '==') ; open(fname, 'w').write(req)"
65+
python -c "fname = 'requirements/loggers.txt' ; req = open(fname).read().replace('>=', '==') ; open(fname, 'w').write(req)"
6566
python -c "fname = 'requirements/test.txt' ; req = open(fname).read().replace('>=', '==') ; open(fname, 'w').write(req)"
6667
python -c "fname = 'requirements/examples.txt' ; req = open(fname).read().replace('>=', '==') ; open(fname, 'w').write(req)"
6768
# remove Fairscale from requirements
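
The `python -c` one-liners in the two steps above all follow the same pattern: read a requirements file, apply a plain string replacement, and write it back. The sketch below restates that logic on an in-memory string so it can be run without touching any files. The function names and the sample requirements text are illustrative assumptions, not part of the workflow.

```python
def raise_floor(requirements: str, old: str, new: str) -> str:
    """Swap one lower bound for another, e.g. 'torch>=1.3' -> 'torch>=1.4'."""
    return requirements.replace(old, new)


def pin_to_minimum(requirements: str) -> str:
    """Turn every lower bound into an exact pin by replacing '>=' with '=='."""
    return requirements.replace(">=", "==")


# Illustrative contents only, not the real requirements.txt.
sample = "torch>=1.3\nnumpy>=1.16.4\n"

# Same order as the workflow: adjust the floor first, then pin everything.
adjusted = raise_floor(sample, "torch>=1.3", "torch>=1.4")
pinned = pin_to_minimum(adjusted)
print(pinned)
```

Because this is a blunt text replacement, a compound specifier such as `a>=1,<2` becomes `a==1,<2`; the upper bound is left in place rather than removed.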

CHANGELOG.md

Lines changed: 76 additions & 3 deletions
@@ -4,8 +4,14 @@ All notable changes to this project will be documented in this file.
44

55
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
66

7+
## Unreleased
8+
9+
### Fixed
10+
11+
- Fixed `LoggerConnector` to have logged metrics on root device in DP ([#4138](https://github.com/PyTorchLightning/pytorch-lightning/pull/4138))
712

8-
## [unreleased.Features] - YYYY-MM-DD
13+
14+
## [1.1.0rc] - 2020-12-02
915

1016
### Added
1117

@@ -30,10 +36,12 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
3036
- Added `current_score` to `ModelCheckpoint.on_save_checkpoint` ([#4721](https://github.com/PyTorchLightning/pytorch-lightning/pull/4721))
3137

3238

33-
- Added logging using `self.log` in train and evaluation for most callbacks and model hooks (
39+
- Added logging using `self.log` in train and evaluation for epoch end hooks (
3440
[#4552](https://github.com/PyTorchLightning/pytorch-lightning/pull/4552),
3541
[#4495](https://github.com/PyTorchLightning/pytorch-lightning/pull/4495),
3642
[#4439](https://github.com/PyTorchLightning/pytorch-lightning/pull/4439),
43+
[#4684](https://github.com/PyTorchLightning/pytorch-lightning/pull/4684),
44+
[#4913](https://github.com/PyTorchLightning/pytorch-lightning/pull/4913))
3745

3846

3947
- Added ability for DDP plugin to modify optimizer state saving ([#4675](https://github.com/PyTorchLightning/pytorch-lightning/pull/4675))
@@ -44,27 +52,75 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
4452

4553
- Added printing of total num of params, trainable and non-trainable params in ModelSummary ([#4521](https://github.com/PyTorchLightning/pytorch-lightning/pull/4521))
4654

55+
4756
- Added optimizer refactors ([#4658](https://github.com/PyTorchLightning/pytorch-lightning/pull/4658))
4857

4958

59+
- Added `PrecisionRecallCurve`, `ROC`, `AveragePrecision` class metrics ([#4549](https://github.com/PyTorchLightning/pytorch-lightning/pull/4549))
60+
61+
62+
- Added custom `Apex` and `NativeAMP` as `Precision plugins` ([#4355](https://github.com/PyTorchLightning/pytorch-lightning/pull/4355))
63+
64+
65+
- Added `DALI MNIST` example ([#3721](https://github.com/PyTorchLightning/pytorch-lightning/pull/3721))
66+
67+
68+
- Added `sharded plugin` for DDP for multi-gpu training memory optimizations (
69+
[#4639](https://github.com/PyTorchLightning/pytorch-lightning/pull/4639),
70+
[#4686](https://github.com/PyTorchLightning/pytorch-lightning/pull/4686),
71+
[#4675](https://github.com/PyTorchLightning/pytorch-lightning/pull/4675),
72+
[#4737](https://github.com/PyTorchLightning/pytorch-lightning/pull/4737),
73+
[#4773](https://github.com/PyTorchLightning/pytorch-lightning/pull/4773))
74+
75+
76+
- Added `experiment_id` to the NeptuneLogger ([#3462](https://github.com/PyTorchLightning/pytorch-lightning/pull/3462))
77+
78+
79+
- Added `PyTorch Geometric` integration example with Lightning ([#4568](https://github.com/PyTorchLightning/pytorch-lightning/pull/4568))
80+
81+
5082
### Changed
5183

84+
- Removed `multiclass_roc` and `multiclass_precision_recall_curve`, use `roc` and `precision_recall_curve` instead ([#4549](https://github.com/PyTorchLightning/pytorch-lightning/pull/4549))
85+
86+
87+
5288
- Tuner algorithms will be skipped if `fast_dev_run=True` ([#3903](https://github.com/PyTorchLightning/pytorch-lightning/pull/3903))
5389

90+
91+
5492
- WandbLogger does not force wandb `reinit` arg to True anymore and creates a run only when needed ([#4648](https://github.com/PyTorchLightning/pytorch-lightning/pull/4648))
5593

5694

95+
- Changed `automatic_optimization` to be a model attribute ([#4602](https://github.com/PyTorchLightning/pytorch-lightning/pull/4602))
96+
97+
98+
- Changed `Simple Profiler` report to order by percentage time spent + num calls ([#4880](https://github.com/PyTorchLightning/pytorch-lightning/pull/4880))
99+
100+
57101
### Deprecated
58102

59103
- Deprecated `prefix` argument in `ModelCheckpoint` ([#4765](https://github.com/PyTorchLightning/pytorch-lightning/pull/4765))
60104

61105

106+
- Deprecated the old way of assigning hyper-parameters through `self.hparams = ...` ([#4813](https://github.com/PyTorchLightning/pytorch-lightning/pull/4813))
107+
108+
109+
- Deprecated `mode='auto'` from `ModelCheckpoint` and `EarlyStopping` ([#4695](https://github.com/PyTorchLightning/pytorch-lightning/pull/4695))
110+
111+
62112
### Removed
63113

64114

65115

66116
### Fixed
67117

118+
- Added feature to move tensors to CPU before saving ([#4309](https://github.com/PyTorchLightning/pytorch-lightning/pull/4309))
119+
120+
- Fixed `LoggerConnector` to have logged metrics on root device in DP ([#4138](https://github.com/PyTorchLightning/pytorch-lightning/pull/4138))
121+
122+
123+
- Auto convert tensors to contiguous format when `gather_all` ([#4907](https://github.com/PyTorchLightning/pytorch-lightning/pull/4907))
68124

69125

70126
## [1.0.8] - 2020-11-24
@@ -82,6 +138,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
82138
- Renamed class metric `Fbeta` >> `FBeta` ([#4656](https://github.com/PyTorchLightning/pytorch-lightning/pull/4656))
83139
- Model summary: add 1 decimal place ([#4745](https://github.com/PyTorchLightning/pytorch-lightning/pull/4745))
84140
- Do not override `PYTHONWARNINGS` ([#4700](https://github.com/PyTorchLightning/pytorch-lightning/pull/4700))
141+
- Moved `init_ddp_connection` from `DDP` to `DDPPlugin` ([#4407](https://github.com/PyTorchLightning/pytorch-lightning/pull/4407))
142+
85143

86144
### Fixed
87145

@@ -122,6 +180,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
122180
- Added `manual_optimizer_step` which works with `AMP Native` and `accumulated_grad_batches` ([#4485](https://github.com/PyTorchLightning/pytorch-lightning/pull/4485))
123181
- Added `persistent(mode)` method to metrics, to enable and disable metric states being added to `state_dict` ([#4482](https://github.com/PyTorchLightning/pytorch-lightning/pull/4482))
124182
- Added congratulations at the end of our notebooks ([#4555](https://github.com/PyTorchLightning/pytorch-lightning/pull/4555))
183+
- Added parameter `move_metrics_to_cpu` in Trainer to avoid GPU memory leak ([#4592](https://github.com/PyTorchLightning/pytorch-lightning/pull/4592))
184+
125185

126186
### Changed
127187

@@ -141,7 +201,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
141201
- Replace `MisconfigurationException` with warning in `ModelCheckpoint` Callback ([#4560](https://github.com/PyTorchLightning/pytorch-lightning/pull/4560))
142202
- Fixed logged keys in mlflow logger ([#4412](https://github.com/PyTorchLightning/pytorch-lightning/pull/4412))
143203
- Fixed `is_picklable` by catching `AttributeError` ([#4508](https://github.com/PyTorchLightning/pytorch-lightning/pull/4508))
144-
204+
- Fixed multi test dataloaders dict `AttributeError` error ([#4480](https://github.com/PyTorchLightning/pytorch-lightning/pull/4480))
205+
- Fixed progress bar to show only for `progress_rank 0` on `DDP_SLURM` ([#4437](https://github.com/PyTorchLightning/pytorch-lightning/pull/4437))
145206

146207
## [1.0.5] - 2020-11-03
147208

@@ -156,6 +217,7 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
156217
- Hook `on_after_backward` is called only when `optimizer_step` is being called ([#4439](https://github.com/PyTorchLightning/pytorch-lightning/pull/4439))
157218
- Moved `track_and_norm_grad` into `training loop` and called only when `optimizer_step` is being called ([#4439](https://github.com/PyTorchLightning/pytorch-lightning/pull/4439))
158219
- Changed type checker with explicit cast of `ref_model` object ([#4457](https://github.com/PyTorchLightning/pytorch-lightning/pull/4457))
220+
- Changed `distributed_backend` -> `accelerator` ([#4429](https://github.com/PyTorchLightning/pytorch-lightning/pull/4429))
159221

160222
### Deprecated
161223

@@ -172,6 +234,9 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
172234
- Fixed TorchScript trace method's data to device and docstring ([#4360](https://github.com/PyTorchLightning/pytorch-lightning/pull/4360))
173235
- Fixed CSV logger warning ([#4419](https://github.com/PyTorchLightning/pytorch-lightning/pull/4419))
174236
- Fixed skip DDP parameter sync ([#4301](https://github.com/PyTorchLightning/pytorch-lightning/pull/4301))
237+
- Fixed `WandbLogger` `_sanitize_callable` function ([#4422](https://github.com/PyTorchLightning/pytorch-lightning/pull/4422))
238+
- Fixed `AMP Native` `_unscale` gradient ([#4441](https://github.com/PyTorchLightning/pytorch-lightning/pull/4441))
239+
175240

176241
## [1.0.4] - 2020-10-27
177242

@@ -183,6 +248,10 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
183248
- Added `fsspec` support for profilers ([#4162](https://github.com/PyTorchLightning/pytorch-lightning/pull/4162))
184249
- Added autogenerated helptext to `Trainer.add_argparse_args` ([#4344](https://github.com/PyTorchLightning/pytorch-lightning/pull/4344))
185250
- Added support for string values in `Trainer`'s `profiler` parameter ([#3656](https://github.com/PyTorchLightning/pytorch-lightning/pull/3656))
251+
- Added support for string values in `Trainer`'s `profiler` parameter ([#3656](https://github.com/PyTorchLightning/pytorch-lightning/pull/3656))
252+
- Added `optimizer_closure` to `optimizer.step` when supported ([#4190](https://github.com/PyTorchLightning/pytorch-lightning/pull/4190))
253+
- Added unification of regression metrics ([#4166](https://github.com/PyTorchLightning/pytorch-lightning/pull/4166))
254+
- Added checkpoint load from Bytes ([#4314](https://github.com/PyTorchLightning/pytorch-lightning/pull/4314))
186255

187256
### Changed
188257

@@ -202,6 +271,10 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
202271
- Fixed setting device ids in DDP ([#4297](https://github.com/PyTorchLightning/pytorch-lightning/pull/4297))
203272
- Fixed synchronization of best model path in `ddp_accelerator` ([#4323](https://github.com/PyTorchLightning/pytorch-lightning/pull/4323))
204273
- Fixed `WandbLogger` not uploading checkpoint artifacts at the end of training ([#4341](https://github.com/PyTorchLightning/pytorch-lightning/pull/4341))
274+
- Fixed `FBeta` computation ([#4183](https://github.com/PyTorchLightning/pytorch-lightning/pull/4183))
275+
- Fixed ensuring that `accumulation across batches` has completed `before breaking training loop` ([#4278](https://github.com/PyTorchLightning/pytorch-lightning/pull/4278))
276+
- Fixed `ModelCheckpoint` to not increase `current_epoch` and `global_step` when not training ([#4291](https://github.com/PyTorchLightning/pytorch-lightning/pull/4291))
277+
- Fixed `COMET_EXPERIMENT_KEY` environment variable usage in comet logger ([#4230](https://github.com/PyTorchLightning/pytorch-lightning/pull/4230))
205278

206279
## [1.0.3] - 2020-10-20
207280
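
The 1.1.0rc "Changed" entry above says the `Simple Profiler` report is now ordered by percentage of time spent plus number of calls. A minimal sketch of one plausible reading follows: rows are sorted by each action's share of total time, with the call count as a tie-breaker. This is an assumption about the intent, not the profiler's actual code; the function name `summarize` and the recorded durations are hypothetical.

```python
def summarize(durations):
    """Build report rows from {action: [seconds, ...]}.

    Each row is (action, num_calls, total_seconds, percentage).
    Assumption: sort by percentage, then by call count, both descending.
    """
    total = sum(sum(values) for values in durations.values())
    rows = []
    for action, values in durations.items():
        spent = sum(values)
        # Guard against an empty recording so we never divide by zero.
        percentage = 100.0 * spent / total if total else 0.0
        rows.append((action, len(values), spent, percentage))
    rows.sort(key=lambda row: (row[3], row[1]), reverse=True)
    return rows


# Hypothetical recorded durations, in seconds.
recorded = {
    "on_epoch_start": [1.0],
    "run_training_batch": [3.0, 3.0],
    "on_batch_end": [1.0, 0.0],
}

for action, calls, spent, percentage in summarize(recorded):
    print(f"{action:<20} calls={calls} total={spent:.1f}s share={percentage:.1f}%")
```

Sorting on a tuple keeps the two criteria in one pass: actions with equal time share are then ordered by how often they were called.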

MANIFEST.in

Lines changed: 1 addition & 0 deletions
@@ -42,6 +42,7 @@ exclude tests
4242
recursive-exclude docs *
4343
exclude docs
4444
recursive-include docs/source/_images/logos/ *
45+
recursive-include docs/source/_images/badges/ *
4546
recursive-include docs/source/_images/general/ pl_overview* tf_* tutorial_* PTL101_*
4647

4748
# Include the Requirements
