
Commit 4cc7223

rohitgr7, Borda, and awaelchli authored
Fix num batches in case of multiple dataloaders and percent_check (#1920)
* git conflict Co-authored-by: Jirka Borovec <[email protected]> Co-authored-by: Adrian Wälchli <[email protected]>
1 parent db84ca9 commit 4cc7223
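For orientation, the scenario the commit title refers to looks roughly like the sketch below. It is illustrative only: the flag name ``val_percent_check`` follows the 0.7.x-era API this fix targets (later renamed to ``limit_val_batches``), and ``LitModel`` is a hypothetical LightningModule with its other required methods and datasets omitted.

.. code-block:: python

    from torch.utils.data import DataLoader
    from pytorch_lightning import Trainer
    from pytorch_lightning.core.lightning import LightningModule

    class LitModel(LightningModule):
        # training_step, configure_optimizers, etc. omitted for brevity

        def val_dataloader(self):
            # returning more than one dataloader is the case this commit addresses:
            # the number of validation batches has to be tracked per dataloader.
            # self.val_set_a / self.val_set_b are assumed Dataset attributes.
            return [DataLoader(self.val_set_a, batch_size=32),
                    DataLoader(self.val_set_b, batch_size=32)]

    # run validation on only 25% of the batches of *each* validation dataloader
    trainer = Trainer(val_percent_check=0.25)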

File tree: 15 files changed, +81 / -78 lines changed


docs/source/debugging.rst

Lines changed: 7 additions & 7 deletions
@@ -6,7 +6,7 @@ Debugging
 =========
 The following are flags that make debugging much easier.
 
------------------
+---
 
 fast_dev_run
 ------------
@@ -21,7 +21,7 @@ argument of :class:`~pytorch_lightning.trainer.trainer.Trainer`)
 
     trainer = Trainer(fast_dev_run=True)
 
------------------
+---
 
 Inspect gradient norms
 ----------------------
@@ -35,7 +35,7 @@ argument of :class:`~pytorch_lightning.trainer.trainer.Trainer`)
     # the 2-norm
     trainer = Trainer(track_grad_norm=2)
 
------------------
+---
 
 Log GPU usage
 -------------
@@ -48,7 +48,7 @@ argument of :class:`~pytorch_lightning.trainer.trainer.Trainer`)
 
     trainer = Trainer(log_gpu_memory=True)
 
------------------
+---
 
 Make model overfit on subset of data
 ------------------------------------
@@ -70,7 +70,7 @@ argument of :class:`~pytorch_lightning.trainer.trainer.Trainer`)
 With this flag, the train, val, and test sets will all be the same train set. We will also replace the sampler
 in the training set to turn off shuffle for you.
 
------------------
+---
 
 Print a summary of your LightningModule
 ---------------------------------------
@@ -99,7 +99,7 @@ See Also:
 - :paramref:`~pytorch_lightning.trainer.trainer.Trainer.weights_summary` Trainer argument
 - :class:`~pytorch_lightning.core.memory.ModelSummary`
 
------------------
+---
 
 Shorten epochs
 --------------
@@ -116,7 +116,7 @@ On larger datasets like Imagenet, this can help you debug or test a few things f
     # use 10 batches of train and 5 batches of val
     trainer = Trainer(limit_train_batches=10, limit_val_batches=5)
 
------------------
+---
 
 Set the number of validation sanity steps
 -----------------------------------------
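Taken together, the flags visible in the hunks above can be combined on one Trainer while debugging; a minimal sketch using only the argument names shown in this file:

.. code-block:: python

    from pytorch_lightning import Trainer

    # run a single batch through train/val/test to smoke-test the code path
    trainer = Trainer(fast_dev_run=True)

    # or shorten epochs and log extra diagnostics
    trainer = Trainer(
        limit_train_batches=10,  # 10 training batches per epoch
        limit_val_batches=5,     # 5 validation batches per epoch
        track_grad_norm=2,       # log the 2-norm of the gradients
        log_gpu_memory=True,     # log GPU memory usage
    )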

docs/source/experiment_logging.rst

Lines changed: 8 additions & 8 deletions
@@ -7,7 +7,7 @@
 Experiment Logging
 ==================
 
--------------------
+---
 
 Comet.ml
 ^^^^^^^^
@@ -49,7 +49,7 @@ The :class:`~pytorch_lightning.loggers.CometLogger` is available anywhere except
 .. seealso::
     :class:`~pytorch_lightning.loggers.CometLogger` docs.
 
--------------------
+---
 
 MLflow
 ^^^^^^
@@ -76,7 +76,7 @@ Then configure the logger and pass it to the :class:`~pytorch_lightning.trainer.
 .. seealso::
     :class:`~pytorch_lightning.loggers.MLFlowLogger` docs.
 
--------------------
+---
 
 Neptune.ai
 ^^^^^^^^^^
@@ -116,7 +116,7 @@ The :class:`~pytorch_lightning.loggers.NeptuneLogger` is available anywhere exce
 .. seealso::
     :class:`~pytorch_lightning.loggers.NeptuneLogger` docs.
 
--------------------
+---
 
 allegro.ai TRAINS
 ^^^^^^^^^^^^^^^^^
@@ -160,7 +160,7 @@ The :class:`~pytorch_lightning.loggers.TrainsLogger` is available anywhere in yo
 .. seealso::
     :class:`~pytorch_lightning.loggers.TrainsLogger` docs.
 
--------------------
+---
 
 Tensorboard
 ^^^^^^^^^^^
@@ -186,7 +186,7 @@ The :class:`~pytorch_lightning.loggers.TensorBoardLogger` is available anywhere
 .. seealso::
     :class:`~pytorch_lightning.loggers.TensorBoardLogger` docs.
 
--------------------
+---
 
 Test Tube
 ^^^^^^^^^
@@ -221,7 +221,7 @@ The :class:`~pytorch_lightning.loggers.TestTubeLogger` is available anywhere exc
 .. seealso::
     :class:`~pytorch_lightning.loggers.TestTubeLogger` docs.
 
--------------------
+---
 
 Weights and Biases
 ^^^^^^^^^^^^^^^^^^
@@ -257,7 +257,7 @@ The :class:`~pytorch_lightning.loggers.WandbLogger` is available anywhere except
 .. seealso::
     :class:`~pytorch_lightning.loggers.WandbLogger` docs.
 
--------------------
+---
 
 Multiple Loggers
 ^^^^^^^^^^^^^^^^
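Each of the loggers listed above follows the same pattern: build the logger, then hand it to the Trainer. A brief sketch with the TensorBoard logger (the constructor arguments here are illustrative, not taken from this diff):

.. code-block:: python

    from pytorch_lightning import Trainer
    from pytorch_lightning.loggers import TensorBoardLogger

    logger = TensorBoardLogger(save_dir="lightning_logs", name="my_experiment")
    trainer = Trainer(logger=logger)

    # several loggers can also be combined by passing a list,
    # which is what the "Multiple Loggers" section above covers
    # trainer = Trainer(logger=[logger, another_logger])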

docs/source/experiment_reporting.rst

Lines changed: 2 additions & 0 deletions
@@ -104,6 +104,7 @@ Here we show the validation loss in the progress bar
 
 Snapshot hyperparameters
 ^^^^^^^^^^^^^^^^^^^^^^^^
+
 When training a model, it's useful to know what hyperparams went into that model.
 When Lightning creates a checkpoint, it stores a key "hparams" with the hyperparams.
 
@@ -118,6 +119,7 @@ in the `hparams tab <https://pytorch.org/docs/stable/tensorboard.html#torch.util
 
 Snapshot code
 ^^^^^^^^^^^^^
+
 Loggers also allow you to snapshot a copy of the code used in this experiment.
 For example, TestTubeLogger does this with a flag:
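The flag referred to in the last hunk is, to the best of my reading of that era's API, ``create_git_tag`` on the TestTubeLogger; a hedged sketch with illustrative argument values:

.. code-block:: python

    from pytorch_lightning import Trainer
    from pytorch_lightning.loggers import TestTubeLogger

    # create_git_tag tags the current commit so the exact code can be recovered later
    # (assumes the working directory is a git repository)
    logger = TestTubeLogger(save_dir="lightning_logs", create_git_tag=True)
    trainer = Trainer(logger=logger)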

docs/source/fast_training.rst

Lines changed: 4 additions & 4 deletions
@@ -8,7 +8,7 @@ Fast Training
 There are multiple options to speed up different parts of the training by choosing to train
 on a subset of data. This could be done for speed or debugging purposes.
 
-----------------------
+---
 
 Check validation every n epochs
 -------------------------------
@@ -19,7 +19,7 @@ If you have a small dataset you might want to check validation every n epochs
     # DEFAULT
     trainer = Trainer(check_val_every_n_epoch=1)
 
-----------------------
+---
 
 Force training for min or max epochs
 ------------------------------------
@@ -33,7 +33,7 @@ It can be useful to force training for a minimum number of epochs or limit to a
     # DEFAULT
     trainer = Trainer(min_epochs=1, max_epochs=1000)
 
-----------------------
+---
 
 Set validation check frequency within 1 training epoch
 ------------------------------------------------------
@@ -52,7 +52,7 @@ Must use an int if using an IterableDataset.
     # check every 100 train batches (ie: for IterableDatasets or fixed frequency)
     trainer = Trainer(val_check_interval=100)
 
-----------------------
+---
 
 Use data subset for training, validation and test
 -------------------------------------------------
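A combined sketch of the flags touched in this file, using the values shown in the hunks above plus one illustrative fractional setting for ``val_check_interval``:

.. code-block:: python

    from pytorch_lightning import Trainer

    # run validation only every 4th training epoch
    trainer = Trainer(check_val_every_n_epoch=4)

    # train for at least 1 and at most 1000 epochs
    trainer = Trainer(min_epochs=1, max_epochs=1000)

    # run validation twice per training epoch (as a fraction of the epoch),
    # or every 100 training batches when using an IterableDataset
    trainer = Trainer(val_check_interval=0.5)
    trainer = Trainer(val_check_interval=100)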

docs/source/introduction_guide.rst

Lines changed: 13 additions & 14 deletions
@@ -17,7 +17,7 @@ To illustrate, here's the typical PyTorch project structure organized in a Light
 As your project grows in complexity with things like 16-bit precision, distributed training, etc... the part in blue
 quickly becomes onerous and starts distracting from the core research code.
 
----------
+---
 
 Goal of this guide
 ------------------
@@ -32,7 +32,7 @@ to use inheritance to very quickly create an AutoEncoder.
 .. note:: Any DL/ML PyTorch project fits into the Lightning structure. Here we just focus on 3 types
     of research to illustrate.
 
----------
+---
 
 Installing Lightning
 --------------------
@@ -55,8 +55,7 @@ Or with conda
 
     conda install pytorch-lightning -c conda-forge
 
-
----------
+---
 
 Lightning Philosophy
 --------------------
@@ -118,7 +117,7 @@ In Lightning this code is abstracted out by `Callbacks`.
         generated = decoder(z)
         self.experiment.log('images', generated)
 
----------
+---
 
 Elements of a research project
 ------------------------------
@@ -381,7 +380,7 @@ in the LightningModule
 Again, this is the same PyTorch code except that it has been organized by the LightningModule.
 This code is not restricted which means it can be as complicated as a full seq-2-seq, RL loop, GAN, etc...
 
----------
+---
 
 Training
 --------
@@ -587,11 +586,11 @@ Notice the epoch is MUCH faster!
 .. figure:: /_images/mnist_imgs/tpu_fast.png
     :alt: TPU speed
 
----------
+---
 
 .. include:: hyperparameters.rst
 
----------
+---
 
 Validating
 ----------
@@ -670,7 +669,7 @@ in the validation loop, you won't need to potentially wait a full epoch to find
 
 .. note:: Lightning disables gradients, puts model in eval mode and does everything needed for validation.
 
----------
+---
 
 Testing
 -------
@@ -741,7 +740,7 @@ You can also run the test from a saved lightning model
 
 .. warning:: .test() is not stable yet on TPUs. We're working on getting around the multiprocessing challenges.
 
----------
+---
 
 Predicting
 ----------
@@ -842,7 +841,7 @@ Or maybe we have a model that we use to do generation
 How you split up what goes in `forward` vs `training_step` depends on how you want to use this model for
 prediction.
 
----------
+---
 
 Extensibility
 -------------
@@ -903,7 +902,7 @@ you could do your own:
 Every single part of training is configurable this way.
 For a full list look at `LightningModule <lightning-module.rst>`_.
 
----------
+---
 
 Callbacks
 ---------
@@ -940,10 +939,10 @@ And pass the callbacks into the trainer
 .. note::
     See full list of 12+ hooks in the :ref:`callbacks`.
 
----------
+---
 
 .. include:: child_modules.rst
 
----------
+---
 
 .. include:: transfer_learning.rst
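The callbacks hunk above sits under "And pass the callbacks into the trainer"; a minimal sketch of that pattern (the callback class and hook shown here are a generic example, not taken from this diff):

.. code-block:: python

    from pytorch_lightning import Trainer
    from pytorch_lightning.callbacks import Callback

    class PrintingCallback(Callback):
        def on_train_start(self, trainer, pl_module):
            # runs once, right before the first training epoch
            print("Training is starting")

    trainer = Trainer(callbacks=[PrintingCallback()])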

docs/source/metrics.rst

Lines changed: 2 additions & 9 deletions
@@ -1,5 +1,6 @@
 .. testsetup:: *
 
+    import torch
     from torch.nn import Module
     from pytorch_lightning.core.lightning import LightningModule
     from pytorch_lightning.metrics import TensorMetric, NumpyMetric
@@ -25,10 +26,6 @@ Example::
     # calculates accuracy across all GPUs and all Nodes used in training
     accuracy(pred, target)
 
-Out::
-
-    tensor(0.7500)
-
 .. warning::
     The metrics package is still in development! If we're missing a metric or you find a mistake, please send a PR!
     to a few metrics. Please feel free to create an issue/PR if you have a proposed
@@ -228,7 +225,7 @@ Functional Metrics
 ------------------
 Functional metrics can be called anywhere (even used with just plain PyTorch).
 
-.. testcode::
+.. code-block:: python
 
     from pytorch_lightning.metrics.functional import accuracy
 
@@ -238,10 +235,6 @@ Functional metrics can be called anywhere (even used with just plain PyTorch).
     # calculates accuracy across all GPUs and all Nodes used in training
     accuracy(pred, target)
 
-.. testoutput::
-
-    tensor(0.7500)
-
 These metrics even work when using distributed training:
 
 .. code-block:: python
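For reference, the functional accuracy call shown above behaves as in the snippet below; the input tensors are illustrative and consistent with the ``tensor(0.7500)`` output lines this commit removes from the rendered docs:

.. code-block:: python

    import torch
    from pytorch_lightning.metrics.functional import accuracy

    pred = torch.tensor([0, 1, 2, 3])
    target = torch.tensor([0, 1, 2, 2])

    # 3 of 4 predictions match the target, so this returns an accuracy of 0.75
    accuracy(pred, target)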

docs/source/optimizers.rst

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@ Optimization
 ===============
 
 Learning rate scheduling
--------------------------------------
+------------------------
 Every optimizer you use can be paired with any `LearningRateScheduler <https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate>`_.
 
 .. testcode::
@@ -41,7 +41,7 @@ Every optimizer you use can be paired with any `LearningRateScheduler <https://p
 
 
 Use multiple optimizers (like GANs)
--------------------------------------
+-----------------------------------
 To use multiple optimizers return > 1 optimizers from :meth:`pytorch_lightning.core.LightningModule.configure_optimizers`
 
 .. testcode::
@@ -73,7 +73,7 @@ Lightning will call each optimizer sequentially:
 
 
 Step optimizers at arbitrary intervals
-----------------------------------------
+--------------------------------------
 To do more interesting things with your optimizers such as learning rate warm-up or odd scheduling,
 override the :meth:`optimizer_step` function.
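As a sketch of the multiple-optimizer pattern named above (attribute names such as ``self.generator`` are placeholders), return one optimizer per component from ``configure_optimizers``, optionally paired with one scheduler each:

.. code-block:: python

    import torch

    # inside a LightningModule (e.g. a GAN)
    def configure_optimizers(self):
        opt_g = torch.optim.Adam(self.generator.parameters(), lr=2e-4)
        opt_d = torch.optim.Adam(self.discriminator.parameters(), lr=2e-4)
        sch_g = torch.optim.lr_scheduler.StepLR(opt_g, step_size=10)
        sch_d = torch.optim.lr_scheduler.StepLR(opt_d, step_size=10)
        # Lightning calls opt_g and opt_d sequentially for each batch
        return [opt_g, opt_d], [sch_g, sch_d]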

docs/source/sequences.rst

Lines changed: 3 additions & 3 deletions
@@ -9,7 +9,7 @@ Lightning has built in support for dealing with sequential data.
 
 
 Packed sequences as inputs
-----------------------------
+--------------------------
 When using PackedSequence, do 2 things:
 
 1. return either a padded tensor in dataset or a list of variable length tensors in the dataloader collate_fn (example above shows the list implementation).
@@ -29,7 +29,7 @@ When using PackedSequence, do 2 things:
     y = rnn.pack_sequence(batch[1], enforce_sorted=False)
 
 Truncated Backpropagation Through Time
----------------------------------------
+--------------------------------------
 There are times when multiple backwards passes are needed for each batch.
 For example, it may save memory to use Truncated Backpropagation Through Time when training RNNs.
 
@@ -50,7 +50,7 @@ Lightning can handle TBTT automatically via this flag.
     a `hiddens` arg.
 
 Iterable Datasets
----------------------------------------
+-----------------
 Lightning supports using IterableDatasets as well as map-style Datasets. IterableDatasets provide a more natural
 option when using sequential data.
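One way to arrive at the packed sequences referenced above is to return plain lists of variable-length tensors from the dataloader's collate_fn and pack them inside the training step; a brief sketch (the collate function shown is illustrative):

.. code-block:: python

    from torch.nn.utils import rnn

    def collate_fn(batch):
        # keep variable-length tensors as plain lists; packing happens in the module
        x = [item[0] for item in batch]
        y = [item[1] for item in batch]
        return x, y

    # inside training_step: pack the lists so the RNN can skip the padding
    # x = rnn.pack_sequence(batch[0], enforce_sorted=False)
    # y = rnn.pack_sequence(batch[1], enforce_sorted=False)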

docs/source/single_gpu.rst

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
     from pytorch_lightning.trainer.trainer import Trainer
 
 Single GPU Training
-====================
+===================
 Make sure you are running on a machine that has at least one GPU. Lightning handles all the NVIDIA flags for you,
 there's no need to set them yourself.
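The rest of that page (not shown in this hunk) comes down to selecting a device count on the Trainer; a minimal sketch using the era's ``gpus`` argument:

.. code-block:: python

    from pytorch_lightning import Trainer

    # train on a single GPU; Lightning sets the CUDA flags for you
    trainer = Trainer(gpus=1)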

0 commit comments
