
Conversation

@george-gca
Contributor

Signed-off-by: gca [email protected]

What does this PR do?

Fixes #6117. It clarifies the documentation regarding monitored metric values.

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Always :)

Save the model after every epoch by monitoring a quantity. Every metric logged with
:meth:`~pytorch_lightning.core.lightning.log` or :meth:`~pytorch_lightning.core.lightning.log_dict` in
LightningModule is a candidate for the monitor key. For more information, see
:ref:`common/weights_loading:Checkpoint saving`.
Contributor

Not 100% sure, but I think we can only reference sections by title within the same document.
You probably have to refer to just :ref:`weights_loading`
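
Returning to the docstring text quoted above: a minimal sketch of how a logged metric becomes a candidate monitor key for ModelCheckpoint. The LitModel class, metric names, layer sizes, and optimizer settings here are illustrative only, not part of this PR:

import torch
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        self.log("train_loss", loss)  # candidate monitor key
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        val_loss = torch.nn.functional.mse_loss(self.layer(x), y)
        # Everything logged here is also a candidate monitor key.
        self.log("val_loss", val_loss)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

# The monitor argument must match one of the logged keys.
checkpoint_cb = ModelCheckpoint(monitor="val_loss", mode="min")
trainer = pl.Trainer(max_epochs=3, callbacks=[checkpoint_cb])

Keys logged through log_dict work the same way: each dictionary key becomes a valid monitor value.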

@awaelchli added the docs (Documentation related) label on Apr 7, 2021
 class ModelCheckpoint(Callback):
     r"""
-    Save the model after every epoch by monitoring a quantity.
+    Save the model after every epoch by monitoring a quantity. Every metric logged with
Contributor

Is this an old version of Lightning? The "after every epoch" part isn't true - there are more options being added for greater flexibility. See https://github.com/PyTorchLightning/pytorch-lightning/blob/19e67d18c472c3a03dec4dd9bfcef031e9ca8719/pytorch_lightning/callbacks/model_checkpoint.py#L96-L107

Contributor (Author)

Actually it was like that in the documentation, which is why I didn't change the "after every epoch" part. But you are right. What do you think about this, before the part that I added:

Save the model after every `every_n_train_steps` training steps or every `every_n_val_epochs` validation epochs by monitoring a quantity. These options are mutually exclusive.

Contributor

I would not mention the arguments before they are introduced later on.
How about:

Save the model periodically by monitoring a quantity. ...
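
For reference, a minimal sketch of the two periodic triggers discussed in this thread, assuming the Lightning 1.3-era ModelCheckpoint signature (every_n_val_epochs was renamed in later releases); the numeric values are illustrative:

from pytorch_lightning.callbacks import ModelCheckpoint

# Trigger a checkpoint every 100 optimizer steps, keeping the 3 best
# checkpoints ranked by the monitored metric.
step_ckpt = ModelCheckpoint(
    monitor="train_loss",
    every_n_train_steps=100,
    save_top_k=3,
)

# Or trigger every 2 validation epochs instead. Passing positive values
# for both every_n_train_steps and every_n_val_epochs is invalid, since
# the two triggers are mutually exclusive.
epoch_ckpt = ModelCheckpoint(
    monitor="val_loss",
    every_n_val_epochs=2,
)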

@carmocca added this to the 1.3 milestone on Apr 7, 2021
@george-gca requested a review from kaushikb11 as a code owner on April 26, 2021
@awaelchli added the ready (PRs ready to be merged) label on Apr 27, 2021
@awaelchli enabled auto-merge (squash) on April 27, 2021
@codecov

codecov bot commented Apr 27, 2021

Codecov Report

Merging #6873 (ae39828) into master (7a48db5) will decrease coverage by 4%.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #6873    +/-   ##
=======================================
- Coverage      91%     87%    -4%     
=======================================
  Files         199     199            
  Lines       12799   12799            
=======================================
- Hits        11701   11168   -533     
- Misses       1098    1631   +533     

@carmocca disabled auto-merge on April 28, 2021
@carmocca enabled auto-merge (squash) on April 28, 2021
@carmocca merged commit e272bea into Lightning-AI:master on Apr 28, 2021
@george-gca deleted the doc/model_checkpoint branch on April 30, 2021

Labels

docs (Documentation related), ready (PRs ready to be merged)

Development

Successfully merging this pull request may close these issues.

ModelCheckpoint is not saving top k models

4 participants