deprecate passing ModelCheckpoint instance to Trainer(checkpoint_callback=...) #4336
Conversation
Hello @awaelchli! Thanks for updating this PR. There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-10-30 03:12:23 UTC
```diff
     self,
     logger: Union[LightningLoggerBase, Iterable[LightningLoggerBase], bool] = True,
-    checkpoint_callback: Union[ModelCheckpoint, bool] = True,
+    checkpoint_callback: bool = True,
```
> Q: Why can we not just remove the `checkpoint_callback` argument altogether?
> A: Lightning philosophy is to provide good defaults, this includes checkpointing. We need a way to turn it off though, i.e., `checkpoint_callback=False`

What about keeping the `checkpoint_callback` parameter but changing its default value to `False`? I think it will be pretty annoying to have to set `checkpoint_callback=False` every time you pass a custom `ModelCheckpoint` via `callbacks`. And I think most people use a custom `ModelCheckpoint` rather than just `checkpoint_callback=True`.
> What about keeping the `checkpoint_callback` parameter but changing its default value to `False`?

That doesn't solve the problem I'm trying to solve here, which is to eliminate ambiguity when restoring the state of the trainer. See the answer to the 2nd FAQ question.

> pretty annoying to have to set `checkpoint_callback=False` every time you pass a custom `ModelCheckpoint`

With this PR's proposal, the value will be ignored if you pass in a custom one. `False` is only needed when you want to disable checkpointing completely. I believe I have this covered in a test.
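A minimal sketch of the behaviour described here, with stand-in names (this is not Lightning's actual implementation): the boolean flag only decides whether a default `ModelCheckpoint` is added, and any instance the user supplies via `callbacks` takes precedence.

```python
# Hypothetical sketch; `ModelCheckpoint` and `configure_checkpoint_callbacks`
# are stand-ins for illustration, not pytorch_lightning's real code.

class ModelCheckpoint:
    """Stand-in for pytorch_lightning.callbacks.ModelCheckpoint."""
    def __init__(self, monitor=None):
        self.monitor = monitor

def configure_checkpoint_callbacks(callbacks, checkpoint_callback=True):
    """Resolve the boolean flag against user-supplied callbacks."""
    if any(isinstance(c, ModelCheckpoint) for c in callbacks):
        # A custom instance was passed via `callbacks`: the flag is ignored.
        return list(callbacks)
    if checkpoint_callback:
        # "Good defaults" philosophy: add a default checkpoint callback.
        return list(callbacks) + [ModelCheckpoint()]
    # checkpoint_callback=False and no custom instance: checkpointing is off.
    return list(callbacks)
```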
I agree with @carmocca; this is super confusing when adding my own checkpoint callback. Given how loose the default checkpoint callback is, and with the coming customizations, I'd rather drop the `checkpoint_callback` arg altogether and force everything to be configured through `callbacks`. Given that the callback implementation already exists, I personally don't think it's much of an ask for people to instantiate the checkpoint callback (and confirm their settings while doing so) and pass it along to the trainer.

I also think that's a nice message for users: "see how extensible this framework is" vs. "look at all the magic this trainer configures for you which you can't change".

Even if that's not in this PR, it feels inevitable that `checkpoint_callback=False` would eventually be the new default, and then later we could drop the arg altogether.
Yes, fine with me; I don't have a strong preference here.

Note that this PR does NOT close the discussion on whether there should be a `checkpoint_callback` arg or not. I'm simply restricting what can be passed to the argument. It looks like a lot of API change, but it is really more of a bugfix.
@williamFalcon @teddykoker remove ambiguity by making `checkpoint_callback` boolean. sg?
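A rough sketch of what such a deprecation path could look like (the helper name and return shape are assumptions for illustration; the PR's actual internals may differ): the old `Union[ModelCheckpoint, bool]` value is normalised to a bool plus an extra callback, with a deprecation warning.

```python
# Hypothetical sketch, not Lightning's real code.
import warnings

class ModelCheckpoint:
    """Stand-in for pytorch_lightning.callbacks.ModelCheckpoint."""

def resolve_checkpoint_flag(checkpoint_callback):
    """Normalise the deprecated Union[ModelCheckpoint, bool] argument.

    Returns a tuple (enable_checkpointing, extra_callbacks).
    """
    if isinstance(checkpoint_callback, ModelCheckpoint):
        warnings.warn(
            "Passing a ModelCheckpoint instance to `checkpoint_callback` is "
            "deprecated; pass it via `callbacks=[...]` instead.",
            DeprecationWarning,
        )
        # Keep old behaviour working: treat the instance as a user callback.
        return True, [checkpoint_callback]
    return bool(checkpoint_callback), []
```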
```python
trainer.weights_summary = None  # not needed before full run
trainer.logger = DummyLogger()
trainer.callbacks = []  # not needed before full run
trainer.checkpoint_callback = False  # required for saving
```
I removed these from Tuner because ModelCheckpoint now entirely lives in callbacks list, and this is properly backed up by Tuner already.
cc @SkafteNicki
Codecov Report

```diff
@@           Coverage Diff           @@
##           master   #4336    +/-  ##
=======================================
+ Coverage      91%     93%     +2%
=======================================
  Files         113     111      -2
  Lines        8323    8138    -185
=======================================
- Hits         7588    7563     -25
+ Misses        735     575    -160
```
Co-authored-by: Carlos Mocholí <[email protected]>
Co-authored-by: Jeff Yang <[email protected]>
s-rog left a comment:
So the change makes `checkpoint_callback` a bool, and to add args to it (monitor, top_k, dir, etc.) a custom one is to be passed into the trainer with `callbacks=[checkpoint_callback]`?

If so, the warning in the trainer docs, "Only user defined callbacks (ie: Not ModelCheckpoint)", needs to be removed as well (trainer/__init__.py line 490).
@s-rog Yes, correct. I have already updated the docs you mention on another branch, which will follow this one. I didn't want to make the PR too big, so I'm keeping the docs updates separate.
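To make the migration concrete, here is an illustrative before/after using stand-ins for the Lightning classes (parameter names such as `monitor` and `save_top_k` are assumed to mirror the real `ModelCheckpoint`; this is not library code):

```python
# Stand-in classes for illustration only, not pytorch_lightning's real code.

class ModelCheckpoint:
    """Stand-in: checkpointing behaviour is configured explicitly here."""
    def __init__(self, monitor=None, save_top_k=1):
        self.monitor = monitor
        self.save_top_k = save_top_k

class Trainer:
    """Stand-in showing only the two arguments under discussion."""
    def __init__(self, checkpoint_callback=True, callbacks=()):
        self.checkpoint_callback = checkpoint_callback  # bool only, after this PR
        self.callbacks = list(callbacks)

# Deprecated style: Trainer(checkpoint_callback=ModelCheckpoint(...))
# New style: configure the instance yourself and pass it through `callbacks`.
ckpt = ModelCheckpoint(monitor="val_loss", save_top_k=3)
trainer = Trainer(callbacks=[ckpt])
```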
deprecate passing ModelCheckpoint instance to Trainer(checkpoint_callback=...) (#4336)

* first attempt
* update tests
* support multiple
* test bugfix
* changelog
* pep
* pep
* import order
* import
* improve test for resuming
* test
* update test
* add references test

Co-authored-by: Carlos Mocholí <[email protected]>

* docstring suggestion deprecation

Co-authored-by: Jeff Yang <[email protected]>

* paramref

Co-authored-by: Carlos Mocholí <[email protected]>
Co-authored-by: Jeff Yang <[email protected]>
(cherry picked from commit d1234c5)
What does this PR do?

Pitch: Deprecate passing a `ModelCheckpoint` instance directly to the `checkpoint_callback` argument. Allow only bool.

FAQ

Q: Why can we not just remove the `checkpoint_callback` argument altogether?

A: Lightning philosophy is to provide good defaults, and this includes checkpointing. We need a way to turn it off though, i.e., `checkpoint_callback=False`.

Q: Why can't we keep it the way it is?

A: When we add `Trainer(checkpoint_callback=mycustomcallback)`, internally the Trainer adds it to `self.callbacks`. A problem arises when we reload the Trainer using `resume_from_checkpoint="..."`. In this case we would end up with a conflict, and a decision needs to be made: do we load the callback that was saved in the checkpoint, or do we overwrite it with the one that is passed in to the Trainer? We can try to handle this case and make the decision for the user (#4027), or we go with this PR to remove the ambiguity altogether.

Alternatives

- Remove `checkpoint_callback`. This implies no default checkpointing.
- Keep `checkpoint_callback`, and if `trainer.callbacks` is empty, add a ModelCheckpoint automatically. If any other callbacks need to be passed in, we need to supply a ModelCheckpoint manually.

TODOs for this PR:
Fixes #4014 automatically
Maybe Fixes #4386 (need to reproduce first)
Closes #3990
Closes #4027
Related to #4335, as it is a good step in the direction of allowing multiple model checkpoints.
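The resume ambiguity described in the FAQ disappears once the instance can only come from `callbacks`: on resume there is exactly one object to restore state into. A toy sketch (the `load_state` method and state dict keys are hypothetical, not Lightning's actual restore code):

```python
# Stand-in for illustration only; not pytorch_lightning's real restore logic.

class ModelCheckpoint:
    """Stand-in for the real callback; only bookkeeping fields are shown."""
    def __init__(self, monitor="val_loss"):
        self.monitor = monitor
        self.best_model_score = None

    def load_state(self, state):
        # Restore the bookkeeping saved in the checkpoint file into the
        # user's own instance; its configuration (e.g. `monitor`) stays as
        # the user set it, so there is no second, conflicting instance.
        self.best_model_score = state["best_model_score"]

ckpt = ModelCheckpoint(monitor="val_acc")
ckpt.load_state({"best_model_score": 0.12})  # state read from the checkpoint
```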