[WIP] Add `Loop.stop()` #8604

carmocca · 2021-07-28T16:59:00Z

What does this PR do?

Add Loop.stop() to stop a loop and replace trainer.should_stop

Necessary for #8578

Does your PR introduce any breaking changes? If yes, please list them.

TODO

Before submitting

Was this discussed/approved via a GitHub issue? (not for typos and docs)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified

codecov · 2021-07-28T17:01:57Z

Codecov Report

Merging #8604 (9548f73) into master (aadd2a9) will decrease coverage by 49%.
The diff coverage is 21%.

@@           Coverage Diff            @@
##           master   #8604     +/-   ##
========================================
- Coverage      92%     44%    -49%     
========================================
  Files         218     169     -49     
  Lines       14407   13965    -442     
========================================
- Hits        13305    6092   -7213     
- Misses       1102    7873   +6771

tchaton · 2021-07-29T07:54:57Z

pytorch_lightning/callbacks/early_stopping.py

        should_stop, reason = self._evalute_stopping_criteria(current)
-
        # stop every ddp process if any world process decides to stop
        should_stop = trainer.training_type_plugin.reduce_boolean_decision(should_stop)


Should we move the accelerator reduction into should_stop in the loops?

Possibly. It would clean up the callbacks that use it.

    # loops/base.py
    def stop(self, should_stop: bool = True) -> bool:
        should_stop = self.trainer.training_type_plugin.reduce_boolean_decision(should_stop)
        self.should_stop = should_stop
        ...
        return should_stop

Thoughts? @justusschock @awaelchli
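For readers following the suggestion above, here is a minimal, self-contained sketch of a loop-level `stop()` that performs the reduction itself. `SingleProcessPlugin` and this `Loop` class are hypothetical stand-ins written for illustration, not Lightning's classes; the only name taken from the thread is `reduce_boolean_decision`, and its single-process behavior here is an assumption.

```python
class SingleProcessPlugin:
    """Hypothetical stand-in for a training type plugin (not Lightning's class)."""

    def reduce_boolean_decision(self, decision: bool) -> bool:
        # Assumption: with a single process there is nothing to synchronize,
        # so the local decision is returned unchanged.
        return decision


class Loop:
    """Hypothetical minimal loop that only illustrates the proposed stop()."""

    def __init__(self, plugin: SingleProcessPlugin) -> None:
        self.plugin = plugin
        self.should_stop = False

    def stop(self, should_stop: bool = True) -> bool:
        # Reduce first, so every caller gets the synchronized decision.
        should_stop = self.plugin.reduce_boolean_decision(should_stop)
        # Keep the flag sticky: a later stop(False) does not undo a stop.
        self.should_stop = self.should_stop or should_stop
        return should_stop


loop = Loop(SingleProcessPlugin())
loop.stop(False)  # no stop requested yet
loop.stop()       # request a stop
```

The sticky update mirrors the `trainer.should_stop or should_stop` pattern used in the early stopping callback; whether the real API should behave that way is a design choice this sketch does not settle.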

pytorch_lightning/callbacks/gpu_stats_monitor.py

pytorch_lightning/loops/base.py

pytorch_lightning/loops/fit_loop.py

tchaton · 2021-07-29T08:01:34Z

@carmocca really like the design :)

awaelchli · 2021-08-27T12:38:03Z

pytorch_lightning/loops/epoch/training_epoch_loop.py

        if is_last_batch and is_infinite_dataset:
            return True

-        if self.trainer.should_stop:


as discussed, we might actually want to get rid of this completely.
it's not well justified why we would want to run validation at the point of stopping.

Sounds good to me!

ananthsub · 2021-08-27T15:39:25Z

pytorch_lightning/callbacks/early_stopping.py

        should_stop = trainer.training_type_plugin.reduce_boolean_decision(should_stop)
-        trainer.should_stop = trainer.should_stop or should_stop
        if should_stop:
+            trainer._active_loop.stop()


this starts to leak the loop internals to the callback. i don't think people should reach into the trainer's internals like this.

returning a signal from the callback hook that the trainer interprets could be an alternative that avoids this

would making _active_loop public or adding def stop() to properties as a utility work for you?
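The alternative raised in this thread, a callback hook that returns a signal for the trainer to interpret, could look roughly like the sketch below. Everything in it (`Signal`, `PatienceCallback`, `MiniTrainer`, and the hook signature) is hypothetical and invented for illustration; it is not how Lightning's callbacks work.

```python
from enum import Enum, auto
from typing import Iterable, List


class Signal(Enum):
    """Hypothetical return value of a callback hook."""

    CONTINUE = auto()
    STOP = auto()


class PatienceCallback:
    """Hypothetical callback: asks to stop after `patience` epochs without improvement."""

    def __init__(self, patience: int) -> None:
        self.patience = patience
        self.best = float("inf")
        self.wait = 0

    def on_validation_end(self, loss: float) -> Signal:
        if loss < self.best:
            self.best = loss
            self.wait = 0
            return Signal.CONTINUE
        self.wait += 1
        return Signal.STOP if self.wait >= self.patience else Signal.CONTINUE


class MiniTrainer:
    """Hypothetical trainer: it alone decides how to act on the signals."""

    def __init__(self, callbacks: List[PatienceCallback]) -> None:
        self.callbacks = callbacks

    def fit(self, losses: Iterable[float]) -> int:
        epochs_run = 0
        for loss in losses:
            epochs_run += 1
            signals = [cb.on_validation_end(loss) for cb in self.callbacks]
            # The callback never touches the loop; it only reports a decision.
            if Signal.STOP in signals:
                break
        return epochs_run


trainer = MiniTrainer([PatienceCallback(patience=2)])
epochs_run = trainer.fit([1.0, 0.9, 0.95, 0.97, 0.5])
```

In this arrangement the callback has no reference to the loop at all, which is the encapsulation concern raised above; the cost is that every hook that may stop training needs a return value the trainer understands.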

carmocca · 2022-09-28T22:41:55Z

Note: this feature would have avoided the need to add a TunerExitException in #11089 to stop the loops, and it would additionally have been part of the public loops API.
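To illustrate the contrast drawn in this note, the sketch below compares stopping a loop by raising an exception with stopping it through a `stop()` method. `ExitLoop` and `FlagLoop` are hypothetical names invented here; `ExitLoop` only plays the role an exception such as TunerExitException plays and does not reproduce its actual implementation.

```python
class ExitLoop(Exception):
    """Hypothetical exception used only to unwind out of a running loop."""


def run_with_exception(steps: int, stop_at: int) -> int:
    """Exception-based stopping: control flow leaves the loop via raise/except."""
    done = 0
    try:
        for step in range(steps):
            if step == stop_at:
                raise ExitLoop
            done += 1
    except ExitLoop:
        # Whoever runs the loop has to know about and catch the exception.
        pass
    return done


class FlagLoop:
    """Flag-based stopping: the loop exposes stop() and checks its own flag."""

    def __init__(self) -> None:
        self.should_stop = False

    def stop(self) -> None:
        self.should_stop = True

    def run(self, steps: int, stop_at: int) -> int:
        done = 0
        for step in range(steps):
            if step == stop_at:
                self.stop()
            if self.should_stop:
                break
            done += 1
        return done


completed_with_exception = run_with_exception(10, 3)
completed_with_flag = FlagLoop().run(10, 3)
```

Both variants complete the same number of steps here; the difference is that the flag-based loop lets callers request a stop through a method rather than through exception handling.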

I'm not continuing this work for now as the future of the loops is unclear.

carmocca added the feature and refactor labels Jul 28, 2021

carmocca added this to the v1.5 milestone Jul 28, 2021

carmocca self-assigned this Jul 28, 2021

tchaton reviewed Jul 29, 2021


awaelchli reviewed Aug 27, 2021


Add Loop.stop()

1f0c08a

carmocca force-pushed the feat/loop-stop branch from 9548f73 to 1f0c08a Compare August 27, 2021 15:31

ananthsub reviewed Aug 27, 2021


carmocca added 2 commits August 27, 2021 17:53

Add stop to trainer directly

98fdae0

Un-deprecate should_stop getter

88628f5

awaelchli modified the milestones: v1.5, v1.6 Nov 1, 2021

carmocca mentioned this pull request Nov 26, 2021

Fix current_epoch value on training end #8578

Merged

12 tasks

carmocca mentioned this pull request Feb 14, 2022

Resolve FitLoop.done TODO #11850

Closed

8 tasks

carmocca removed this from the 1.6 milestone Mar 28, 2022

carmocca closed this Dec 14, 2022

carmocca deleted the feat/loop-stop branch December 14, 2022 11:37
