-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Deprecate terminate_on_nan Trainer argument in favor of detect_anomaly
#9175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate terminate_on_nan Trainer argument in favor of detect_anomaly
#9175
Conversation
pytorch_lightning/trainer/connectors/training_trick_connector.py
Outdated
Show resolved
Hide resolved
pytorch_lightning/trainer/connectors/training_trick_connector.py
Outdated
Show resolved
Hide resolved
for more information, see https://pre-commit.ci
…nopixx/pytorch-lightning into feature/8313_detect_anomaly
carmocca
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 questions regarding the difference between terminate_on_nan and detect_anomaly:
- Does anomaly detection raise an error on +- inf?
terminate_on_nandoes:
https://github.com/PyTorchLightning/pytorch-lightning/blob/46b00a7de44d226a3da22b258c74cc2b3cd92bec/pytorch_lightning/loops/utilities.py#L35 - Will we keep the loss value check when
terminate_on_nanis removed?
https://github.com/PyTorchLightning/pytorch-lightning/blob/46b00a7de44d226a3da22b258c74cc2b3cd92bec/pytorch_lightning/loops/batch/training_batch_loop.py#L279-L281
Co-authored-by: Carlos Mocholí <[email protected]>
Anomaly detection doesn't raise an error for infs. Also, since |
In that case, I don't think we should deprecate |
|
Per #9799 we only want to turn on detect_anomaly after the scaler value converges to a stable value:
|
terminate_on_nan Trainer argument in favor of detect_anomaly
for more information, see https://pre-commit.ci
… into feature/8313_detect_anomaly
|
@yopknopixx After #9848 got merged, I am repurposing your PR here for the deprecation of the old |
…maly` (Lightning-AI#9175) Co-authored-by: Carlos Mocholí <[email protected]> Co-authored-by: Adrian Wälchli <[email protected]>
What does this PR do?
Follow up to #8313
Fixes #9849
Closes #9849
Does your PR introduce any breaking changes? If yes, please list them.
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:
Did you have fun?
Make sure you had fun coding 🙃