Avoid changing current cudnn benchmark #12020

timesler · 2022-02-20T09:14:00Z

What does this PR do?

This is a small change that prevents PL from silently overriding torch.backends.cudnn.benchmark when constructing a Trainer object.

Fixes #12018

With this change, torch.backends.cudnn.benchmark is changed by Trainer only when the benchmark argument is explicitly set:

| Instantiation | Behaviour |
| --- | --- |
| `Trainer()` | `torch.backends.cudnn.benchmark` is unchanged from the current session value |
| `Trainer(benchmark=None)` | `torch.backends.cudnn.benchmark` is unchanged from the current session value |
| `Trainer(benchmark=True)` | `torch.backends.cudnn.benchmark` is set to `True` |
| `Trainer(benchmark=False)` | `torch.backends.cudnn.benchmark` is set to `False` |
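
The behaviour described above can be sketched as a small pure function. This is only an illustration, not the code in accelerator_connector.py: the name `resolve_cudnn_benchmark` and its `current` parameter are hypothetical, and the sketch ignores the deterministic argument entirely.

```python
from typing import Optional


def resolve_cudnn_benchmark(benchmark: Optional[bool], current: bool) -> bool:
    """Return the value cudnn.benchmark should hold after Trainer construction.

    `benchmark` stands in for the Trainer argument and `current` for the
    session value of torch.backends.cudnn.benchmark (both hypothetical names).
    """
    if benchmark is None:
        # Argument not set explicitly: leave the session value untouched.
        return current
    # Argument set explicitly: it takes precedence over the session value.
    return benchmark


# Print the outcome for every combination of argument and session value.
for arg in (None, True, False):
    for session_value in (True, False):
        print(arg, session_value, "->", resolve_cudnn_benchmark(arg, session_value))
```

Using `None` as the default lets "not specified" be told apart from an explicit `False`, which a plain boolean default cannot do.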

Does your PR introduce any breaking changes? If yes, please list them.

Before submitting

Was this discussed/approved via a GitHub issue? (not for typos and docs)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing make sure you have read Review guidelines. In short, see the following bullet-list:

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

timesler · 2022-02-20T09:23:06Z

This should likely wait until #11944 is merged, then be updated to ensure the same silent updating of cudnn.benchmark doesn't happen with the deterministic arg as well.


carmocca · 2022-02-28T11:48:22Z

tests/trainer/test_trainer.py

-        (None, False, True),
+        (None, False, None),
        (None, True, False),
+        (None, None, None),


So now, somebody who creates Trainer() with no arguments won't get cudnn.benchmark=True, as they do on master?

I think having benchmark=True by default is best.

The proposal in this PR only works if we were able to know if the user had set benchmark manually before.

The justification for this change is that PyTorch Lightning shouldn't break standard PyTorch functionality, but extend it. In general, PL offers value by establishing sensible default values so that users don't usually need to think about them. However, in this case, we are silently setting a global variable, resulting in this behaviour:

import torch
from pytorch_lightning import Trainer

torch.backends.cudnn.benchmark = False
trainer = Trainer(gpus=1)
print(torch.backends.cudnn.benchmark)

Output:

True # When it should be False
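
For readers hitting this override, one possible workaround is to save the flag and restore it after the constructor runs. The sketch below is an assumption-laden illustration, not something proposed in this PR: `preserve_attr` is a hypothetical helper, and a `SimpleNamespace` stands in for torch.backends.cudnn so the example runs without torch installed.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def preserve_attr(obj, name):
    """Restore obj.<name> to the value it had on entry when the block exits."""
    saved = getattr(obj, name)
    try:
        yield
    finally:
        setattr(obj, name, saved)


# Stand-in for torch.backends.cudnn (hypothetical, for illustration only).
cudnn = SimpleNamespace(benchmark=False)

with preserve_attr(cudnn, "benchmark"):
    # Simulates a constructor that silently overrides the flag.
    cudnn.benchmark = True

# The session value set before the block is back in place.
print(cudnn.benchmark)
```

In a real session the idea would be to construct the Trainer inside the `with` block, on the assumption that the flag is only written during construction; that assumption has not been verified here.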

I understand that, but since there's no way for us to know whether the user previously set the value, we have two mutually exclusive options:

1. Have the better default for most people, which may override an existing value (current master)

2. Always respect the existing value, but users have to remember to set this flag (this PR)

I personally prefer 1 as it establishes a "sensible default". We could ask the torch folks to add an optional default so that the options are no longer exclusive in the future.

ccing reviewers @awaelchli @ananthsub @krshrimali to see if they think differently.

> I understand that, but since there's no way for us to know whether the user previously set the value,

We can know, by making the default value for the argument None. A default Trainer() would result in taking the value from torch.backends.cudnn.benchmark which by default is True

> torch.backends.cudnn.benchmark which by default is True

Is it? locally:

$ python -c 'import torch; print(torch.backends.cudnn.benchmark)'
False

stale · 2022-04-16T01:10:00Z

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need further help see our docs: https://pytorch-lightning.readthedocs.io/en/latest/generated/CONTRIBUTING.html#pull-request or ask the assistance of a core contributor here or on Slack. Thank you for your contributions.

stale · 2022-04-24T04:45:23Z

This pull request is going to be closed. Please feel free to reopen it or create a new one from the current master.

timesler added 5 commits January 30, 2022 19:44

Leave current value for cudnn.benchmark unchanged by default

3e6d927

Don't change torch.backends.cudnn.benchmark by default

040fec0

Update docs for benckmark arg to Trainer

4910444

Clarify explanation of benchmark arg

4f4e586

Merge branch 'master' into avoid-changing-current-cudnn-benchmark

3b21017

timesler requested review from Borda, SeanNaren, awaelchli, carmocca, edenlightning, tchaton and williamFalcon as code owners February 20, 2022 09:14

awaelchli added the reproducibility and trainer: argument labels Feb 20, 2022

krshrimali reviewed Feb 21, 2022

docs/source/common/trainer.rst (outdated)

mergify bot added the has conflicts label Feb 21, 2022

awaelchli approved these changes Feb 26, 2022

docs/source/common/trainer.rst (outdated)

docs/source/common/trainer.rst (outdated)

docs/source/common/trainer.rst (outdated)

awaelchli added this to the 1.6 milestone Feb 26, 2022

awaelchli reviewed Feb 26, 2022

pytorch_lightning/trainer/connectors/accelerator_connector.py (outdated)

timesler and others added 3 commits February 27, 2022 14:52

Update docs/source/common/trainer.rst

dbe73af

Co-authored-by: Adrian Wälchli <[email protected]>

Clarify default benchmark behaviour in docs

3039227

Co-authored-by: Adrian Wälchli <[email protected]>

Merge branch 'master' into bugfix/avoid-changing-current-cudnn-benchmark

ebb744b

mergify bot removed the has conflicts label Feb 27, 2022

timesler added 5 commits February 27, 2022 15:53

Fix reference to non-existent benchmark attribute

1d4d6c4

Remove erroneous deterministic comment

75f98c3

Update deterministic docstring entry.

2413c6b

Clarify benchmark docstring.

caaea75

Add benchmark fix to CHANGELOG.

c0db741

timesler requested a review from justusschock as a code owner February 27, 2022 05:02

timesler requested review from kaushikb11 and rohitgr7 as code owners February 27, 2022 05:02

pre-commit-ci bot and others added 3 commits February 27, 2022 05:03

[pre-commit.ci] auto fixes from pre-commit.com hooks

55f96ec

for more information, see https://pre-commit.ci

Fix ref to nonexistent attribute

cf21c54

Update benchmark arg test

5426d1a

carmocca reviewed Feb 28, 2022

mergify bot added the has conflicts label Feb 28, 2022

Update docs/source/common/trainer.rst

22fdded

Co-authored-by: Carlos Mocholí <[email protected]>

ananthsub approved these changes Mar 2, 2022

Merge branch 'master' into bugfix/avoid-changing-current-cudnn-benchmark

ceaed22

mergify bot added the ready label and removed the has conflicts label Mar 2, 2022

mergify bot added the has conflicts label and removed the ready label Mar 27, 2022

carmocca removed this from the 1.6 milestone Mar 28, 2022

stale bot added the won't fix label Apr 16, 2022

stale bot closed this Apr 24, 2022

This was referenced May 25, 2022

v1.6 is slower than v1.5 #12713

Closed

Avoid changing the current cudnn.benchmark value #13154

Merged
