Contributor

@carmocca carmocca commented Feb 18, 2022

What does this PR do?

A CLI test that exercises different strategies (ddp spawn, ddp) was accidentally and silently broken in this PR:
https://github.com/PyTorchLightning/pytorch-lightning/pull/9931/files#diff-99e49066f136ed08cfc2c4ed81ff44eb71414ab3e600a62d278e4bd613a67997R584-R585
because of a bug in the AcceleratorConnector.

Because the test was not running as intended, #10896 broke the SaveConfigCallback without anyone noticing: setup is now called AFTER the processes are spawned.

Then, #11448 fixed the AcceleratorConnector bug, which uncovered the failing CLI test.

This PR fixes the test and uses an internal hook (aka a hack) to support DDP spawn again.
Note that the bug is unreleased.

A longer-term fix will be to make ArgumentParser pickleable or to have the parser expose a static save function. @mauvilsa
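
To illustrate why picklability matters here, below is a minimal sketch of the constraint. It is not Lightning's or jsonargparse's actual code: `FakeParser` and `is_picklable` are hypothetical names, and the lambda attribute is only an assumed stand-in for whatever makes the real parser unpicklable. The point is that an object which cannot be pickled cannot be handed to spawned processes, whereas a config rendered to a plain string beforehand can.

```python
import pickle


class FakeParser:
    """Hypothetical stand-in for a parser that cannot be pickled."""

    def __init__(self):
        # A lambda attribute makes instances unpicklable. This is only an
        # assumed failure mode, chosen to keep the sketch self-contained.
        self.formatter = lambda config: "\n".join(
            f"{key}: {value}" for key, value in sorted(config.items())
        )

    def dump(self, config):
        # Render the config to plain text.
        return self.formatter(config)


def is_picklable(obj):
    """Return True if `obj` survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


parser = FakeParser()
config = {"trainer.strategy": "ddp_spawn", "trainer.devices": 2}

# Render the config in the main process, before any workers exist.
rendered = parser.dump(config)

parser_ok = is_picklable(parser)      # the parser itself cannot cross the boundary
rendered_ok = is_picklable(rendered)  # a plain string can
```

A static save function on the parser would presumably play the role of `dump` here: it would let the callback carry only plain data into the spawned processes.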

Does your PR introduce any breaking changes? If yes, please list them.

None.

Before submitting

  • [n/a] Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • [n/a] Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the Review guidelines. In short, check the following:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

cc @carmocca @mauvilsa @justusschock @kaushikb11 @awaelchli @akihironitta @rohitgr7

@carmocca carmocca added this to the 1.6 milestone Feb 18, 2022
@carmocca carmocca self-assigned this Feb 18, 2022
@carmocca carmocca closed this Feb 21, 2022
@carmocca carmocca deleted the bugfix/cli-ddp-spawn branch February 21, 2022 13:07
@awaelchli awaelchli added strategy: ddp DistributedDataParallel and removed strategy: ddp spawn labels Nov 4, 2023

Labels

lightningcli pl.cli.LightningCLI strategy: ddp DistributedDataParallel


2 participants