
Conversation

@drdarshan (Contributor) commented:

Resolves PyTorch issue #35160 (pytorch/pytorch#35160). The example shows how to structure a DDP application so it can be started via the distributed launcher script in several configurations: one process per GPU, one process per node, or anything in between.
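
For reference, a minimal sketch of the kind of entry point described above, assuming launch via `python -m torch.distributed.launch --nproc_per_node=<N> example.py` with one process per GPU; the model, tensor shapes, and argument names here are illustrative and not the PR's actual code:

```python
import argparse

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    parser = argparse.ArgumentParser()
    # torch.distributed.launch passes --local_rank to each spawned process.
    parser.add_argument("--local_rank", type=int, default=0)
    args = parser.parse_args()

    # The launcher sets MASTER_ADDR, MASTER_PORT, RANK, and WORLD_SIZE in the
    # environment, so init_process_group only needs the backend.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(args.local_rank)

    model = nn.Linear(10, 10).to(args.local_rank)
    # With one process per GPU, device_ids pins this replica to its GPU.
    ddp_model = DDP(model, device_ids=[args.local_rank])

    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
    inputs = torch.randn(20, 10).to(args.local_rank)
    loss = ddp_model(inputs).sum()
    loss.backward()  # gradients are all-reduced across processes here
    optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

In the one-process-per-node configuration, `device_ids` would be omitted and the module would itself span the node's GPUs.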

@mrshenli (Contributor) left a comment:

This is awesome!! Thanks for putting this together in such a short period of time!! I left some minor comments inline.

Fix typos and make a few grammatical improvements.
DDP now broadcasts the initial model from rank 0, so it is no longer necessary to randomly initialize it on all ranks with the same random seed (see the sketch below).
Per a suggestion on the PR, uploaded and replaced the SVG image with a GitHub permalink.
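
A minimal sketch (not the PR's code) of the behavior that commit relies on: DistributedDataParallel broadcasts the module's state from rank 0 to all other processes at construction time, so seeding every rank identically is unnecessary. It assumes an already initialized process group on a single node with one process per GPU, so the global rank doubles as the CUDA device index.

```python
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

rank = dist.get_rank()
torch.manual_seed(rank)           # deliberately different init on every rank
model = nn.Linear(4, 4).to(rank)
ddp_model = DDP(model, device_ids=[rank])
# After construction, every rank holds rank 0's weights.

# Optional check: broadcast rank 0's weight and compare with the local copy.
with torch.no_grad():
    param = ddp_model.module.weight.clone()
    dist.broadcast(param, src=0)
    assert torch.equal(param, ddp_model.module.weight)
```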
@drdarshan requested a review from @mrshenli on April 14, 2020 at 16:52.
@jlin27 merged commit 8dcb9c7 into pytorch:master on May 20, 2020.
YinZhengxun pushed a commit to YinZhengxun/mt-exercise-02 that referenced this pull request on Mar 30, 2025:
Add example for distributed launcher