@panchul panchul commented May 21, 2020

This PR fixes a NameError in distributed/rpc/pipeline/main.py. Here is what the error looks like:

distributed/rpc/pipeline$ python ./main.py
Traceback (most recent call last):
  File "./main.py", line 286, in <module>
    mp.spawn(run_worker, args=(world_size, num_split), nprocs=world_size, join=True)
  File "/anaconda/envs/mypytorchenv_gpu/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 200, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/anaconda/envs/mypytorchenv_gpu/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 158, in start_processes
    while not context.join():
  File "/anaconda/envs/mypytorchenv_gpu/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 119, in join
    raise Exception(msg)
Exception:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/anaconda/envs/mypytorchenv_gpu/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 20, in _wrap
    fn(i, *args)
  File "src/pytorch-examples/distributed/rpc/pipeline/main.py", line 268, in run_worker
    run_master(num_split)
  File "src/pytorch-examples/distributed/rpc/pipeline/main.py", line 231, in run_master
    model = DistResNet50(split_size, ["worker1", "worker2"])
NameError: name 'split_size' is not defined
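
The last frames of the traceback suggest that `run_master` receives its argument as `num_split` but its body refers to `split_size`. The diff itself is not shown in this conversation, so the following is only a minimal sketch of that kind of mismatch and one plausible fix. `build_model` is a hypothetical stand-in for `DistResNet50`, and the function names `run_master_buggy` and `run_master_fixed` are illustrative, not taken from the example's source.

```python
def build_model(split_size, workers):
    # Hypothetical stand-in for DistResNet50: it only records its arguments.
    return {"split_size": split_size, "workers": workers}


def run_master_buggy(num_split):
    # The parameter is called `num_split`, but the body uses `split_size`,
    # which is not defined in this scope, so Python raises NameError.
    return build_model(split_size, ["worker1", "worker2"])


def run_master_fixed(num_split):
    # One possible fix: use the name that the function actually receives.
    return build_model(num_split, ["worker1", "worker2"])


try:
    run_master_buggy(2)
    buggy_error = None
except NameError as exc:
    buggy_error = str(exc)

fixed_model = run_master_fixed(2)
print(buggy_error)
print(fixed_model)
```

Renaming the parameter to `split_size` instead would work equally well; which of the two the merged commit chose is not visible here.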

@soumith soumith merged commit e9b2f8e into pytorch:master May 21, 2020

soumith commented May 21, 2020

thank you!

YinZhengxun pushed a commit to YinZhengxun/mt-exercise-02 that referenced this pull request Mar 30, 2025
Co-authored-by: Aleksandr Panchul (CSI Interfusion Inc) <[email protected]>