Commit a0da8d4

update
1 parent 23695e8 commit a0da8d4

3 files changed: +3 -309 lines changed

distributed/rpc/parameter_server/README.md

Lines changed: 2 additions & 0 deletions
@@ -5,4 +5,6 @@ This is a basic example of RPC-based training that uses several trainers remotel
 To run the example locally, run the following command for the server and each worker you wish to spawn, in separate terminal windows:
 `python rpc_parameter_server.py [world_size] [rank] [num_gpus]`. For example, for a master node with world size of 2, the command would be `python rpc_parameter_server.py 2 0 0`. The trainer can then be launched with the command `python rpc_parameter_server.py 2 1 0` in a separate window, and this will begin training with one server and a single trainer.

+Note that for demonstration purposes, this example supports only 0-2 GPUs, although the general pattern can be extended to make use of all GPUs available on a node.
+
 You can pass in the command line arguments `--master_addr=<address>` and `--master_port=<port>` to indicate the address:port that the master worker is listening on. All workers will contact the master for rendezvous during worker discovery. By default, `master_addr` will be `localhost` and `master_port` will be 29500.
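
As a rough illustration of the rendezvous options described in the README text above, the following sketch parses `--master_addr` and `--master_port` with the stated defaults and exports them as the `MASTER_ADDR` and `MASTER_PORT` environment variables. The helper names `build_parser` and `export_rendezvous_env` are hypothetical and not part of the example script. The port is kept as a string because environment variable values must be strings.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser covering only the two rendezvous options."""
    parser = argparse.ArgumentParser(description="Rendezvous options (sketch)")
    parser.add_argument("--master_addr", type=str, default="localhost",
                        help="Address the master worker listens on.")
    parser.add_argument("--master_port", type=str, default="29500",
                        help="Port the master worker listens on.")
    return parser


def export_rendezvous_env(args):
    """Export the address and port so that workers can locate the master."""
    os.environ["MASTER_ADDR"] = args.master_addr
    os.environ["MASTER_PORT"] = args.master_port
```

Calling `build_parser().parse_args([])` yields the defaults, and passing `["--master_addr=10.0.0.1", "--master_port=12345"]` overrides them before `export_rendezvous_env` is called.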

distributed/rpc/parameter_server/rpc_param_server.py

Lines changed: 0 additions & 309 deletions
This file was deleted.

distributed/rpc/parameter_server/rpc_parameter_server.py

Lines changed: 1 addition & 0 deletions
@@ -262,6 +262,7 @@ def run_worker(rank, world_size, num_gpus, train_loader, test_loader):

 args = parser.parse_args()
 assert args.rank is not None, "must provide rank argument."
+assert 0 <= args.num_gpus <= 2, f"Only 0-2 GPUs currently supported (got {args.num_gpus})."
 os.environ['MASTER_ADDR'] = args.master_addr
 os.environ["MASTER_PORT"] = args.master_port
 processes = []
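
The GPU-count check in this hunk could also be written as an explicit validation helper. The sketch below is one possible form, not the script's actual code: the name `validate_num_gpus` and the `max_supported` parameter are assumptions made for illustration. It raises `ValueError` instead of using `assert`, since assertions are skipped when Python runs with the `-O` flag.

```python
def validate_num_gpus(num_gpus, max_supported=2):
    """Return num_gpus if it lies in 0..max_supported (inclusive).

    Hypothetical helper; the 0-2 range mirrors the limit stated in the
    error message of the example script.
    """
    if not 0 <= num_gpus <= max_supported:
        raise ValueError(
            f"Only 0-{max_supported} GPUs currently supported (got {num_gpus})."
        )
    return num_gpus
```

With this form, a value of 3 is rejected, which matches the "0-2 GPUs" wording of the message.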
