@@ -35,7 +35,7 @@ huggingface-cli login
## Enabling Distributed torchchat Inference

To enable distributed inference, use the option `--distributed`. In addition, `--tp <num>` and `--pp <num>`
-allow users to specify the types of parallelism to use ( where tp refers to tensor parallelism and pp to pipeline parallelism) .
+allow users to specify the types of parallelism to use, where tp refers to tensor parallelism and pp to pipeline parallelism.

## Generate Output with Distributed torchchat Inference
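Combining the two flags multiplies the parallel degrees: `--tp 2 --pp 2` shards each layer across 2 tensor-parallel ranks inside each of 2 pipeline stages, for 4 GPUs in total. Below is a minimal sketch of a one-shot generation run with these flags; the `generate` subcommand and `--prompt` flag mirror torchchat's standard CLI, so adjust them to your setup.

```bash
# One-shot distributed generation on 4 GPUs:
# 2-way tensor parallelism x 2 pipeline stages.
python3 torchchat.py generate llama3.1 --prompt "Write a haiku about GPUs" --distributed --tp 2 --pp 2
```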
@@ -52,7 +52,7 @@ This mode allows you to chat with an LLM in an interactive fashion with distributed

[skip default]: begin
```bash
-python3 torchchat.py chat llama3.1 --max-new-tokens 10 --distributed --tp 2 --pp 2
+python3 torchchat.py chat llama3.1 --max-new-tokens 10 --distributed --tp 2 --pp 2
```
[skip default]: end

@@ -69,7 +69,7 @@ In one terminal, start the server to run with 4 GPUs:
[skip default]: begin

```bash
-python3 torchchat.py server llama3.1 --distributed --tp 2 --pp 2
+python3 torchchat.py server llama3.1 --distributed --tp 2 --pp 2
```
[skip default]: end

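Once the server is up, you can query it from another terminal. The sketch below assumes torchchat's OpenAI-compatible chat completions endpoint on the default port 5000; the exact path and port are assumptions, so check the server's startup log for the actual address.

```bash
# Query the distributed server with an OpenAI-style chat request
# (endpoint path and port are assumptions; adjust to your server's log).
curl http://127.0.0.1:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 10
  }'
```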