Add img to collective docs #4038
Thanks for your contribution!
sandyhouse
left a comment
LGTM
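Suppose the NxM parameter matrix is split across two devices, device_0 and device_1. The parameter matrix on each device then has (N/2+1) rows and M columns. On device_0, a value in the input x that lies in [0, N/2-1] is left unchanged; any other value is changed to N/2, which the embedding maps to an all-zero vector. Similarly, on device_1, a value V in the input x that lies in [N/2, N-1] is changed to (V-N/2); any other value is changed to N/2, which the embedding maps to an all-zero vector. Finally, an all_reduce_sum operation aggregates the results from all cards.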
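The index remapping described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the framework's implementation: the helper names `split_table`, `local_lookup`, and `all_reduce_sum` are hypothetical, N is assumed to be divisible by the number of devices, and the "devices" are simulated as list entries in one process.

```python
# Simulated split embedding: each shard holds N/num_devices rows plus one zero row.
# All function names are hypothetical and exist only for this sketch.

def split_table(table, num_devices):
    """Split an N x M table by rows and append an all-zero row to each shard."""
    n, m = len(table), len(table[0])
    per = n // num_devices  # assumption: N is divisible by num_devices
    shards = []
    for d in range(num_devices):
        shard = [row[:] for row in table[d * per:(d + 1) * per]]
        shard.append([0.0] * m)  # extra row at local index `per`
        shards.append(shard)
    return shards

def local_lookup(shard, ids, device, per):
    """Remap global ids to local ids; ids owned by another device hit the zero row."""
    lo, hi = device * per, (device + 1) * per
    out = []
    for v in ids:
        local = v - lo if lo <= v < hi else per
        out.append(shard[local][:])
    return out

def all_reduce_sum(partials):
    """Element-wise sum of the per-device outputs."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*partials)]

# N = 4 rows, M = 3 columns, two simulated devices.
table = [[float(10 * i + j) for j in range(3)] for i in range(4)]
shards = split_table(table, 2)
ids = [0, 3, 2, 1]
partials = [local_lookup(shards[d], ids, d, 2) for d in range(2)]
result = all_reduce_sum(partials)
```

Because every id is owned by exactly one shard and all other shards contribute zeros, the summed result equals a lookup in the unsplit table.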
The single-card Embedding case is shown in the figure below.
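- For single-card Embedding, would it be better to present it as a multiplication, in * [0, 0, 1, ... 0] -> out? The same applies to the parallel case. That would make it easier to explain where the 00....0 below comes from.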
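Embedding cannot simply be treated as a multiplication; it is closer to a table lookup. Split embedding amounts to dividing that table into several parts stored in different places. When an input is looked up, only the part of the table that holds that input returns the corresponding feature; the parts that do not hold it output 0.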
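One consistency issue: the doc text describes everything in terms of processes, while the figures describe everything in terms of GPUs. These two concepts should be unified.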
TCChenlong
left a comment
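Standardize on "rank" wherever GPU or rank is used.
The first time "rank" appears in each API doc, annotate it in parentheses: rank (i.e. GPU; "rank" is used from then on).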
:align: center
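Case 3: column-parallel Linear
The parameter of the Linear operation is an NxM matrix with N rows and M columns. In the column-parallel Linear case, the parameter is split across num_partitions devices, and each device holds a matrix of N rows and M/num_partitions columns.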
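The column split described above can be illustrated with a small plain-Python sketch. It is not the framework's API: the helpers `matmul`, `split_columns`, and `concat_columns` are hypothetical, M is assumed to be divisible by num_partitions, and concatenating the per-device outputs along the column axis is an assumption about how the full result would be recovered, which the text above does not state.

```python
# Simulated column-parallel Linear: each "device" holds N rows and
# M/num_partitions columns of the weight. Names are hypothetical.

def matmul(x, w):
    """Multiply a (batch x N) matrix by an (N x K) matrix using nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, num_partitions):
    """Split an N x M weight into num_partitions blocks of M/num_partitions columns."""
    per = len(w[0]) // num_partitions  # assumption: M divisible by num_partitions
    return [[row[p * per:(p + 1) * per] for row in w] for p in range(num_partitions)]

def concat_columns(parts):
    """Join per-device outputs along the column axis (assumed gather step)."""
    return [sum(rows, []) for rows in zip(*parts)]

x = [[1, 2], [3, 4]]              # batch of 2 inputs, N = 2
w = [[1, 2, 3, 4], [5, 6, 7, 8]]  # N = 2 rows, M = 4 columns
w_shards = split_columns(w, 2)    # each shard: 2 rows, 2 columns
y_parts = [matmul(x, shard) for shard in w_shards]
y = concat_columns(y_parts)
```

Each device computes its slice of the output columns independently, so under these assumptions `y` matches `matmul(x, w)` on the unsplit weight.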
The text says N rows and M columns, but the figure uses N everywhere.
Add illustrations to some collective communication operators docs.