Collaborator

@gongweibao gongweibao commented Nov 27, 2018

No description provided.

@shanyi15
Contributor

Hi, please attach a screenshot of the complete documentation page, from top to bottom. Thanks. Chrome has tools for this, such as FireShot.

@shanyi15 shanyi15 added the API Guide docs related to API Guide label Nov 27, 2018
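Sparse update
#############

In paddle, we provide the embedding interface to support sparse updates. Internally it is represented as the lookup_table operator, and it is computed as follows: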
Collaborator

  • Please add an internal link for the embedding interface.
  • Change both occurrences of “他” to “它” (the pronoun used for things) in the Chinese text.

Collaborator Author

Done.

In paddle, we provide the embedding interface to support sparse updates. Internally it is represented as the lookup_table operator, and it is computed as follows:

.. image:: ../../../../images/lookup_table_training.png
   :scale: 50 %
Collaborator

  • Did you draw this figure yourselves, or is it from the web?
  • A figure alone does not explain how the computation works. Please add the necessary text; otherwise the figure is hard to follow.
  • Could you add an example that uses the embedding interface and relate it to the figure?

Collaborator Author

It is our own figure.

Collaborator Author

Done.

Contributor

To make this easier for users to understand, I suggest putting the original figure back.


- input:

input is a paddle Variable whose content is the vector of ids to look up.
Collaborator

Put line 21 directly after line 19 instead of starting a new line. The same applies below.

Collaborator Author

Done.

Collaborator Author

I tried it, and the formatting breaks.

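input is a paddle Variable whose content is the vector of ids to look up.
- size:

size is the shape of the lookup table and must be two-dimensional. Taking NLP applications as an example, the 0th (“第0”) is usually the size of the dictionary, and dimension one (“第一维”) is usually the size of the vector for each word.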
Collaborator

  • “第0维” (dimension 0): the character “维” (dimension) is missing.
  • “第0维” and “第一维”: please use one numbering style consistently.

Collaborator Author

Done.
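
To make the `size` description in this thread concrete, here is a minimal pure-Python sketch of a two-dimensional lookup table and its forward pass. The helpers `make_table` and `lookup` are hypothetical names introduced for illustration; they are not Paddle APIs and say nothing about how the lookup_table operator is implemented.

```python
# Illustrative sketch only: `make_table` and `lookup` are hypothetical helpers,
# not part of Paddle.

def make_table(size):
    """Build a toy lookup table with shape size = [vocab_size, emb_dim]."""
    if len(size) != 2:
        raise ValueError("size must be two-dimensional: [vocab_size, emb_dim]")
    vocab_size, emb_dim = size
    # Row i holds the vector for id i; the values are arbitrary but deterministic.
    return [[float(i * emb_dim + j) for j in range(emb_dim)]
            for i in range(vocab_size)]


def lookup(table, ids):
    """Forward pass: return a copy of one table row per queried id."""
    return [list(table[i]) for i in ids]


table = make_table([5, 3])      # dimension 0: 5 words, dimension 1: 3 values per word
out = lookup(table, [4, 0, 4])  # ids to look up; repeated ids return the same row
```

Each queried id selects one row, so the output has one row per id and `emb_dim` columns.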


- is_sparse:

Whether the gradient is a sparse tensor during the backward pass. If not set, the gradient is a LodTensor. Defaults to False.
Collaborator

Please add an internal link for sparse tensor. @shanyi15 we do not have any documentation that introduces sparse tensor. @gongweibao should we add the basic concept of sparse tensor, and how it differs from LodTensor?

Collaborator Author

I have linked our design doc for now.


- is_distributed:

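Flags whether it is used in a distributed scenario. It generally only needs to be set for large-scale sparse updates (where dimension 0 of the embedding is very large, e.g. several million or more). See the API guide on large-scale sparse updates for details. Defaults to False.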
Collaborator

API guide on large-scale sparse updates: please add an internal link.

Collaborator Author

Longfei has not submitted that doc yet...

Collaborator Author

Done.

Sparse update
#############

In paddle, we provide the :ref:`api_fluid_layers_embedding` interface to support sparse updates. Internally it is represented as the lookup_table operator; its design principles are described in the `DesignDoc <https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/dist_train/distributed_lookup_table_design.md>`_
Contributor

Fluid's fluid.layers.embedding layer supports "sparse updates" in both single-machine and distributed training: the gradient is stored in the SelectedRows structure, which keeps only the rows whose gradient is non-zero.
...

In distributed training, for larger embedding layers, enabling sparse updates helps reduce the volume of communicated data and speeds up training

Collaborator Author

Done.
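
The idea described in this thread (keep only the rows whose gradient is non-zero) can be sketched in plain Python. The `(rows, values)` pair below only mimics the general shape of a SelectedRows-style gradient; the function names, the layout, and the plain SGD step are assumptions made for illustration and are not Paddle's implementation.

```python
# Illustrative sketch only: hypothetical helpers, not Paddle APIs.

def sparse_embedding_grad(ids, out_grad):
    """Accumulate per-id output gradients into a (rows, values) pair.

    rows   : sorted ids that were actually looked up
    values : summed gradient row for each entry of rows
    """
    acc = {}
    for i, g in zip(ids, out_grad):
        if i in acc:
            # The same id may appear several times; its gradients add up.
            acc[i] = [a + b for a, b in zip(acc[i], g)]
        else:
            acc[i] = list(g)
    rows = sorted(acc)
    return rows, [acc[r] for r in rows]


def apply_sgd(table, rows, values, lr):
    """Update only the table rows that appear in the sparse gradient."""
    for r, g in zip(rows, values):
        table[r] = [w - lr * gi for w, gi in zip(table[r], g)]


table = [[1.0, 1.0] for _ in range(6)]  # toy table: 6 ids, 2 values per id
rows, values = sparse_embedding_grad(
    [4, 0, 4], [[1.0, 2.0], [0.5, 0.5], [1.0, 0.0]]
)
apply_sgd(table, rows, values, lr=0.5)
```

Only the rows for ids 0 and 4 are touched; the other four rows never appear in the gradient and stay unchanged.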


- input:

input is a paddle Variable whose content is the vector of ids to look up.
Contributor

paddle => Fluid

Collaborator Author

Done.

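embedding input parameters:
---------------------------

embedding takes the input (input), the shape (size), whether to use sparse updates (is_sparse), whether it is distributed (is_distributed), whether to pad the output (padding_idx), the parameter attributes (param_attr), and the data type (dtype) to decide how to compute.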
Contributor

whether it is distributed (is_distributed) => whether to use a distributed table (is_distributed)

Collaborator Author

Done.


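embedding takes the input (input), the shape (size), whether to use sparse updates (is_sparse), whether it is distributed (is_distributed), whether to pad the output (padding_idx), the parameter attributes (param_attr), and the data type (dtype) to decide how to compute.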

- input:
Contributor

I do not think the parameters need to be explained in detail here. It is enough to explain what configuring is_sparse and is_distributed means in a distributed scenario, and that the layer can use distributed sparse updates without any further changes.

Collaborator Author

Added an example. Thanks!

Collaborator

@@ -0,0 +1,42 @@
.. _api_guide_conv:
Contributor

Collaborator Author

Well... the doc should be fine in either of these two places?

@gongweibao
Collaborator Author

image

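The Fluid :ref:`api_fluid_layers_embedding` layer supports "sparse updates" in both single-machine and distributed training: the gradient is stored in the `SelectedRows <https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/modules/selected_rows.md>`_ structure, which keeps only the rows whose gradient is non-zero.
In distributed training, for larger embedding layers, enabling sparse updates helps reduce the volume of communicated data and speeds up training

embedding input parameters: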
Contributor

The colon after "parameters" can be removed.

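Whether the gradient is a `sparse tensor <https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/modules/selected_rows.md>`_ during the backward pass. If not set, the gradient is a `LodTensor <https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/concepts/lod_tensor.md>`_ . Defaults to False.
- is_distributed:

Flags whether it is used in a distributed scenario. It generally only needs to be set for large-scale sparse updates (where dimension 0 of the embedding is very large, e.g. several million or more). For details, see the API guide on large-scale sparse updates, :ref:`api_guide_async_training` . Defaults to False.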
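Contributor

Suggested change
- Flags whether it is used (“是否是用在”) in a distributed scenario. It generally only needs to be set for large-scale sparse updates (where dimension 0 of the embedding is very large, e.g. several million or more). For details, see the API guide on large-scale sparse updates, :ref:`api_guide_async_training` . Defaults to False.
+ Flags whether it is used (“是否用在”, dropping the redundant “是”) in a distributed scenario. It generally only needs to be set for large-scale sparse updates (where dimension 0 of the embedding is very large, e.g. several million or more). For details, see the API guide on large-scale sparse updates, :ref:`api_guide_async_training` . Defaults to False.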

@luotao1 luotao1 mentioned this pull request Nov 30, 2018
Contributor

@shanyi15 shanyi15 left a comment

LGTM

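Sparse update
#############

The Fluid :ref:`api_fluid_layers_embedding` layer supports "sparse updates" in both single-machine and distributed training: the gradient is stored in the `SelectedRows <https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/modules/selected_rows.md>`_ structure, which keeps only the rows whose gradient is non-zero.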
Collaborator

  • Is it appropriate to link SelectRows to the design doc here? That design doc has not been officially released.
  • Can the original figure come back? Adding a few sentences of explanation would be enough, and it would be better than linking the design doc.

Collaborator Author

Done. Thanks.

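size is the shape of the lookup table and must be two-dimensional. Taking NLP applications as an example, dimension 0 is usually the size of the dictionary, and dimension 1 is usually the size of the vector for each word.
- is_sparse:

Whether the gradient is a `sparse tensor <https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/modules/selected_rows.md>`_ during the backward pass. If not set, the gradient is a `LodTensor <https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/concepts/lod_tensor.md>`_ . Defaults to False.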
Collaborator

Contributor

@shanyi15 shanyi15 Dec 3, 2018

Could this be written as an internal link? For example:
the learning material is a document in markdown format
image

Collaborator

@shanyi15 See the first three lines of the document below: treat the md file as rst and add an internal link label
https://raw.githubusercontent.com/PaddlePaddle/Paddle/develop/doc/v2/howto/cmd_parameter/detail_introduction_en.md

..  _cmd_detail_introduction:

Collaborator Author

We can use an external link for now.


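embedding takes the input (input), the shape (size), whether to use sparse updates (is_sparse), whether it is distributed (is_distributed), whether to pad the output (padding_idx), the parameter attributes (param_attr), and the data type (dtype) to decide how to compute.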

- input:
Collaborator

@gongweibao
Collaborator Author

image

Collaborator

@luotao1 luotao1 left a comment

Please add a few sentences explaining the figure.

#####

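The Fluid :ref:`api_fluid_layers_embedding` layer supports "sparse updates" in both single-machine and distributed training: the gradient is stored as a sparse tensor, which keeps only the rows whose gradient is non-zero.
In distributed training, for larger embedding layers, enabling sparse updates helps reduce the volume of communicated data and speeds up training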
Collaborator

Line 8 is missing a period.

Collaborator Author

Done.
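
The claim in the reviewed text that sparse updates reduce communication volume can be illustrated with back-of-the-envelope arithmetic. Every number below (vocabulary size, embedding width, number of touched rows, 4 bytes per value, 8 bytes per row index) is a made-up assumption, and the sketch ignores protocol overhead; it is not a measurement of Paddle.

```python
# Illustrative arithmetic only: all sizes here are assumed, not measured.

def dense_grad_bytes(vocab_size, emb_dim, bytes_per_value=4):
    """Bytes needed to send a gradient for every row of the table."""
    return vocab_size * emb_dim * bytes_per_value


def sparse_grad_bytes(num_touched_rows, emb_dim, bytes_per_value=4, bytes_per_index=8):
    """Bytes needed to send only the touched rows plus one index per row."""
    return num_touched_rows * (emb_dim * bytes_per_value + bytes_per_index)


vocab_size, emb_dim = 1_000_000, 64  # assumed large embedding table
touched = 10_000                     # assumed distinct ids in one batch
dense = dense_grad_bytes(vocab_size, emb_dim)
sparse = sparse_grad_bytes(touched, emb_dim)
ratio = dense / sparse
```

Under these assumptions the dense gradient is roughly two orders of magnitude larger than the sparse one; the gap shrinks as a batch touches a larger share of the table.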

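The Fluid :ref:`api_fluid_layers_embedding` layer supports "sparse updates" in both single-machine and distributed training: the gradient is stored as a sparse tensor, which keeps only the rows whose gradient is non-zero.
In distributed training, for larger embedding layers, enabling sparse updates helps reduce the volume of communicated data and speeds up training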

<<<<<<< HEAD
Collaborator

The merge does not look right. Why is there still a HEAD marker?

Collaborator Author

It was out of sync with my local version. I have already done a push -f.

Collaborator Author

Thanks.

Collaborator

@luotao1 luotao1 left a comment

LGTM

@gongweibao gongweibao merged commit a58b214 into PaddlePaddle:develop Dec 3, 2018
@gongweibao gongweibao deleted the sparseupdate branch December 3, 2018 07:00
4 participants