2 files changed: +46 -0

doc/fluid/api/api_guides/low_level/layers (index):

 loss_function.rst
 data_in_out.rst
 control_flow.rst
+sparse_update.rst

sparse_update.rst (new file):

.. _api_guide_sparse_update:

#############
Sparse Update
#############

Fluid's :ref:`api_fluid_layers_embedding` layer supports "sparse update" in both single-machine and distributed training: the gradient is stored as a sparse tensor that keeps only the rows whose gradient is nonzero. For a large embedding layer in distributed training, enabling sparse update reduces the volume of data communicated and speeds up training.
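
To see why this reduces communication, here is a back-of-the-envelope comparison; the sizes below are made-up assumptions for illustration, not numbers from this document:

```python
# Hypothetical sizes, chosen only for illustration.
vocab_size = 100_000     # rows in the embedding table
embed_size = 64          # width of each embedding row
touched_rows = 256       # rows with a nonzero gradient in one batch
bytes_per_float = 4

# Dense gradient: every row of the table is transferred.
dense_bytes = vocab_size * embed_size * bytes_per_float

# Sparse gradient: only the touched rows, plus one int64 row id each.
sparse_bytes = touched_rows * (embed_size * bytes_per_float + 8)

print(dense_bytes, sparse_bytes)
```

With these (assumed) numbers the sparse gradient is a few hundred times smaller than the dense one, which is exactly the saving sparse update is after.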

Inside Paddle, embedding is implemented with lookup_table. The figure below illustrates the forward and backward computation of an embedding:

As the figure shows, only two rows of the Tensor are nonzero. In the forward pass, the ids record the nonzero rows, and only those two rows of data take part in the computation; the backward pass likewise updates only those two rows.

.. image:: ../../../../images/lookup_table_training.png
   :scale: 50 %
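
The two passes can be sketched in plain Python. This is a minimal illustration of the idea, not Paddle's actual lookup_table implementation; the names `forward` and `sparse_backward` are hypothetical:

```python
# A 4 x 2 embedding table, stored as plain lists for the sketch.
table = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]

def forward(ids):
    # Forward pass: gather only the rows selected by ids.
    return [table[i][:] for i in ids]

def sparse_backward(ids, grad_rows, lr=0.1):
    # Backward pass: update only the rows that were gathered;
    # every other row of the table is left untouched.
    for i, g in zip(ids, grad_rows):
        table[i] = [w - lr * gw for w, gw in zip(table[i], g)]

out = forward([1, 3])                                # rows 1 and 3 participate
sparse_backward([1, 3], [[1.0, 1.0], [1.0, 1.0]])    # rows 0 and 2 unchanged
```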

An embedding example
--------------------

Detailed usage of the API is documented under :ref:`api_fluid_layers_embedding`; the following is a simple example:
21+
22+ .. code-block :: python
23+
24+ DICT_SIZE = 10000 * 10
25+ EMBED_SIZE = 64
26+ IS_SPARSE = False
27+ def word_emb (word , dict_size = DICT_SIZE , embed_size = EMBED_SIZE ):
28+ embed = fluid.layers.embedding(
29+ input = word,
30+ size = [dict_size, embed_size],
31+ dtype = ' float32' ,
32+ param_attr = fluid.ParamAttr(
33+ initializer = fluid.initializer.Normal(scale = 1 / math.sqrt(dict_size))),
34+ is_sparse = IS_SPARSE ,
35+ is_distributed = False )
36+ return embed

Among the parameters above:

- :code:`is_sparse`: whether the gradient computed in the backward pass is a sparse tensor. If left unset, the gradient is a `LoDTensor <https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/user_guides/howto/prepare_data/lod_tensor.md>`_. Defaults to False.

- :code:`is_distributed`: whether the layer is used in a distributed setting. It usually only needs to be set for large-scale sparse update, i.e. when the 0th dimension of the embedding is very large (say, millions of rows or more); see the large-scale sparse API guide :ref:`api_guide_async_training` for details. Defaults to False.

- API summary:

  - :ref:`api_fluid_layers_embedding`
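
For intuition about what "the gradient is a sparse tensor" means when :code:`is_sparse` is enabled, here is a toy container in the spirit of Paddle's SelectedRows structure. The class name `SparseGrad` and its methods are made up for this sketch and are not Paddle API:

```python
class SparseGrad:
    """Toy sparse gradient: touched row ids plus the rows themselves."""

    def __init__(self, height, width, ids, rows):
        self.height, self.width = height, width
        self.ids = ids      # indices of rows with a nonzero gradient
        self.rows = rows    # gradient values for those rows only

    def to_dense(self):
        # Densify on demand: untouched rows come back as zeros.
        dense = [[0.0] * self.width for _ in range(self.height)]
        for i, row in zip(self.ids, self.rows):
            dense[i] = list(row)
        return dense

g = SparseGrad(height=5, width=2, ids=[1, 3], rows=[[0.5, 0.5], [1.0, 1.0]])
dense = g.to_dense()    # rows 0, 2 and 4 are all zeros
```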