5 changes: 0 additions & 5 deletions _typos.toml
@@ -43,11 +43,6 @@ instrinsics = "instrinsics"
interchangable = "interchangable"
intializers = "intializers"
intput = "intput"
-lable = "lable"
-learing = "learing"
-legth = "legth"
-lenth = "lenth"
-leran = "leran"
libary = "libary"
mantained = "mantained"
matrics = "matrics"
2 changes: 1 addition & 1 deletion docs/api/paddle/static/accuracy_cn.rst
@@ -10,7 +10,7 @@ accuracy

accuracy layer. See https://en.wikipedia.org/wiki/Precision_and_recall

-Computes the accuracy using the input and the label. If the correct label is among the topk predictions, the count is incremented by 1. Note: the dtype of the output accuracy is determined by the dtype of input; input and lable may have different dtypes.
+Computes the accuracy using the input and the label. If the correct label is among the topk predictions, the count is incremented by 1. Note: the dtype of the output accuracy is determined by the dtype of input; input and label may have different dtypes.
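For context, a minimal usage sketch of this API in static-graph mode (not part of the diff; the toy network, names, and shapes below are illustrative assumptions):

```python
import paddle

paddle.enable_static()

# Hypothetical toy network: a 10-way classifier over dummy 32-dim input.
data = paddle.static.data(name="input", shape=[-1, 32], dtype="float32")
label = paddle.static.data(name="label", shape=[-1, 1], dtype="int64")
fc_out = paddle.static.nn.fc(x=data, size=10)
predict = paddle.nn.functional.softmax(fc_out)

# A sample counts as correct when the true label is in the top-k predictions.
acc = paddle.static.accuracy(input=predict, label=label, k=5)
```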

Parameters
::::::::::::
2 changes: 1 addition & 1 deletion docs/design/memory/memory_optimization.md
@@ -53,7 +53,7 @@ In compilers, the front end of the compiler translates programs into an intermed

Therefore, the compiler needs to analyze the intermediate-representation program to determine which temporary variables are in use at the same time. We say a variable is "live" if it holds a value that may be needed in the future, so this analysis is called liveness analysis.

-We can leran these techniques from compilers. There are mainly two stages to make live variable analysis:
+We can learn these techniques from compilers. There are mainly two stages to make live variable analysis:

- construct a control flow graph
- solve the dataflow equations
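As a rough sketch of those two stages (not from the PaddlePaddle codebase; the CFG and use/def sets are made-up example data), liveness can be computed by iterating the backward dataflow equations live_in[n] = use[n] ∪ (live_out[n] - def[n]) and live_out[n] = ⋃ live_in[s] over the successors s of n, until a fixed point:

```python
# Hypothetical liveness-analysis sketch over a control flow graph given as
# successor lists, with per-node use/def sets.
def liveness(succ, use, defs):
    live_in = {n: set() for n in succ}
    live_out = {n: set() for n in succ}
    changed = True
    while changed:  # iterate the dataflow equations to a fixed point
        changed = False
        for n in succ:
            out_n = set().union(*(live_in[s] for s in succ[n])) if succ[n] else set()
            in_n = use[n] | (out_n - defs[n])
            if (out_n, in_n) != (live_out[n], live_in[n]):
                live_out[n], live_in[n] = out_n, in_n
                changed = True
    return live_in, live_out

# Straight-line program: a = 1; b = a + 1; print(b)
succ = {0: [1], 1: [2], 2: []}
use = {0: set(), 1: {"a"}, 2: {"b"}}
defs = {0: {"a"}, 1: {"b"}, 2: set()}
print(liveness(succ, use, defs)[0])  # {0: set(), 1: {'a'}, 2: {'b'}}
```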
1,404 changes: 702 additions & 702 deletions docs/practices/gan/cyclegan/cyclegan.ipynb


4 changes: 2 additions & 2 deletions docs/practices/nlp/transformer_in_English-to-Spanish.ipynb
@@ -1170,7 +1170,7 @@
"source": [
"### 4.2 Encoder\n",
"Encoder部分主要包含了多头注意力机制、归一化层以及前馈神经网络。输入会依次经过多头注意力模块、归一化层构成的残差模块、前馈神经网络模块、归一化层构成的残差模块。\n",
"* 多头注意力机制(MultiHeadAttention):使用[paddle.nn.MultiHeadAttention](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/nn/MultiHeadAttention_cn.html#multiheadattention)实现多头注意力机制,需要注意其掩码attn_mask需要的shape是[batch_szie,num_heads,sequence_legth,sequence_legth]。\n",
"* 多头注意力机制(MultiHeadAttention):使用[paddle.nn.MultiHeadAttention](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/nn/MultiHeadAttention_cn.html#multiheadattention)实现多头注意力机制,需要注意其掩码attn_mask需要的shape是[batch_szie,num_heads,sequence_length,sequence_length]。\n",
"* 前馈神经网络(Feed Forward):输入经过MultiHeadAttention层后,经过一层feed forward层。模型中的feed forward,采用的是一种position-wise feed-forward的方法,即先对输入加一个全连接网络,之后使用Relu激活,之后再加一个全连接网络。\n",
"* 残差网络:由归一化(LayerNorm)后的结果与之前时刻的输入相加组成。LayerNorm会在每一个样本上计算均值和方差。\n"
]
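To make the expected attn_mask shape concrete, here is a minimal sketch (tensor sizes are illustrative assumptions, not from the notebook):

```python
import paddle

batch_size, num_heads, seq_len, embed_dim = 4, 8, 16, 128
x = paddle.randn([batch_size, seq_len, embed_dim])

# attn_mask must broadcast to [batch_size, num_heads, seq_len, seq_len];
# an all-zeros float mask is added to the attention weights, masking nothing.
attn_mask = paddle.zeros([batch_size, num_heads, seq_len, seq_len])

mha = paddle.nn.MultiHeadAttention(embed_dim=embed_dim, num_heads=num_heads)
out = mha(x, x, x, attn_mask=attn_mask)  # -> [batch_size, seq_len, embed_dim]
```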
@@ -1482,7 +1482,7 @@
" def forward(self, pre, real, trg_mask):\n",
" # 返回的数据类型与pre一致,除了axis维度(未指定则为-1),其他维度也与pre一致\n",
" # logits=pre,[batch_size,sequence_len,word_size],猜测会进行argmax操作,[batch_size,sequence_len,1]\n",
" # 默认的soft_label为False,lable=real,[bacth_size,sequence_len,1]\n",
" # 默认的soft_label为False,label=real,[bacth_size,sequence_len,1]\n",
" cost = paddle.nn.functional.softmax_with_cross_entropy(\n",
" logits=pre, label=real, soft_label=False\n",
" )\n",