5 changes: 0 additions & 5 deletions _typos.toml
@@ -43,11 +43,6 @@ instrinsics = "instrinsics"
interchangable = "interchangable"
intializers = "intializers"
intput = "intput"
-lable = "lable"
-learing = "learing"
-legth = "legth"
-lenth = "lenth"
-leran = "leran"
libary = "libary"
mantained = "mantained"
matrics = "matrics"
2 changes: 1 addition & 1 deletion docs/api/paddle/static/accuracy_cn.rst
@@ -10,7 +10,7 @@ accuracy

accuracy layer. See https://en.wikipedia.org/wiki/Precision_and_recall

-Computes the accuracy using the input and the label. If the correct label is among the topk predictions, the count is incremented by 1. Note: the dtype of the output accuracy is determined by the dtype of input; input and lable may have different dtypes.
+Computes the accuracy using the input and the label. If the correct label is among the topk predictions, the count is incremented by 1. Note: the dtype of the output accuracy is determined by the dtype of input; input and label may have different dtypes.
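For context, a minimal usage sketch of this API in static-graph mode (not part of the diff; the toy network, names, and shapes below are illustrative assumptions):

```python
import paddle

paddle.enable_static()

# Hypothetical toy network: a 10-way classifier over dummy 32-dim input.
data = paddle.static.data(name="input", shape=[-1, 32], dtype="float32")
label = paddle.static.data(name="label", shape=[-1, 1], dtype="int64")
fc_out = paddle.static.nn.fc(x=data, size=10)
predict = paddle.nn.functional.softmax(fc_out)

# A sample counts as correct when the true label is in the top-k predictions.
acc = paddle.static.accuracy(input=predict, label=label, k=5)
```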

Parameters
::::::::::::
2 changes: 1 addition & 1 deletion docs/design/memory/memory_optimization.md
@@ -53,7 +53,7 @@ In compilers, the front end of the compiler translates programs into an intermed

Therefore, the compiler needs to analyze the intermediate-representation program to determine which temporary variables are in use at the same time. We say a variable is "live" if it holds a value that may be needed in the future, so this analysis is called liveness analysis.

-We can leran these techniques from compilers. There are mainly two stages to make live variable analysis:
+We can learn these techniques from compilers. There are mainly two stages to make live variable analysis:

- construct a control flow graph
- solve the dataflow equations
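As a rough sketch of those two stages (not from the PaddlePaddle codebase; the CFG and use/def sets are made-up example data), liveness can be computed by iterating the backward dataflow equations live_in[n] = use[n] ∪ (live_out[n] - def[n]) and live_out[n] = ⋃ live_in[s] over the successors s of n, until a fixed point:

```python
# Hypothetical liveness-analysis sketch over a control flow graph given as
# successor lists, with per-node use/def sets.
def liveness(succ, use, defs):
    live_in = {n: set() for n in succ}
    live_out = {n: set() for n in succ}
    changed = True
    while changed:  # iterate the dataflow equations to a fixed point
        changed = False
        for n in succ:
            out_n = set().union(*(live_in[s] for s in succ[n])) if succ[n] else set()
            in_n = use[n] | (out_n - defs[n])
            if (out_n, in_n) != (live_out[n], live_in[n]):
                live_out[n], live_in[n] = out_n, in_n
                changed = True
    return live_in, live_out

# Straight-line program: a = 1; b = a + 1; print(b)
succ = {0: [1], 1: [2], 2: []}
use = {0: set(), 1: {"a"}, 2: {"b"}}
defs = {0: {"a"}, 1: {"b"}, 2: set()}
print(liveness(succ, use, defs)[0])  # {0: set(), 1: {'a'}, 2: {'b'}}
```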
1,404 changes: 702 additions & 702 deletions docs/practices/gan/cyclegan/cyclegan.ipynb


4 changes: 2 additions & 2 deletions docs/practices/nlp/transformer_in_English-to-Spanish.ipynb
@@ -1170,7 +1170,7 @@
"source": [
"### 4.2 Encoder\n",
"Encoder部分主要包含了多头注意力机制、归一化层以及前馈神经网络。输入会依次经过多头注意力模块、归一化层构成的残差模块、前馈神经网络模块、归一化层构成的残差模块。\n",
"* 多头注意力机制(MultiHeadAttention):使用[paddle.nn.MultiHeadAttention](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/nn/MultiHeadAttention_cn.html#multiheadattention)实现多头注意力机制,需要注意其掩码attn_mask需要的shape是[batch_szie,num_heads,sequence_legth,sequence_legth]。\n",
"* 多头注意力机制(MultiHeadAttention):使用[paddle.nn.MultiHeadAttention](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/nn/MultiHeadAttention_cn.html#multiheadattention)实现多头注意力机制,需要注意其掩码attn_mask需要的shape是[batch_szie,num_heads,sequence_length,sequence_length]。\n",
"* 前馈神经网络(Feed Forward):输入经过MultiHeadAttention层后,经过一层feed forward层。模型中的feed forward,采用的是一种position-wise feed-forward的方法,即先对输入加一个全连接网络,之后使用Relu激活,之后再加一个全连接网络。\n",
"* 残差网络:由归一化(LayerNorm)后的结果与之前时刻的输入相加组成。LayerNorm会在每一个样本上计算均值和方差。\n"
]
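To make the expected attn_mask shape concrete, here is a minimal sketch (tensor sizes are illustrative assumptions, not from the notebook):

```python
import paddle

batch_size, num_heads, seq_len, embed_dim = 4, 8, 16, 128
x = paddle.randn([batch_size, seq_len, embed_dim])

# attn_mask must broadcast to [batch_size, num_heads, seq_len, seq_len];
# an all-zeros float mask is added to the attention weights, masking nothing.
attn_mask = paddle.zeros([batch_size, num_heads, seq_len, seq_len])

mha = paddle.nn.MultiHeadAttention(embed_dim=embed_dim, num_heads=num_heads)
out = mha(x, x, x, attn_mask=attn_mask)  # -> [batch_size, seq_len, embed_dim]
```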
@@ -1482,7 +1482,7 @@
" def forward(self, pre, real, trg_mask):\n",
" # 返回的数据类型与pre一致,除了axis维度(未指定则为-1),其他维度也与pre一致\n",
" # logits=pre,[batch_size,sequence_len,word_size],猜测会进行argmax操作,[batch_size,sequence_len,1]\n",
" # 默认的soft_label为False,lable=real,[bacth_size,sequence_len,1]\n",
" # 默认的soft_label为False,label=real,[bacth_size,sequence_len,1]\n",
" cost = paddle.nn.functional.softmax_with_cross_entropy(\n",
" logits=pre, label=real, soft_label=False\n",
" )\n",