PaddlePaddle · TCChenlong · May 26, 2022 · Apr 30, 2022 · Apr 30, 2022 · Apr 30, 2022
@@ -1,34 +1,29 @@
-.. _cn_api_paddle_vision_models_alexnet:
+.. _cn_api_paddle_vision_models_AlexNet:
 
-alexnet
+AlexNet
 -------------------------------
 
-.. py:function:: paddle.vision.models.alexnet(pretrained=False, **kwargs)
+.. py:class:: paddle.vision.models.AlexNet(num_classes=1000)
 
- AlexNet model, from the paper `"ImageNet Classification with Deep Convolutional Neural Networks" <https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf>`_.
+AlexNet model, from the paper `"ImageNet Classification with Deep Convolutional Neural Networks" <https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf>`_.
 
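 Parameters
 ::::::::::
-  - **pretrained** (bool, optional) - Whether to load weights pretrained on the ImageNet dataset. Default: False.
+  - **num_classes** (int, optional) - Output dimension of the last fully connected layer. Default: 1000.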
 
 Returns
 :::::::::
-alexnet model, an instance of Layer.
+AlexNet model, an instance of Layer.
 
 Code Example
 ::::::::::::
-.. code-block:: python
 
+.. code-block:: python
+
     import paddle
-    from paddle.vision.models import alexnet
-
+    from paddle.vision.models import AlexNet
     # build model
-    model = alexnet()
-
-    # build model and load imagenet pretrained weight
-    # model = alexnet(pretrained=True)
-
+    model = AlexNet()
     x = paddle.rand([1, 3, 224, 224])
     out = model(x)
-
     print(out.shape)
@@ -1,15 +1,16 @@
-.. _cn_api_paddle_vision_models_googlenet:
+.. _cn_api_paddle_vision_models_GoogLeNet:
 
-googlenet
+GoogLeNet
 -------------------------------
 
-.. py:function:: paddle.vision.models.googlenet(pretrained=False, **kwargs)
+.. py:class:: paddle.vision.models.GoogLeNet(num_classes=1000, with_pool=True)
 
- GoogLeNet (Inception v1) model, from the paper `"Going Deeper with Convolutions" <https://arxiv.org/pdf/1409.4842.pdf>`_.
+GoogLeNet (Inception v1) model, from the paper `"Going Deeper with Convolutions" <https://arxiv.org/pdf/1409.4842.pdf>`_.
 
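 Parameters
 ::::::::::
-  - **pretrained** (bool, optional) - Whether to load weights pretrained on the ImageNet dataset. Default: False.
+  - **num_classes** (int, optional) - Output dimension of the last fully connected layer. If the value is less than 0, the last fully connected layer is not defined. Default: 1000.
+  - **with_pool** (bool, optional) - Whether to define the pooling layer before the last fully connected layer. Default: True.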
 
 Returns
 :::::::::
@@ -18,17 +19,14 @@ GoogLeNet model, an instance of Layer.
 Code Example
 ::::::::::::
 .. code-block:: python
-
+    
     import paddle
-    from paddle.vision.models import googlenet
-
+    from paddle.vision.models import GoogLeNet
+    
     # build model
-    model = googlenet()
-
-    # build model and load imagenet pretrained weight
-    # model = googlenet(pretrained=True)
-
+    model = GoogLeNet()
+
     x = paddle.rand([1, 3, 224, 224])
     out, out1, out2 = model(x)
-
+    
     print(out.shape)
@@ -1,40 +1,43 @@
-.. _cn_api_vision_transforms_pad:
+.. _cn_api_vision_transforms_Pad:
 
-pad
+Pad
 -------------------------------
 
-.. py:function:: paddle.vision.transforms.pad(img, padding, fill=0, padding_mode='constant')
+.. py:class:: paddle.vision.transforms.Pad(padding, fill=0, padding_mode='constant', keys=None)
 
 Pads the input image using the specified mode and fill value.
 
 Parameters
 ::::::::::
 
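-    - img (PIL.Image|np.ndarray) - The image to be padded.
     - padding (int|list|tuple) - Padding on each border of the image. If a single int is given, it is used to pad all borders. If a list/tuple of length 2 is given, the values are the left/right and top/bottom padding respectively. If a list/tuple of length 4 is given, the values are the left, top, right and bottom padding, in that order.
-    - fill (int|tuple) - Pixel value used for padding. Only effective when padding_mode is constant. Default: 0. If the value is a tuple of length 3, its elements fill the R, G and B channels respectively.
+    - fill (int|list|tuple) - Pixel value used for padding. Only effective when padding_mode is constant. Default: 0. If the value is a tuple of length 3, its elements fill the R, G and B channels respectively.
     - padding_mode (string) - Padding mode. Supported values: constant, edge, reflect or symmetric. Default: constant. ``constant`` pads with a constant value, given by the fill parameter. ``edge`` pads with the pixel values at the image edge. ``reflect`` pads with a mirror image of the original without repeating the edge values; for example, padding ``[1, 2, 3, 4]`` with 2 values on each end in this mode gives ``[3, 2, 1, 2, 3, 4, 3, 2]``. ``symmetric`` pads with a mirror image of the original, repeating the edge values; for example, padding ``[1, 2, 3, 4]`` with 2 values on each end in this mode gives ``[2, 1, 1, 2, 3, 4, 4, 3]``.
+    - keys (list[str]|tuple[str], optional) - Same as defined in ``BaseTransform``. Default: None.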
+
+Shape
+:::::::::
+
+    - img (PIL.Image|np.ndarray|paddle.Tensor) - Input image data, in 'HWC' format.
+    - output (PIL.Image|np.ndarray|paddle.Tensor) - The padded image data.
 
 Returns
 :::::::::
 
-    ``PIL.Image or numpy.ndarray``, the padded image.
+A callable object that applies ``Pad``.
 
 Code Example
 ::::::::::::
-
-.. code-block:: python
 
+.. code-block:: python
+
     import numpy as np
     from PIL import Image
-    from paddle.vision.transforms import functional as F
-
-    fake_img = (np.random.rand(256, 300, 3) * 255.).astype('uint8')
+    from paddle.vision.transforms import Pad
 
-    fake_img = Image.fromarray(fake_img)
+    transform = Pad(2)
 
-    padded_img = F.pad(fake_img, padding=1)
-    print(padded_img.size)
+    fake_img = Image.fromarray((np.random.rand(224, 224, 3) * 255.).astype(np.uint8))
 
-    padded_img = F.pad(fake_img, padding=(2, 1))
-    print(padded_img.size)
+    fake_img = transform(fake_img)
+    print(fake_img.size)
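
Review note: the ``reflect`` and ``symmetric`` examples in the ``padding_mode`` description above can be checked with a small pure-Python sketch. The helper ``pad_1d`` is hypothetical and is not part of Paddle; it only mirrors the documented behavior on a 1-D list, so treat it as an illustration of the four modes rather than the library's implementation.

```python
def pad_1d(values, pad, mode="constant", fill=0):
    """Pad a 1-D list with `pad` items on both ends (hypothetical helper)."""
    values = list(values)
    if pad == 0:
        return values
    if mode == "constant":
        # Use the fill value on both sides.
        left, right = [fill] * pad, [fill] * pad
    elif mode == "edge":
        # Repeat the first and last items.
        left, right = [values[0]] * pad, [values[-1]] * pad
    elif mode == "reflect":
        # Mirror around the edge items without repeating them.
        if pad > len(values) - 1:
            raise ValueError("reflect padding needs pad < len(values)")
        left = values[1:pad + 1][::-1]
        right = values[-pad - 1:-1][::-1]
    elif mode == "symmetric":
        # Mirror around the edge items, repeating them.
        if pad > len(values):
            raise ValueError("symmetric padding needs pad <= len(values)")
        left = values[:pad][::-1]
        right = values[-pad:][::-1]
    else:
        raise ValueError(f"unsupported padding mode: {mode}")
    return left + values + right


print(pad_1d([1, 2, 3, 4], 2, "reflect"))    # [3, 2, 1, 2, 3, 4, 3, 2]
print(pad_1d([1, 2, 3, 4], 2, "symmetric"))  # [2, 1, 1, 2, 3, 4, 4, 3]
```

Both printed lists match the values quoted in the parameter description.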
@@ -1,17 +1,16 @@
-.. _cn_api_vision_transforms_resize:
+.. _cn_api_vision_transforms_Resize:
 
-resize
+Resize
 -------------------------------
 
-.. py:function:: paddle.vision.transforms.resize(img, size, interpolation='bilinear')
+.. py:class:: paddle.vision.transforms.Resize(size, interpolation='bilinear', keys=None)
 
 Resizes the input data to the specified size.
 
 Parameters
 ::::::::::
 
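-    - img (numpy.ndarray|PIL.Image) - Input data, an image or mask of shape (H, W, C).
-    - size (int|tuple) - Output image size. If size is a sequence such as (h, w), the output size matches it. If size is an int, the shorter edge of the image is matched to this number, i.e. if height > width, the image is rescaled to (size * height / width, size).
+    - size (int|list|tuple) - Output image size. If size is a sequence such as (h, w), the output size matches it. If size is an int, the shorter edge of the image is matched to this number, i.e. if height > width, the image is rescaled to (size * height / width, size).
     - interpolation (int|str, optional) - Interpolation method. Default: 'bilinear'.
         - When ``pil`` is used as the backend, the supported interpolation methods are: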
             + "nearest": Image.NEAREST, 
@@ -26,30 +25,32 @@ resize
             + "area": cv2.INTER_AREA, 
             + "bicubic": cv2.INTER_CUBIC, 
             + "lanczos": cv2.INTER_LANCZOS4.
+
+    - keys (list[str]|tuple[str], optional) - Same as defined in ``BaseTransform``. Default: None.
+
+Shape
+:::::::::
+
+    - img (PIL.Image|np.ndarray|paddle.Tensor) - Input image data, in 'HWC' format.
+    - output (PIL.Image|np.ndarray|paddle.Tensor) - The resized image data.
 
 Returns
 :::::::::
 
-    ``PIL.Image or numpy.ndarray``, the resized image data.
+A callable object that applies ``Resize``.
 
 Code Example
 ::::::::::::
 
 .. code-block:: python
-    
+
     import numpy as np
     from PIL import Image
-    from paddle.vision.transforms import functional as F
-
-    fake_img = (np.random.rand(256, 300, 3) * 255.).astype('uint8')
+    from paddle.vision.transforms import Resize
 
-    fake_img = Image.fromarray(fake_img)
+    transform = Resize(size=224)
 
-    converted_img = F.resize(fake_img, 224)
-    print(converted_img.size)
-    # (262, 224)
+    fake_img = Image.fromarray((np.random.rand(100, 120, 3) * 255.).astype(np.uint8))
 
-    converted_img = F.resize(fake_img, (200, 150))
-    print(converted_img.size)
-    # (150, 200)
-
+    fake_img = transform(fake_img)
+    print(fake_img.size)
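
Review note: the ``size`` rule in the parameter list (an int matches the shorter edge, a sequence is used as-is) can be sketched as a small pure-Python function. ``resized_hw`` is a hypothetical helper, not a Paddle API, and truncating the scaled edge with ``int()`` is an assumption rather than documented behavior.

```python
def resized_hw(height, width, size):
    """Return the (height, width) produced by the documented `size` rule."""
    if isinstance(size, int):
        if width <= height:
            # Width is the shorter edge: it becomes `size`, height scales with it.
            return int(size * height / width), size
        # Height is the shorter edge: it becomes `size`, width scales with it.
        return size, int(size * width / height)
    # A sequence is taken as the exact (h, w) output size.
    out_h, out_w = size
    return out_h, out_w


print(resized_hw(100, 120, 224))         # (224, 268)
print(resized_hw(256, 300, (200, 150)))  # (200, 150)
```

Note that a ``PIL.Image`` reports ``size`` as (width, height), so the first result would print as ``(268, 224)`` in the example above under these assumptions.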
@@ -1,30 +1,25 @@
 .. _cn_api_vision_transforms_normalize:
 
-Normalize
+normalize
 -------------------------------
 
-.. py:class:: paddle.vision.transforms.Normalize(mean=0.0, std=1.0, data_format='CHW', to_rgb=False, keys=None)
+.. py:function:: paddle.vision.transforms.normalize(img, mean, std, data_format='CHW', to_rgb=False)
 
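-Normalizes the input data with mean and standard deviation. Given the means (M1,...,Mn) and standard deviations (S1,..,Sn) of n channels, Normalize normalizes each channel of the input data: output[channel] = (input[channel] - mean[channel]) / std[channel]
+Normalizes the input data with mean and standard deviation.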
 
 Parameters
 ::::::::::
 
+    - img (PIL.Image|np.ndarray|paddle.Tensor) - The data to be normalized.
     - mean (list|tuple) - Mean used to normalize each channel.
     - std (list|tuple) - Standard deviation used to normalize each channel.
     - data_format (str, optional) - Data format, must be 'HWC' or 'CHW'. Default: 'CHW'.
     - to_rgb (bool, optional) - Whether to convert to ``rgb`` format. Default: False.
 
-Shape
-:::::::::
-
-    - img (PIL.Image|np.ndarray|paddle.Tensor) - Input image data, in 'HWC' format.
-    - output (PIL.Image|np.ndarray|paddle.Tensor) - The normalized image data.
-
 Returns
 :::::::::
 
-    A callable object that applies ``Normalize``.
+    ``numpy array or paddle.Tensor``, the normalized image.
 
 Code Example
 ::::::::::::
@@ -33,17 +28,15 @@ Normalize
 
     import numpy as np
     from PIL import Image
-    from paddle.vision.transforms import Normalize
+    from paddle.vision.transforms import functional as F
+
+    fake_img = (np.random.rand(256, 300, 3) * 255.).astype('uint8')
 
-    normalize = Normalize(mean=[127.5, 127.5, 127.5],
-                          std=[127.5, 127.5, 127.5],
-                          data_format='HWC')
+    fake_img = Image.fromarray(fake_img)
 
-    fake_img = Image.fromarray((np.random.rand(300, 320, 3) * 255.).astype(np.uint8))
+    mean = [127.5, 127.5, 127.5]
+    std = [127.5, 127.5, 127.5]
 
-    fake_img = normalize(fake_img)
-    print(fake_img.shape)
-    # (300, 320, 3)
-    print(fake_img.max(), fake_img.min())
+    normalized_img = F.normalize(fake_img, mean, std, data_format='HWC')
+    print(normalized_img.max(), normalized_img.min())
     # 0.99215686 -1.0
-
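
Review note: the per-channel formula output[channel] = (input[channel] - mean[channel]) / std[channel] can be sketched in pure Python for an 'HWC' image stored as nested lists. ``normalize_hwc`` is a hypothetical helper, not Paddle's implementation. It also shows why the example prints a maximum of 0.99215686: ``np.random.rand`` returns values in [0, 1), so after scaling by 255 and truncating to uint8 the largest possible pixel is 254, not 255.

```python
def normalize_hwc(img, mean, std):
    """Normalize an HWC image (nested lists) channel by channel."""
    return [
        [
            [(value - m) / s for value, m, s in zip(pixel, mean, std)]
            for pixel in row
        ]
        for row in img
    ]


mean = [127.5, 127.5, 127.5]
std = [127.5, 127.5, 127.5]

# A 1x2 image holding the smallest and largest values the example can produce.
fake_img = [[[0, 0, 0], [254, 254, 254]]]

out = normalize_hwc(fake_img, mean, std)
print(out[0][0][0], round(out[0][1][0], 8))  # -1.0 0.99215686
```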