add api_guides low_level backward parameter program_en #696
Merged
7 commits
307a66c
add api_guides low_level backward parameter program_en
zy0531 c9f1c1f
Apply suggestions from code review
haowang101779990 4b01c92
Apply suggestions from code review
haowang101779990 cf1febc
Update backward_en.rst
zy0531 c389cab
Update parameter_en.rst
zy0531 e0a5465
Update program_en.rst
zy0531 ac3955e
Update doc/fluid/api_guides/low_level/program_en.rst
zy0531
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| .. _api_guide_backward_en: | ||
|
|
||
|
|
||
| ################ | ||
| Back Propagation | ||
| ################ | ||
|
|
||
| A neural network's ability to fit a model depends on the optimization algorithm. Optimization is the process of repeatedly computing gradients and adjusting learnable parameters. Refer to :ref:`api_guide_optimizer_en` to learn more about the optimization algorithms in Fluid. | ||
|
|
||
| During network training, gradient computation consists of two steps: forward computing and `back propagation <https://en.wikipedia.org/wiki/Backpropagation>`_ . | ||
|
|
||
| Forward computing transfers the state of the input unit to the output unit according to the network structure you build. | ||
|
|
||
| Back propagation computes the derivative of a composition of two or more functions by means of the `chain rule <https://en.wikipedia.org/wiki/Chain_rule>`_ . The gradient at the output unit is propagated back to the input unit, and the learnable parameters of the network are adjusted according to the computed gradients. | ||
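The two steps can be illustrated with a minimal pure-Python sketch. It is not Fluid code: the toy model ``y = w * x + b`` with a squared-error loss, the function names and the learning rate are all assumptions chosen for illustration.

```python
# Toy model (hypothetical, not Fluid): y = w * x + b, loss = (y - t) ** 2

def forward(w, b, x, t):
    """Forward computing: propagate the input through the model to the loss."""
    y = w * x + b
    loss = (y - t) ** 2
    return y, loss

def backward(w, b, x, t):
    """Back propagation: apply the chain rule from the loss back to w and b."""
    y, _ = forward(w, b, x, t)
    dloss_dy = 2.0 * (y - t)   # derivative of the loss w.r.t. the output
    dloss_dw = dloss_dy * x    # chain rule: dy/dw = x
    dloss_db = dloss_dy * 1.0  # chain rule: dy/db = 1
    return dloss_dw, dloss_db

w, b, x, t = 0.5, 0.1, 2.0, 3.0
grad_w, grad_b = backward(w, b, x, t)

# One gradient-descent step adjusts the learnable parameters.
lr = 0.1
new_w, new_b = w - lr * grad_w, b - lr * grad_b
```

After the update step the loss on this sample is lower than before, which is the effect an optimizer relies on at every iteration.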
|
|
||
|
|
||
| You can refer to `back propagation algorithm <http://deeplearning.stanford.edu/wiki/index.php/%E5%8F%8D%E5%90%91%E4%BC%A0%E5%AF%BC%E7%AE%97%E6%B3%95>`_ for the detailed implementation process. | ||
|
|
||
| We do not recommend directly calling backpropagation-related APIs in :code:`fluid` , as these are very low-level APIs. Consider using the relevant APIs in :ref:`api_guide_optimizer_en` instead. When you use optimizer APIs, Fluid automatically calculates the complex back-propagation for you. | ||
|
|
||
| If you want to implement it yourself, you can use the :code:`callbacks` argument of :ref:`api_fluid_backward_append_backward` to define a customized gradient form for an Operator. | ||
| For more information, please refer to :ref:`api_fluid_backward_append_backward` . | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,175 @@ | ||
| .. _api_guide_parameter_en: | ||
|
|
||
| ################## | ||
| Model Parameters | ||
| ################## | ||
|
|
||
| Model parameters are the weights and biases in a model. In Fluid, they are instances of the ``fluid.Parameter`` class, which inherits from ``fluid.Variable``, and they are all persistable variables. Model training is the process of learning and updating model parameters. The attributes related to model parameters can be configured through :ref:`api_fluid_ParamAttr` . The configurable items are as follows: | ||
|
|
||
|
|
||
| - Initialization method | ||
|
|
||
| - Regularization | ||
|
|
||
| - Gradient clipping | ||
|
|
||
| - Model averaging | ||
|
|
||
|
|
||
|
|
||
| Initialization method | ||
| ======================== | ||
|
|
||
| Fluid initializes a single parameter through the :code:`initializer` attribute of :code:`ParamAttr` . | ||
|
|
||
| Example: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
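|     import paddle.fluid as fluid | ||
|     # x is an illustrative input layer; its name and shape are assumptions | ||
|     x = fluid.layers.data(name="x", shape=[13], dtype="float32") | ||
|     param_attrs = fluid.ParamAttr(name="fc_weight", | ||
|                                   initializer=fluid.initializer.ConstantInitializer(1.0)) | ||
|     y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs) | ||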
|
|
||
|
|
||
|
|
||
| Fluid supports the following initialization methods: | ||
|
|
||
| 1. BilinearInitializer | ||
| ----------------------- | ||
|
|
||
| Bilinear initialization. A deconvolution (transposed convolution) operation initialized by this method can be used as a bilinear interpolation operation. | ||
|
|
||
| Alias: Bilinear | ||
|
|
||
| API reference: :ref:`api_fluid_initializer_BilinearInitializer` | ||
|
|
||
| 2. ConstantInitializer | ||
| -------------------------- | ||
|
|
||
| Constant initialization. Initialize the parameter to the specified value. | ||
|
|
||
| Alias: Constant | ||
|
|
||
| API reference: :ref:`api_fluid_initializer_ConstantInitializer` | ||
|
|
||
| 3. MSRAInitializer | ||
| ---------------------- | ||
|
|
||
| MSRA initialization. Please refer to https://arxiv.org/abs/1502.01852 for details. | ||
|
|
||
| Alias: MSRA | ||
|
|
||
| API reference: :ref:`api_fluid_initializer_MSRAInitializer` | ||
|
|
||
| 4. NormalInitializer | ||
| ------------------------- | ||
|
|
||
| Initializes the parameter with random values drawn from a Gaussian (normal) distribution. | ||
|
|
||
| Alias: Normal | ||
|
|
||
| API reference: :ref:`api_fluid_initializer_NormalInitializer` | ||
|
|
||
| 5. TruncatedNormalInitializer | ||
| --------------------------------- | ||
|
|
||
| Initializes the parameter with random values drawn from a truncated Gaussian distribution. | ||
|
|
||
| Alias: TruncatedNormal | ||
|
|
||
| API reference: :ref:`api_fluid_initializer_TruncatedNormalInitializer` | ||
|
|
||
| 6. UniformInitializer | ||
| ------------------------ | ||
|
|
||
| Initializes the parameter with random values drawn from a uniform distribution. | ||
|
|
||
| Alias: Uniform | ||
|
|
||
| API reference: :ref:`api_fluid_initializer_UniformInitializer` | ||
|
|
||
| 7. XavierInitializer | ||
| ------------------------ | ||
|
|
||
| Xavier initialization. Please refer to http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf for details. | ||
|
|
||
| Alias: Xavier | ||
|
|
||
| API reference: :ref:`api_fluid_initializer_XavierInitializer` | ||
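To make the last few initializers concrete, here is a pure-Python sketch of the uniform variants of the Xavier and MSRA schemes, using the bounds commonly derived from the two papers referenced above. The function names and the list-of-lists weight layout are hypothetical, and the sketch does not claim to reproduce Fluid's internal implementation.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng):
    """Sample from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    weights = [[rng.uniform(-limit, limit) for _ in range(fan_out)]
               for _ in range(fan_in)]
    return weights, limit

def msra_uniform(fan_in, fan_out, rng):
    """Sample from U(-limit, limit) with limit = sqrt(6 / fan_in)."""
    limit = math.sqrt(6.0 / fan_in)
    weights = [[rng.uniform(-limit, limit) for _ in range(fan_out)]
               for _ in range(fan_in)]
    return weights, limit

rng = random.Random(0)  # fixed seed so the sketch is reproducible
xavier_w, xavier_limit = xavier_uniform(fan_in=4, fan_out=3, rng=rng)
msra_w, msra_limit = msra_uniform(fan_in=4, fan_out=3, rng=rng)
```

Both schemes scale the sampling range by the layer size; the MSRA bound depends only on the number of inputs, which is the adjustment proposed for rectifier activations.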
|
|
||
| Regularization | ||
| ================= | ||
|
|
||
| Fluid regularizes a single parameter through the :code:`regularizer` attribute of :code:`ParamAttr` . | ||
|
|
||
| .. code-block:: python | ||
|
|
||
|     param_attrs = fluid.ParamAttr(name="fc_weight", | ||
|                                   regularizer=fluid.regularizer.L1DecayRegularizer(0.1)) | ||
|     y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs) | ||
|
|
||
| Fluid supports the following regularization methods: | ||
|
|
||
| - :ref:`api_fluid_regularizer_L1DecayRegularizer` (Alias: L1Decay) | ||
| - :ref:`api_fluid_regularizer_L2DecayRegularizer` (Alias: L2Decay) | ||
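As a rough illustration of what the two regularizers do, the sketch below adds the usual weight-decay term to a gradient: ``coeff * sign(w)`` for L1 decay and ``coeff * w`` for L2 decay. The function names are hypothetical, and the exact scaling used inside Fluid is an assumption here; consult the API references above for the authoritative definition.

```python
def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)

def l1_decay_grad(weights, grads, coeff):
    """Add the gradient of the penalty coeff * |w| to each gradient entry."""
    return [g + coeff * sign(w) for w, g in zip(weights, grads)]

def l2_decay_grad(weights, grads, coeff):
    """Add the gradient of the penalty 0.5 * coeff * w ** 2 to each gradient entry."""
    return [g + coeff * w for w, g in zip(weights, grads)]

weights = [2.0, -3.0, 0.0]
grads = [0.0, 0.0, 0.0]
l1_grads = l1_decay_grad(weights, grads, coeff=0.1)
l2_grads = l2_decay_grad(weights, grads, coeff=0.1)
```

L1 decay pushes every non-zero weight toward zero by a constant amount, while L2 decay pushes each weight in proportion to its magnitude.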
|
|
||
| Clipping | ||
| ========== | ||
|
|
||
| Fluid sets the clipping method for a single parameter through the :code:`gradient_clip` attribute of :code:`ParamAttr` . | ||
|
|
||
| .. code-block:: python | ||
|
|
||
|     # clip the gradient of fc_weight to [-1.0, 1.0]; the bounds are illustrative | ||
|     param_attrs = fluid.ParamAttr(name="fc_weight", | ||
|                                   gradient_clip=fluid.clip.GradientClipByValue(max=1.0, min=-1.0)) | ||
|     y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs) | ||
|
|
||
|
|
||
|
|
||
| Fluid supports the following clipping methods: | ||
|
|
||
| 1. ErrorClipByValue | ||
| ---------------------- | ||
|
|
||
| Clips the values of a tensor to a specified range. | ||
|
|
||
| API reference: :ref:`api_fluid_clip_ErrorClipByValue` | ||
|
|
||
| 2. GradientClipByGlobalNorm | ||
| ------------------------------ | ||
|
|
||
| Limits the global norm of multiple Tensors to :code:`clip_norm` . | ||
|
|
||
| API reference: :ref:`api_fluid_clip_GradientClipByGlobalNorm` | ||
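A pure-Python sketch of global-norm clipping follows. It assumes the commonly used rule in which every tensor is multiplied by ``clip_norm / max(global_norm, clip_norm)``; the function name and the list-based tensors are hypothetical stand-ins, not Fluid objects.

```python
import math

def clip_by_global_norm(tensors, clip_norm):
    """Rescale a group of tensors so that their joint L2-norm is at most clip_norm."""
    # The global norm is the L2-norm over all values of all tensors.
    global_norm = math.sqrt(sum(v * v for tensor in tensors for v in tensor))
    # The scale is 1.0 when the global norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[v * scale for v in tensor] for tensor in tensors]
    return clipped, global_norm

clipped, global_norm = clip_by_global_norm([[3.0], [4.0]], clip_norm=1.0)
```

Because all tensors share one scale factor, the relative direction of the combined gradient is preserved.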
|
|
||
| 3. GradientClipByNorm | ||
| ------------------------ | ||
| Limits the L2-norm of a Tensor to :code:`max_norm` . If the Tensor's L2-norm exceeds :code:`max_norm` , | ||
| a :code:`scale` factor is computed and every value of the Tensor is multiplied by :code:`scale` . | ||
|
|
||
| API reference: :ref:`api_fluid_clip_GradientClipByNorm` | ||
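The rule described above can be sketched in pure Python as follows. The sketch assumes :code:`scale` equals ``max_norm / norm``, which is the usual choice; the function name and the flat-list tensor are hypothetical.

```python
import math

def clip_by_norm(tensor, max_norm):
    """Rescale a tensor so that its L2-norm does not exceed max_norm."""
    norm = math.sqrt(sum(v * v for v in tensor))
    if norm <= max_norm:
        return list(tensor)  # already within the limit, left unchanged
    scale = max_norm / norm
    return [v * scale for v in tensor]

clipped = clip_by_norm([3.0, 4.0], max_norm=2.5)
unchanged = clip_by_norm([0.3, 0.4], max_norm=2.5)
```

Unlike global-norm clipping, each Tensor is handled independently, so different Tensors may end up with different scale factors.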
|
|
||
| 4. GradientClipByValue | ||
| ------------------------- | ||
|
|
||
| Limits the value of the gradient on a parameter to [min, max]. | ||
|
|
||
| API reference: :ref:`api_fluid_clip_GradientClipByValue` | ||
|
|
||
| Model Averaging | ||
| ================ | ||
|
|
||
| Fluid determines whether to average a single parameter through the :code:`do_model_average` attribute of :code:`ParamAttr` . | ||
| Example: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
|     param_attrs = fluid.ParamAttr(name="fc_weight", | ||
|                                   do_model_average=True) | ||
|     y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs) | ||
|
|
||
| In mini-batch training, parameters are updated once after each batch, and model averaging averages the parameters produced by the latest K updates. | ||
|
|
||
| The averaged parameters are only used for testing and prediction, and they do not get involved in the actual training process. | ||
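The idea of averaging the latest K updates can be sketched with a sliding window. This is a simplified illustration with a hypothetical class name; how Fluid accumulates the sums and chooses the window size is not shown here and may differ.

```python
from collections import deque

class SlidingAverage:
    """Keep the parameters from the latest k updates and average them."""

    def __init__(self, k):
        # A deque with maxlen drops the oldest snapshot automatically.
        self.window = deque(maxlen=k)

    def update(self, params):
        """Record the parameters produced by one training update."""
        self.window.append(list(params))

    def average(self):
        """Return the element-wise mean over the stored snapshots."""
        count = len(self.window)
        return [sum(values) / count for values in zip(*self.window)]

averager = SlidingAverage(k=2)
for params in ([1.0, 10.0], [2.0, 20.0], [4.0, 40.0]):
    averager.update(params)
averaged = averager.average()  # mean of the last two snapshots only
```

The training loop keeps using the raw parameters; only evaluation or inference would read the averaged values.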
|
|
||
| API reference: :ref:`api_fluid_optimizer_ModelAverage` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| .. _api_guide_Program_en: | ||
|
|
||
| ############################### | ||
| Program/Block/Operator/Variable | ||
| ############################### | ||
|
|
||
| ================== | ||
| Program | ||
| ================== | ||
|
|
||
| :code:`Fluid` describes a neural network configuration in the form of an abstract syntax tree, similar to that of a programming language, and the user's description of computation is written into a Program. Program in Fluid replaces the concept of a model in traditional frameworks. It can describe arbitrarily complex models through three execution structures: sequential execution, conditional selection and loop execution. Writing a :code:`Program` is very close to writing an ordinary program. If you have programming experience, you can naturally apply it here. | ||
|
|
||
| In brief: | ||
|
|
||
| * A model is a Fluid :code:`Program` and can contain more than one :code:`Program` ; | ||
|
|
||
| * :code:`Program` consists of nested :code:`Block` , and a :code:`Block` is analogous to a pair of braces in C++ or Java, or an indented block in Python. | ||
|
|
||
|
|
||
| * Computation in a :code:`Block` combines three execution structures: sequential execution, conditional selection and loop execution, which together constitute complex computational logic. | ||
|
|
||
|
|
||
| * :code:`Block` contains descriptions of computation and computational objects. The description of a computation is called an Operator; the objects of computation (the inputs and outputs of an Operator) are uniformly represented as Tensors. In Fluid, a Tensor is represented by a 0-level `LoD-Tensor <http://paddlepaddle.org/documentation/docs/zh/1.2/user_guides/howto/prepare_data/lod_tensor.html#permalink-4-lod-tensor>`_ . | ||
|
|
||
|
|
||
| ========= | ||
| Block | ||
| ========= | ||
|
|
||
| :code:`Block` corresponds to the concept of variable scope in high-level programming languages. In a programming language, a block is a pair of braces that contains local variable definitions and a series of instructions or operators. The control flow structures :code:`if-else` and :code:`for` in programming languages have the following counterparts in deep learning: | ||
|
|
||
| +----------------------+-------------------------+ | ||
| | programming languages| Fluid                   | | ||
| +======================+=========================+ | ||
| | for, while loop      | RNN, WhileOp            | | ||
| +----------------------+-------------------------+ | ||
| | if-else, switch      | IfElseOp, SwitchOp      | | ||
| +----------------------+-------------------------+ | ||
| | execute sequentially | a series of layers      | | ||
| +----------------------+-------------------------+ | ||
|
|
||
| As mentioned above, a :code:`Block` in Fluid describes a set of Operators that covers sequential execution, conditional selection and loop execution, together with the objects these Operators operate on: Tensors. | ||
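The nesting and scoping behaviour described above can be modelled with a small pure-Python sketch. The classes below are toy stand-ins written for illustration; their names, attributes and lookup rule are assumptions and do not reproduce Fluid's actual :code:`Program` and :code:`Block` classes.

```python
class Block:
    """Toy block: local variables plus an ordered list of operators."""

    def __init__(self, idx, parent_idx):
        self.idx = idx
        self.parent_idx = parent_idx  # -1 marks the outermost block
        self.vars = {}                # variable name -> value
        self.ops = []                 # operator descriptions, run in order

class Program:
    """Toy program: a list of blocks, where block 0 is the outermost one."""

    def __init__(self):
        self.blocks = [Block(0, -1)]

    def create_block(self, parent_idx):
        block = Block(len(self.blocks), parent_idx)
        self.blocks.append(block)
        return block

    def find_var(self, block_idx, name):
        """Look a variable up in a block, then in its enclosing blocks."""
        idx = block_idx
        while idx != -1:
            block = self.blocks[idx]
            if name in block.vars:
                return block.vars[name]
            idx = block.parent_idx
        raise KeyError(name)

program = Program()
root = program.blocks[0]
root.vars["x"] = "tensor x"
root.ops.append("fc")  # sequential execution in the outer block

loop = program.create_block(parent_idx=root.idx)  # body of a loop
loop.vars["step"] = "tensor step"
loop.ops.append("increment")
```

As with braces in C++ or Java, the inner block can see ``x`` from the enclosing block, while ``step`` is not visible from the outer block.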
|
|
||
|
|
||
|
|
||
| ============= | ||
| Operator | ||
| ============= | ||
|
|
||
| In Fluid, all operations on data are represented by :code:`Operator` . In Python, :code:`Operator` in Fluid is encapsulated in modules such as :code:`paddle.fluid.layers` and :code:`paddle.fluid.nets` . | ||
|
|
||
| This is because some common operations on Tensors may consist of several more basic operations. For simplicity, the framework wraps the basic Operators, handling details such as creating the learnable parameters that an Operator relies on and initializing those parameters, so as to reduce the cost of further development. | ||
|
|
||
|
|
||
|
|
||
| For more information, please refer to `Fluid Design Idea <../../advanced_usage/design_idea/fluid_design_idea.html>`_ . | ||
|
|
||
|
|
||
| ========= | ||
| Variable | ||
| ========= | ||
|
|
||
| In Fluid, :code:`Variable` can contain any type of value -- in most cases a LoD-Tensor. | ||
|
|
||
| All the learnable parameters in the model are kept in memory in the form of :code:`Variable` . In most cases, you do not need to create the learnable parameters of the network yourself. Fluid provides encapsulation for almost all common basic computing modules of neural networks. Taking the simplest fully connected model as an example, calling :code:`fluid.layers.fc` directly creates two learnable parameters for the fully connected layer, namely the connection weight (W) and the bias, without explicitly calling :code:`Variable` related interfaces to create them. | ||
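The encapsulation described above can be mimicked in pure Python. The ``fc`` function below is a hypothetical toy, not :code:`fluid.layers.fc` : the parameter names, the initialization range and the dictionary used as parameter storage are all assumptions made for illustration.

```python
import random

def fc(x, size, params, name="fc", rng=None):
    """Toy fully connected layer that creates its own weight and bias on first use."""
    rng = rng or random.Random(0)
    in_dim = len(x)
    w_name, b_name = name + ".w", name + ".b"
    if w_name not in params:
        # The layer, not the caller, creates and initializes the parameters.
        params[w_name] = [[rng.uniform(-0.1, 0.1) for _ in range(size)]
                          for _ in range(in_dim)]
        params[b_name] = [0.0] * size
    w, b = params[w_name], params[b_name]
    # Output j is the dot product of x with column j of W, plus the bias.
    return [sum(x[i] * w[i][j] for i in range(in_dim)) + b[j]
            for j in range(size)]

params = {}
output = fc([1.0, 2.0, 3.0], size=4, params=params)
```

The caller only supplies the input and the output size; the weight and bias appear in ``params`` as a side effect of the call.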
|
|
||
| ================== | ||
| Related API | ||
| ================== | ||
|
|
||
|
|
||
| * A single neural network configured by the user is called :ref:`api_fluid_Program` . It is noteworthy that when training neural networks, users often need to configure and operate multiple :code:`Program` . For example, :code:`Program` for parameter initialization, :code:`Program` for training, :code:`Program` for testing, etc. | ||
|
|
||
|
|
||
| * Users can also use :ref:`api_fluid_program_guard` with :code:`with` to modify the configured :ref:`api_fluid_default_startup_program` and :ref:`api_fluid_default_main_program` . | ||
|
|
||
|
|
||
| * In Fluid, the execution order in a Block is determined by control flow, such as :ref:`api_fluid_layers_IfElse` , :ref:`api_fluid_layers_While` and :ref:`api_fluid_layers_Switch` . For more information, please refer to :ref:`api_guide_control_flow_en` . |
Normally, a single blank line between paragraphs is enough.