Commit 30444a3

Deadlink fix (#654)
* fix_deadlinks
* update_docker
* Update release_note.rst
1 parent 60d7180 commit 30444a3
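
Several hunks in this commit swap absolute GitHub URLs for relative documentation paths. As a rough aid for reviewing that kind of change, the sketch below lists the inline Markdown links in a piece of text and separates relative targets from absolute ones. It is only an illustration and not the tooling used for this commit; the function names are hypothetical, and it performs no network requests, so it cannot tell whether a link is actually dead.

```python
import re
from urllib.parse import urlparse

# Matches inline Markdown links of the form [text](target). Images (![...])
# and reference-style links are ignored to keep the sketch short.
LINK_RE = re.compile(r"(?<!!)\[([^\]]*)\]\(([^)\s]+)\)")


def extract_links(markdown_text):
    """Return (text, target) pairs for every inline Markdown link."""
    return LINK_RE.findall(markdown_text)


def is_relative(target):
    """True when the target has neither a scheme nor a host."""
    parsed = urlparse(target)
    return not parsed.scheme and not parsed.netloc


def split_links(markdown_text):
    """Split link targets into (relative, absolute) lists, preserving order."""
    relative, absolute = [], []
    for _text, target in extract_links(markdown_text):
        (relative if is_relative(target) else absolute).append(target)
    return relative, absolute
```

Running `split_links` over the old and new version of a file gives two short lists that can be compared by eye or checked separately.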

24 files changed: +124 -129 lines changed

doc/fluid/advanced_usage/deploy/mobile/mobile_readme.md

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@

 ## Features

-- High-performance support for ARM CPU
+- High-performance support for ARM CPU
 - Support for Mali GPU
 - Support for Andreno GPU
 - GPU Metal implementation for Apple devices
@@ -55,7 +55,7 @@

 ### 2. Converting Caffe models to Paddle Fluid models

-Please refer to [here](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/image_classification/caffe2fluid)
+Please refer to [here](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/caffe2fluid)

 ### 3. ONNX

@@ -78,5 +78,5 @@ Paddle-Mobile is released under the relatively permissive Apache-2.0 open-source license [Apache-2.0 license](L


 ## Legacy Mobile-Deep-Learning
-The original MDL (Mobile-Deep-Learning) project has been moved to [Mobile-Deep-Learning](https://github.com/allonli/mobile-deep-learning)
+The original MDL (Mobile-Deep-Learning) project has been moved to [Mobile-Deep-Learning](https://github.com/allonli/mobile-deep-learning)

doc/fluid/advanced_usage/deploy/mobile/mobile_readme_en.md

Lines changed: 3 additions & 3 deletions
@@ -8,7 +8,7 @@ Welcome to Paddle-Mobile GitHub project. Paddle-Mobile is a project of PaddlePad

 ## Features

-- high performance in support of ARM CPU
+- high performance in support of ARM CPU
 - support Mali GPU
 - support Andreno GPU
 - support the realization of GPU Metal on Apple devices
@@ -50,7 +50,7 @@ At present Paddle-Mobile only supports models trained by Paddle fluid. Models ca
 ### 1. Use Paddle Fluid directly to train
 It is the most reliable method to be recommended
 ### 2. Transform Caffe to Paddle Fluid model
-[https://github.com/PaddlePaddle/models/tree/develop/fluid/image_classification/caffe2fluid](https://github.com/PaddlePaddle/models/tree/develop/fluid/image_classification/caffe2fluid)
+[https://github.com/PaddlePaddle/models/tree/develop/fluid/image_classification/caffe2fluid](https://github.com/PaddlePaddle/models/tree/develop/fluid/PaddleCV/caffe2fluid)
 ### 3. ONNX
 ONNX is the acronym of Open Neural Network Exchange. The project is aimed to make a full communication and usage among different neural network development frameworks.

@@ -76,4 +76,4 @@ Paddle-Mobile provides relatively unstrict Apache-2.0 Open source agreement [Apa


 ## Old version Mobile-Deep-Learning
-Original MDL(Mobile-Deep-Learning) project has been transferred to [Mobile-Deep-Learning](https://github.com/allonli/mobile-deep-learning)
+Original MDL(Mobile-Deep-Learning) project has been transferred to [Mobile-Deep-Learning](https://github.com/allonli/mobile-deep-learning)

doc/fluid/advanced_usage/development/contribute_to_paddle/local_dev_guide.md

Lines changed: 2 additions & 2 deletions
@@ -26,7 +26,7 @@

 ## Create a local branch

-Paddle currently uses the [Git flow branching model](http://nvie.com/posts/a-successful-git-branching-model/) for development, testing, release, and maintenance. For details, see the [Paddle branching conventions](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/releasing_process.md#paddle-分支规范)
+Paddle currently uses the [Git flow branching model](http://nvie.com/posts/a-successful-git-branching-model/) for development, testing, release, and maintenance. For details, see the [Paddle branching conventions](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/others/releasing_process.md)

 All feature and bug-fix development should be done on a new branch, usually created from the `develop` branch.

@@ -110,7 +110,7 @@ no changes added to commit (use "git add" and/or "git commit -a")
 ➜ docker run -it -v $(pwd):/paddle paddle:latest-dev bash -c "cd /paddle/build && ctest"
 ```

-For more information on building and testing, see [Installing and running with Docker](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/v2/build_and_install/docker_install_cn.rst)
+For more information on building and testing, see [Installing and running with Docker](../../../beginners_guide/install/install_Docker.html)

 ## Commit

doc/fluid/advanced_usage/development/contribute_to_paddle/local_dev_guide_en.md

Lines changed: 4 additions & 4 deletions
@@ -3,7 +3,7 @@
 You will learn how to develop programs in local environment under the guidelines of this document.

 ## Requirements of coding
-- Please refer to the coding comment format of [Doxygen](http://www.stack.nl/~dimitri/doxygen/)
+- Please refer to the coding comment format of [Doxygen](http://www.stack.nl/~dimitri/doxygen/)
 - Make sure that option of builder `WITH_STYLE_CHECK` is on and the build could pass through the code style check.
 - Unit test is needed for all codes.
 - Pass through all unit tests.
@@ -26,7 +26,7 @@ Clone remote git to local:

 ## Create local branch

-At present [Git stream branch model](http://nvie.com/posts/a-successful-git-branching-model/) is applied to Paddle to undergo task of development,test,release and maintenance.Please refer to [branch regulation of Paddle](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/design/releasing_process.md#paddle-分支规范) about details。
+At present [Git stream branch model](http://nvie.com/posts/a-successful-git-branching-model/) is applied to Paddle to undergo task of development,test,release and maintenance.Please refer to [branch regulation of Paddle](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/others/releasing_process.md) about details。

 All development tasks of feature and bug fix should be finished in a new branch which is extended from `develop` branch.

@@ -80,7 +80,7 @@ no changes added to commit (use "git add" and/or "git commit -a")

 It needs a variety of development tools to build PaddlePaddle source code and generate documentation. For convenience, our standard development procedure is to put these tools together into a Docker image,called *development mirror* , usually named as `paddle:latest-dev` or `paddle:[version tag]-dev`,such as `paddle:0.11.0-dev` . Then all that need `cmake && make` ,such as IDE configuration,are replaced by `docker run paddle:latest-dev` .

-You need to bulid this development mirror under the root directory of source code directory tree
+You need to bulid this development mirror under the root directory of source code directory tree

 ```bash
 ➜ docker build -t paddle:latest-dev .
@@ -110,7 +110,7 @@ Run all unit tests with following commands:
 ➜ docker run -it -v $(pwd):/paddle paddle:latest-dev bash -c "cd /paddle/build && ctest"
 ```

-Please refer to [Installation and run with Docker](https://github.com/PaddlePaddle/Paddle/blob/develop/doc/v2/build_and_install/docker_install_cn.rst) about more information of construction and test.
+Please refer to [Installation and run with Docker](../../../beginners_guide/install/install_Docker.html) about more information of construction and test.

 ## Commit

doc/fluid/advanced_usage/development/new_op/index_cn.rst

Lines changed: 0 additions & 3 deletions
@@ -2,12 +2,9 @@
 New operators
 #############

-- `How to write a new operator <../../../advanced_usage/development/new_op.html>`_ : describes how to add a new Operator in Fluid
-
 - `Notes on ops <../../../advanced_usage/development/op_notes.html>`_ : describes things to keep in mind when working with ops

 .. toctree::
 :hidden:

-new_op_cn.md
 op_notes.md

doc/fluid/advanced_usage/development/new_op/op_notes.md

Lines changed: 5 additions & 5 deletions
@@ -8,10 +8,10 @@ The core method of an Op is Run, which requires two kinds of resources: data resources and

 The Fluid framework is designed to run on a variety of devices and third-party libraries, and the implementation of some Ops may differ depending on the device or third-party library. For this reason, Fluid introduces OpKernel: an Op can have multiple OpKernels. Such Ops inherit from `OperatorWithKernel`; a typical example is conv, whose OpKernels (for conv_op) are `GemmConvKernel`, `CUDNNConvOpKernel`, and `ConvMKLDNNOpKernel`, each of which supports both the double and float data types. Ops that do not need an OpKernel include `WhileOp`.

-Operator inheritance diagram:
+Operator inheritance diagram:
 ![op_inheritance_relation_diagram](../../pics/op_inheritance_relation_diagram.png)

-For further information, see: [multi_devices](https://github.com/PaddlePaddle/FluidDoc/tree/develop/doc/fluid/design/multi_devices), [scope](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/concepts/scope.md), [Developer's_Guide_to_Paddle_Fluid](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/getstarted/Developer's_Guide_to_Paddle_Fluid.md)
+For further information, see: [multi_devices](https://github.com/PaddlePaddle/FluidDoc/tree/develop/doc/fluid/design/multi_devices), [scope](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/design/concepts/scope.md), [Developer's_Guide_to_Paddle_Fluid](https://github.com/PaddlePaddle/FluidDoc/blob/release/1.2/doc/fluid/getstarted/Developer's_Guide_to_Paddle_Fluid.md)

 ### 2. Op registration logic
 The registration entries for each Operator include:
@@ -75,15 +75,15 @@ Operator inheritance diagram:

 Registering an Op usually requires calling REGISTER_OPERATOR, i.e.:
 ```
-REGISTER_OPERATOR(op_type,
+REGISTER_OPERATOR(op_type,
 OperatorBase
 op_maker_and_checker_maker,
 op_grad_opmaker,
 op_infer_var_shape,
 op_infer_var_type)
 ```

-**Note:**
+**Note:**

 1. For all Ops, the first three parameters are required: op_type specifies the name of the op, OperatorBase is the object of this Op, and op_maker_and_checker_maker is the op's maker and the checker for the op's attrs.
 2. If the Op has a backward pass, op_grad_opmaker is required, because backward obtains the Maker of the backward Op from the forward Op.
@@ -139,7 +139,7 @@ The following device operations are asynchronous with respect to the host:
 - If data is transferred from the GPU to non-page-locked CPU memory, the transfer is synchronous, even if an asynchronous copy operation is called.
 - If data is transferred from the CPU to the CPU, the transfer is synchronous, even if an asynchronous copy operation is called.

-For more information, see: [Asynchronous Concurrent Execution](https://docs.nvidia.com/cuda/cuda-c-programming-guide/#asynchronous-concurrent-execution), [API synchronization behavior](https://docs.nvidia.com/cuda/cuda-runtime-api/api-sync-behavior.html#api-sync-behavior)
+For more information, see: [Asynchronous Concurrent Execution](https://docs.nvidia.com/cuda/cuda-c-programming-guide/#asynchronous-concurrent-execution), [API synchronization behavior](https://docs.nvidia.com/cuda/cuda-runtime-api/api-sync-behavior.html#api-sync-behavior)

 ## Op performance optimization
 ### 1. Selection of third-party libraries

doc/fluid/advanced_usage/development/new_op/op_notes_en.md

Lines changed: 3 additions & 3 deletions
@@ -11,7 +11,7 @@ The Fluid framework is designed to run on a variety of devices and third-party l
 Operator inheritance diagram:
 ![op_inheritance_relation_diagram](../../pics/op_inheritance_relation_diagram.png)

-For further information, please refer to: [multi_devices](https://github.com/PaddlePaddle/FluidDoc/tree/develop/doc/fluid/design/multi_devices) , [scope](https://github.com/PaddlePaddle/FluidDoc/Blob/develop/doc/fluid/design/concepts/scope.md) , [Developer's_Guide_to_Paddle_Fluid](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/getstarted/Developer's_Guide_to_Paddle_Fluid.md )
+For further information, please refer to: [multi_devices](https://github.com/PaddlePaddle/FluidDoc/tree/develop/doc/fluid/design/multi_devices) , [scope](https://github.com/PaddlePaddle/FluidDoc/Blob/develop/doc/fluid/design/concepts/scope.md) , [Developer's_Guide_to_Paddle_Fluid](https://github.com/PaddlePaddle/FluidDoc/blob/release/1.2/doc/fluid/getstarted/Developer's_Guide_to_Paddle_Fluid.md)

 ### 2.Op's registration logic
 The registration entries for each Operator include:
@@ -67,7 +67,7 @@ The registration entries for each Operator include:
 <tr>
 <td>OpCreator </td>
 <td>Functor </td>
-<td>Create a new OperatorBase for each call </td>
+<td>Create a new OperatorBase for each call </td>
 <td>Call at runtime </td>
 </tr>
 </tbody>
@@ -150,7 +150,7 @@ The calculation speed of Op is related to the amount of data input. For some Op,

 Since the call of CUDA Kernel has a certain overhead, multiple calls of the CUDA Kernel in Op may affect the execution speed of Op. For example, the previous sequence_expand_op contains many CUDA Kernels. Usually, these CUDA Kernels process a small amount of data, so frequent calls to such Kernels will affect the calculation speed of Op. In this case, it is better to combine these small CUDA Kernels into one. This idea is used in the optimization of the sequence_expand_op procedure (related PR[#9289](https://github.com/PaddlePaddle/Paddle/pull/9289)). The optimized sequence_expand_op is about twice as fast as the previous implementation, the relevant experiments are introduced in the PR ([#9289](https://github.com/PaddlePaddle/Paddle/pull/9289)).

-Reduce the number of copy and sync operations between the CPU and the GPU. For example, the fetch operation will update the model parameters and get a loss after each iteration, and the copy of the data from the GPU to the Non-Pinned-Memory CPU is synchronous, so frequent fetching for multiple parameters will reduce the model training speed.
+Reduce the number of copy and sync operations between the CPU and the GPU. For example, the fetch operation will update the model parameters and get a loss after each iteration, and the copy of the data from the GPU to the Non-Pinned-Memory CPU is synchronous, so frequent fetching for multiple parameters will reduce the model training speed.

 ## Op numerical stability
 ### 1. Some Ops have numerical stability problems
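
The kernel-merging advice quoted in the hunk above can be illustrated with a back-of-the-envelope cost model. The sketch below is only that: the function name and all numbers are invented for illustration, it assumes a fixed per-launch overhead and perfectly additive work, and it is not a measurement of sequence_expand_op or a derivation of the speedup reported in the PR.

```python
def total_time_us(num_launches, launch_overhead_us, work_us_per_launch):
    """Toy model: every kernel launch pays a fixed overhead plus its own work."""
    return num_launches * (launch_overhead_us + work_us_per_launch)


# Hypothetical numbers: 100 tiny kernels, 5 us launch overhead, 1 us of work each.
separate = total_time_us(num_launches=100, launch_overhead_us=5.0, work_us_per_launch=1.0)

# The same total work done by a single merged kernel pays the overhead once.
merged = total_time_us(num_launches=1, launch_overhead_us=5.0, work_us_per_launch=100.0)

print(separate, merged)
```

When each kernel does very little work, the fixed launch overhead dominates the total under this model, which is the situation the quoted paragraph describes.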

doc/fluid/advanced_usage/development/profiling/host_memory_profiling_cn.md

Lines changed: 7 additions & 7 deletions
@@ -16,7 +16,7 @@ gperftool mainly supports the following four functions:
 - heap-profiling using tcmalloc
 - CPU profiler

-Paddle also provides a gperftool-based [CPU performance analysis tutorial](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/howto/optimization/cpu_profiling_cn.md)
+Paddle also provides a gperftool-based [CPU performance analysis tutorial](./cpu_profiling_cn.html)

 Heap memory analysis mainly uses thread-caching malloc and heap-profiling using tcmalloc.

@@ -29,7 +29,7 @@ Paddle also provides a gperftool-based [CPU performance analysis tutorial](https://github.com/P
 - Install google-perftools

 ```
-apt-get install libunwind-dev
+apt-get install libunwind-dev
 apt-get install google-perftools
 ```

@@ -73,17 +73,17 @@ env HEAPPROFILE="./perf_log/test.log" HEAP_PROFILE_ALLOCATION_INTERVAL=209715200
 pprof --pdf python test.log.0012.heap
 ```
 The command above generates a file named profile00x.pdf, which can be opened directly, for example: [memory_cpu_allocator](https://github.com/jacquesqiao/Paddle/blob/bd2ea0e1f84bb6522a66d44a072598153634cade/doc/fluid/howto/optimization/memory_cpu_allocator.pdf). As the figure below shows, while the CPU version of fluid is running, the module that allocates the most memory is CPUAllocator. Other modules allocate relatively little memory and are therefore omitted, which makes this view inconvenient for analyzing memory leaks, because a leak is a slow process that cannot be seen in this kind of figure.
-
+
 ![result](https://user-images.githubusercontent.com/3048612/40964027-a54033e4-68dc-11e8-836a-144910c4bb8c.png)
-
+
 - Diff mode. You can diff the heap at two points in time, which removes modules whose memory allocation has not changed and shows only the incremental part.
 ```
 pprof --pdf --base test.log.0010.heap python test.log.1045.heap
 ```
 The generated result is: [`memory_leak_protobuf`](https://github.com/jacquesqiao/Paddle/blob/bd2ea0e1f84bb6522a66d44a072598153634cade/doc/fluid/howto/optimization/memory_leak_protobuf.pdf)
-
+
 As the figure shows, the ProgramDesc structure grew by 200MB+ between the two versions, so a memory leak here is very likely, and the final result did confirm that this was the source of the leak.
-
+
 ![result](https://user-images.githubusercontent.com/3048612/40964057-b434d5e4-68dc-11e8-894b-8ab62bcf26c2.png)
 ![result](https://user-images.githubusercontent.com/3048612/40964063-b7dbee44-68dc-11e8-9719-da279f86477f.png)
-
+

doc/fluid/advanced_usage/development/profiling/host_memory_profiling_en.md

Lines changed: 6 additions & 6 deletions
@@ -16,7 +16,7 @@ gperftool mainly supports four functions:
 - heap-profiling using tcmalloc
 - CPU profiler

-Paddle also provides a [tutorial on CPU performance analysis](https://github.com/PaddlePaddle/FluidDoc/blob/develop/doc/fluid/howto/optimization/cpu_profiling_en.md) based on gperftool.
+Paddle also provides a [tutorial on CPU performance analysis](./cpu_profiling_en.html) based on gperftool.

 For the analysis for heap, we mainly use thread-caching malloc and heap-profiling using tcmalloc.

@@ -29,7 +29,7 @@ This tutorial is based on the Docker development environment paddlepaddle/paddle
 - Install google-perftools

 ```
-apt-get install libunwind-dev
+apt-get install libunwind-dev
 apt-get install google-perftools
 ```

@@ -74,15 +74,15 @@ As the program runs, a lot of files will be generated in the perf_log folder as
 ```
 The command above will generate a file of profile00x.pdf, which can be opened directly, for example, [memory_cpu_allocator](https://github.com/jacquesqiao/Paddle/blob/bd2ea0e1f84bb6522a66d44a072598153634cade/doc/fluid/howto/optimization/memory_cpu_allocator.pdf). As demonstrated in the chart below, during the running of the CPU version fluid, the module CPUAllocator is allocated with most memory. Other modules are allocated with relatively less memory, so they are ignored. It is very inconvenient for inspecting memory leak for memory leak is a chronic process which cannot be inspected in this picture.
 ![result](https://user-images.githubusercontent.com/3048612/40964027-a54033e4-68dc-11e8-836a-144910c4bb8c.png)
-
+
 - Diff mode. You can do diff on the heap at two moments, which removes some modules whose memory allocation has not changed, and displays the incremental part.
 ```
 pprof --pdf --base test.log.0010.heap python test.log.1045.heap
 ```
 The generated result: [`memory_leak_protobuf`](https://github.com/jacquesqiao/Paddle/blob/bd2ea0e1f84bb6522a66d44a072598153634cade/doc/fluid/howto/optimization/memory_leak_protobuf.pdf)
-
+
 As shown from the figure: The structure of ProgramDesc has increased by 200MB+ between the two versions, so there is a large possibility that memory leak happens here, and the final result does prove a leak here.
-
+
 ![result](https://user-images.githubusercontent.com/3048612/40964057-b434d5e4-68dc-11e8-894b-8ab62bcf26c2.png)
 ![result](https://user-images.githubusercontent.com/3048612/40964063-b7dbee44-68dc-11e8-9719-da279f86477f.png)
-
+
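
The "Diff mode" described in the hunk above boils down to subtracting one heap snapshot from another and keeping only what changed. The toy sketch below shows that idea on plain dictionaries. It is an assumption-laden illustration, not pprof's implementation: the function name is hypothetical, the snapshots are hand-written mappings from module name to allocated megabytes, and the numbers are invented.

```python
def diff_heap(base, current):
    """Return per-module growth, dropping modules whose allocation did not change."""
    modules = set(base) | set(current)
    delta = {m: current.get(m, 0) - base.get(m, 0) for m in modules}
    return {m: d for m, d in delta.items() if d != 0}


# Invented snapshots (MB per module) at an early and a late point in the run.
early = {"CPUAllocator": 500, "ProgramDesc": 10}
late = {"CPUAllocator": 500, "ProgramDesc": 210}

growth = diff_heap(early, late)
print(growth)
```

A large but stable allocator disappears from the result, while a module that keeps growing stands out, which is why the diff view is the more useful one when hunting a slow leak.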

doc/fluid/advanced_usage/development/profiling/index_cn.rst

Lines changed: 0 additions & 2 deletions
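@@ -12,9 +12,7 @@

 This module introduces tuning methods for using Fluid, including:

-- `How to benchmark <benchmark.html>`_: describes how to choose benchmark models to verify model accuracy and performance
 - `CPU performance tuning <cpu_profiling_cn.html>`_: describes how to use the cProfile package, the yep library, and Google perftools for performance analysis and tuning
-- `GPU performance tuning <gpu_profiling_cn.html>`_: describes how to use Fluid's built-in timing tools, nvprof, or nvvp for performance analysis and tuning
 - `Heap memory analysis and optimization <host_memory_profiling_cn.html>`_: describes how to use gperftool for heap memory analysis and optimization to resolve memory leaks
 - `Introduction to the Timeline tool <timeline_cn.html>`_ : describes how to use the Timeline tool for performance analysis and tuning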
