From 38bcc3d1d29503fee849a8de52425c6e3e92bcd8 Mon Sep 17 00:00:00 2001 From: zhangkeliang Date: Fri, 29 Apr 2022 09:04:08 +0800 Subject: [PATCH 1/5] Add custom device en docs --- .../custom_device_example_en.md | 462 ++++++++++++++++++ .../custom_kernel_docs/context_api_en.md | 150 ++++++ .../custom_kernel_docs/cpp_api_en.rst | 20 + .../custom_kernel_docs/exception_api_en.md | 62 +++ .../custom_kernel_docs/kernel_declare_en.md | 83 ++++ .../custom_kernel_docs/register_api_en.md | 62 +++ .../custom_kernel_docs/tensor_api_en.md | 190 +++++++ .../custom_device_docs/custom_kernel_en.rst | 19 + .../custom_device_docs/custom_runtime_en.rst | 144 ++++++ .../custom_device_docs/device_api_en.md | 185 +++++++ .../custom_device_docs/event_api_en.md | 93 ++++ .../custom_device_docs/index_en.rst | 19 + .../custom_device_docs/memory_api_en.md | 459 +++++++++++++++++ .../runtime_data_type_en.md | 131 +++++ .../custom_device_docs/stream_api_en.md | 115 +++++ docs/dev_guides/index_en.rst | 1 + docs/index_en.rst | 1 + 17 files changed, 2196 insertions(+) create mode 100644 docs/dev_guides/custom_device_docs/custom_device_example_en.md create mode 100644 docs/dev_guides/custom_device_docs/custom_kernel_docs/context_api_en.md create mode 100644 docs/dev_guides/custom_device_docs/custom_kernel_docs/cpp_api_en.rst create mode 100644 docs/dev_guides/custom_device_docs/custom_kernel_docs/exception_api_en.md create mode 100644 docs/dev_guides/custom_device_docs/custom_kernel_docs/kernel_declare_en.md create mode 100644 docs/dev_guides/custom_device_docs/custom_kernel_docs/register_api_en.md create mode 100644 docs/dev_guides/custom_device_docs/custom_kernel_docs/tensor_api_en.md create mode 100644 docs/dev_guides/custom_device_docs/custom_kernel_en.rst create mode 100644 docs/dev_guides/custom_device_docs/custom_runtime_en.rst create mode 100644 docs/dev_guides/custom_device_docs/device_api_en.md create mode 100644 docs/dev_guides/custom_device_docs/event_api_en.md create mode 100644 
docs/dev_guides/custom_device_docs/index_en.rst create mode 100644 docs/dev_guides/custom_device_docs/memory_api_en.md create mode 100644 docs/dev_guides/custom_device_docs/runtime_data_type_en.md create mode 100644 docs/dev_guides/custom_device_docs/stream_api_en.md diff --git a/docs/dev_guides/custom_device_docs/custom_device_example_en.md b/docs/dev_guides/custom_device_docs/custom_device_example_en.md new file mode 100644 index 00000000000..6331865486e --- /dev/null +++ b/docs/dev_guides/custom_device_docs/custom_device_example_en.md @@ -0,0 +1,462 @@ +# Example of Device Access + +This section describes how to implement a CustomDevice plug-in that adds a new device backend to PaddlePaddle, and how to compile, package, install, and use that backend. + +> Note: +> - Please make sure that you have correctly installed the latest version of [Paddle develop](https://github.com/PaddlePaddle/Paddle). +> - Only `Linux` is supported. +> - PaddlePaddle releases kernel function declarations in header files, for use in custom kernel code and registration. + +## Step One: Implement Custom Runtime + +**InitPlugin** + +InitPlugin is the entry function of a custom runtime and must be implemented by the plug-in. It should check its parameter, fill in the device information, and register the runtime API. During initialization, PaddlePaddle loads the plug-in and invokes InitPlugin to initialize it and register the runtime (the framework does this automatically, provided that the dynamic-link library is in site-packages/paddle-plugins/ or in the directory designated by the environment variable CUSTOM_DEVICE_ROOT). + +Example: + +```c++ +#include "paddle/phi/backends/device_ext.h" + +void InitPlugin(CustomRuntimeParams *params) { + // Check compatibility of the version and fill in the information of the custom runtime version used by the plug-in. 
+ PADDLE_CUSTOM_RUNTIME_CHECK_VERSION(params); + + // Fill in the basic runtime information + params->device_type = "CustomCPU"; + params->sub_device_type = "V1"; + + // Register the Runtime API + params->interface->set_device = set_device; + params->interface->get_device = get_device; + params->interface->create_stream = create_stream; + params->interface->destroy_stream = destroy_stream; + params->interface->create_event = create_event; + params->interface->destroy_event = destroy_event; + params->interface->record_event = record_event; + params->interface->synchronize_device = sync_device; + params->interface->synchronize_stream = sync_stream; + params->interface->synchronize_event = sync_event; + params->interface->stream_wait_event = stream_wait_event; + params->interface->memory_copy_h2d = memory_copy; + params->interface->memory_copy_d2d = memory_copy; + params->interface->memory_copy_d2h = memory_copy; + params->interface->device_memory_allocate = allocate; + params->interface->device_memory_deallocate = deallocate; + params->interface->get_device_count = get_device_count; + params->interface->get_device_list = get_device_list; + params->interface->device_memory_stats = memstats; + params->interface->device_min_chunk_size = get_min_chunk_size; +} +``` + +The plug-in should first check the parameters of InitPlugin; the framework sets the size to an appropriate value before passing it to InitPlugin. For the types CustomRuntimeParams and C_DeviceInterface, please refer to [device_ext.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/backends/device_ext.h). + +Then, the plug-in should fill in its basic information and version number, which helps PaddlePaddle manage the plug-in and check version compatibility. + +- params->size and params->interface.size : In later custom runtime versions, size and interface will always remain the first and second members of CustomRuntimeParams, respectively. 
+- params->version : the version information of the plug-in. The definition of the version number can be found in [device_ext.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/backends/device_ext.h). PaddlePaddle checks version compatibility when the custom runtime is registered. +- params->device_type : the name of the device backend. If another plug-in with the same name already exists, the runtime will not be registered. +- params->sub_device_type : the name of the sub-type of the device backend. + +Finally, the plug-in should fill in the callback APIs in params->interface (at least the required APIs must be implemented; otherwise the runtime will not be registered). This completes the initialization of the custom runtime. For details of the APIs, please refer to [Custom Runtime Document](./custom_runtime_en.html). + +```c++ +static size_t global_total_mem_size = 1 * 1024 * 1024 * 1024UL; +static size_t global_free_mem_size = global_total_mem_size; + +C_Status set_device(const C_Device device) { + return C_SUCCESS; +} + +C_Status get_device(const C_Device device) { + device->id = 0; + return C_SUCCESS; +} + +C_Status get_device_count(size_t *count) { + *count = 1; + return C_SUCCESS; +} + +C_Status get_device_list(size_t *device) { + *device = 0; + return C_SUCCESS; +} + +C_Status memory_copy(const C_Device device, void *dst, const void *src, size_t size) { + memcpy(dst, src, size); + return C_SUCCESS; +} + +C_Status allocate(const C_Device device, void **ptr, size_t size) { + if (size > global_free_mem_size) { + return C_FAILED; + } + global_free_mem_size -= size; + *ptr = malloc(size); + return C_SUCCESS; +} + +C_Status deallocate(const C_Device device, void *ptr, size_t size) { + if (!ptr) { + return C_FAILED; + } + global_free_mem_size += size; + free(ptr); + return C_SUCCESS; +} + +C_Status create_stream(const C_Device device, C_Stream *stream) { + *stream = nullptr; + return C_SUCCESS; +} + +C_Status 
destroy_stream(const C_Device device, C_Stream stream) { + return C_SUCCESS; +} + +C_Status create_event(const C_Device device, C_Event *event) { + return C_SUCCESS; +} + +C_Status record_event(const C_Device device, C_Stream stream, C_Event event) { + return C_SUCCESS; +} + +C_Status destroy_event(const C_Device device, C_Event event) { + return C_SUCCESS; +} + +C_Status sync_device(const C_Device device) { + return C_SUCCESS; +} + +C_Status sync_stream(const C_Device device, C_Stream stream) { + return C_SUCCESS; +} + +C_Status sync_event(const C_Device device, C_Event event) { + return C_SUCCESS; +} + +C_Status stream_wait_event(const C_Device device, C_Stream stream, C_Event event) { + return C_SUCCESS; +} + +C_Status memstats(const C_Device device, size_t *total_memory, size_t *free_memory) { + *total_memory = global_total_mem_size; + *free_memory = global_free_mem_size; + return C_SUCCESS; +} + +C_Status get_min_chunk_size(const C_Device device, size_t *size) { + *size = 1; + return C_SUCCESS; +} +``` + +## Step Two: Add Custom Kernel + +Taking `add` as an example, this part shows how to implement a kernel and register it. + +Example: + +### 1. 
Determine the Kernel Declaration + +Find the kernel declaration in the header file `math_kernel.h` released by PaddlePaddle: + +```c++ +// Add kernel function +// Template parameters: T - Data type +// Context - Device context +// Parameters: dev_ctx - Context object +// x - DenseTensor object +// y - DenseTensor object +// out - DenseTensor pointer +// Return: None +template <typename T, typename Context> +void AddKernel(const Context& dev_ctx, + const DenseTensor& x, + const DenseTensor& y, + DenseTensor* out); + +``` + +### 2. Kernel Implementation and Registration + +```c++ +// add_kernel.cc + +#include "paddle/phi/extension.h" // the header file on which the custom kernel depends + +namespace custom_cpu { + +// Kernel Implementation +template <typename T, typename Context> +void AddKernel(const Context& dev_ctx, + const phi::DenseTensor& x, + const phi::DenseTensor& y, + phi::DenseTensor* out) { + // Use the Alloc API of dev_ctx to allocate storage of type T for the output parameter out. + dev_ctx.template Alloc<T>(out); + // Use the numel API of DenseTensor to get the number of Tensor elements. + auto numel = x.numel(); + // Use the data API of DenseTensor to get the data pointer of type T of the input parameter x. + auto x_data = x.data<T>(); + // Use the data API of DenseTensor to get the data pointer of type T of the input parameter y. + auto y_data = y.data<T>(); + // Use the data API of DenseTensor to get the data pointer of type T of the output parameter out. + auto out_data = out->data<T>(); + // Perform the computation + for (auto i = 0; i < numel; ++i) { + out_data[i] = x_data[i] + y_data[i]; + } +} + +} // namespace custom_cpu + +// In the global namespace, use the registration macro to register the kernel. 
+// Register AddKernel of CustomCPU +// Parameters: add - Kernel name +// CustomCPU - Backend name +// ALL_LAYOUT - Memory layout +// custom_cpu::AddKernel - Name of the kernel function +// int - Data type name +// int64_t - Data type name +// float - Data type name +// double - Data type name +// phi::dtype::float16 - Data type name +PD_REGISTER_PLUGIN_KERNEL(add, + CustomCPU, + ALL_LAYOUT, + custom_cpu::AddKernel, + int, + int64_t, + float, + double, + phi::dtype::float16){} +``` + +## Step Three: Compile and Install + +### CMake Compilation + +**Edit CMakeLists.txt** + +``` +cmake_minimum_required(VERSION 3.10) + +project(paddle-custom_cpu CXX C) + +set(PLUGIN_NAME "paddle_custom_cpu") +set(PLUGIN_VERSION "0.0.1") + +set(PADDLE_PLUGIN_DIR "/opt/conda/lib/python3.7/site-packages/paddle-plugins/") +set(PADDLE_INC_DIR "/opt/conda/lib/python3.7/site-packages/paddle/include/") +set(PADDLE_LIB_DIR "/opt/conda/lib/python3.7/site-packages/paddle/fluid/") + +############ Third-party dependencies +set(BOOST_INC_DIR "/path/to/Paddle/build/third_party/boost/src/extern_boost") +set(GFLAGS_INC_DIR "/path/to/Paddle/build/third_party/install/gflags/include") +set(GLOG_INC_DIR "/path/to/Paddle/build/third_party/install/glog/include") +set(THREAD_INC_DIR "/path/to/Paddle/build/third_party/threadpool/src/extern_threadpool") +set(THIRD_PARTY_INC_DIR ${BOOST_INC_DIR} ${GFLAGS_INC_DIR} ${GLOG_INC_DIR} ${THREAD_INC_DIR}) + +include_directories(${PADDLE_INC_DIR} ${THIRD_PARTY_INC_DIR}) +link_directories(${PADDLE_LIB_DIR}) + +add_definitions(-DPADDLE_WITH_CUSTOM_DEVICE) # for out CustomContext temporarily +add_definitions(-DPADDLE_WITH_CUSTOM_KERNEL) # for out fluid separate temporarily + +############ Compile plug-ins +add_library(${PLUGIN_NAME} SHARED runtime.cc add_kernel.cc) +target_link_libraries(${PLUGIN_NAME} PRIVATE :core_avx.so) # special name + +############ Assemble plug-ins +configure_file(${CMAKE_CURRENT_SOURCE_DIR}/setup.py.in + ${CMAKE_CURRENT_BINARY_DIR}/setup.py) + 
+add_custom_command(TARGET ${PLUGIN_NAME} POST_BUILD + COMMAND ${CMAKE_COMMAND} -E remove -f ${CMAKE_CURRENT_BINARY_DIR}/python/ + COMMAND ${CMAKE_COMMAND} -E make_directory ${CMAKE_CURRENT_BINARY_DIR}/python/ + COMMAND ${CMAKE_COMMAND} -E make_directory ${CMAKE_CURRENT_BINARY_DIR}/python/paddle-plugins/ + COMMAND ${CMAKE_COMMAND} -E copy_if_different ${CMAKE_CURRENT_BINARY_DIR}/lib${PLUGIN_NAME}.so ${CMAKE_CURRENT_BINARY_DIR}/python/paddle-plugins/ + COMMENT "Creating plugin directories------>>>" +) + +add_custom_command(OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/python/.timestamp + COMMAND python3 ${CMAKE_CURRENT_BINARY_DIR}/setup.py bdist_wheel + DEPENDS ${PLUGIN_NAME} + COMMENT "Packing whl packages------>>>" +) + +add_custom_target(python_package ALL DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/python/.timestamp) +``` + +**Edit setup.py.in** + +CMake generates setup.py from setup.py.in, and setuptools is then used to package the plug-in into a wheel package. + +``` +from setuptools import setup, Distribution + +packages = [] +package_data = {} + +class BinaryDistribution(Distribution): + def has_ext_modules(self): + return True + +setup( + name = '@CMAKE_PROJECT_NAME@', + version='@PLUGIN_VERSION@', + description='Paddle CustomCPU plugin', + long_description='', + long_description_content_type="text/markdown", + author_email="Paddle-better@baidu.com", + maintainer="PaddlePaddle", + maintainer_email="Paddle-better@baidu.com", + project_urls={}, + license='Apache Software License', + packages= [ + 'paddle-plugins', + ], + include_package_data=True, + package_data = { + '': ['*.so', '*.h', '*.py', '*.hpp'], + }, + package_dir = { + '': 'python', + }, + zip_safe=False, + distclass=BinaryDistribution, + entry_points={ + 'console_scripts': [ + ] + }, + classifiers=[ + ], + keywords='Paddle CustomCPU plugin', +) +``` + +Compile the plug-in with the following commands: + +```bash +$ mkdir build +$ cd build +$ cmake .. 
-DWITH_KERNELS=ON +$ make +``` + +After compilation, a wheel package is generated under build/dist. + +### Setuptools Compilation + +**Edit setup.py** + +setuptools can also be used to compile the plug-in and package it directly. + +``` +from setuptools import setup, Distribution, Extension +from setuptools.command.build_ext import build_ext +import os +import shutil + +packages = [] +package_data = {} + +class BinaryDistribution(Distribution): + def has_ext_modules(self): + return True + +for pkg_dir in ['build/python/paddle-plugins/']: + if os.path.exists(pkg_dir): + shutil.rmtree(pkg_dir) + os.makedirs(pkg_dir) + +ext_modules = [Extension(name='paddle-plugins.libpaddle_custom_cpu', + sources=['runtime.cc', 'add_kernel.cc'], + include_dirs=['/opt/conda/lib/python3.7/site-packages/paddle/include/'], + library_dirs=['/opt/conda/lib/python3.7/site-packages/paddle/fluid/'], + libraries=['core_avx.so'])] + +setup( + name='paddle-custom_cpu', + version='0.0.1', + description='Paddle CustomCPU plugin', + long_description='', + long_description_content_type="text/markdown", + author_email="Paddle-better@baidu.com", + maintainer="PaddlePaddle", + maintainer_email="Paddle-better@baidu.com", + project_urls={}, + license='Apache Software License', + ext_modules=ext_modules, + packages=[ + 'paddle-plugins', + ], + include_package_data=True, + package_data={ + '': ['*.so', '*.h', '*.py', '*.hpp'], + }, + package_dir={ + '': 'build/python', + }, + zip_safe=False, + distclass=BinaryDistribution, + entry_points={ + 'console_scripts': [ + ] + }, + classifiers=[ + ], + keywords='Paddle CustomCPU plugin', +) +``` + +Compile the plug-in by running the command: + +``` +$ python setup.py bdist_wheel +``` + +After compilation, a wheel package is generated under the dist directory. + +### Pip Installation + +Use pip to install the wheel package. 
+ +``` +$ pip install build/dist/paddle_custom_cpu-0.0.1-cp37-cp37m-linux_aarch64.whl +``` + +## Step Four: Load and Use + +After the plug-in is installed to its designated path (site-packages/paddle-plugins), the CustomCPU device backend can be used in PaddlePaddle to execute computation. + +First, check the custom devices currently registered in PaddlePaddle. + +``` +>>> paddle.device.get_all_custom_device_type() +['CustomCPU'] +``` + +Then, set the device backend to be used. + +``` +>>> paddle.set_device('CustomCPU') +``` + +Finally, use the new backend for computing tasks. + +``` +>>> x = paddle.to_tensor([1]) +>>> x +Tensor(shape=[1], dtype=int64, place=Place(CustomCPU:0), stop_gradient=True, + [1]) +>>> x + x +Tensor(shape=[1], dtype=int64, place=Place(CustomCPU:0), stop_gradient=True, + [2]) +``` diff --git a/docs/dev_guides/custom_device_docs/custom_kernel_docs/context_api_en.md b/docs/dev_guides/custom_device_docs/custom_kernel_docs/context_api_en.md new file mode 100644 index 00000000000..bff84489435 --- /dev/null +++ b/docs/dev_guides/custom_device_docs/custom_kernel_docs/context_api_en.md @@ -0,0 +1,150 @@ +# Context APIs + +## CustomContext +`CustomContext` is the actual argument for the template parameter Context of a custom kernel function. For details, please refer to [custom_context.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/backends/custom/custom_context.h). 
+ +```c++ + // Constructor + // Parameter: place - CustomPlace object + // Return: None + explicit CustomContext(const CustomPlace&); + + // Destructor + virtual ~CustomContext(); + + // Get the place of the device context + // Parameter: None + // Return: place - Place object + const Place& GetPlace() const override; + + // Get the stream of the device context + // Parameter: None + // Return: stream - void* pointer + void* stream() const; + + // Wait for the completion of operations on the stream + // Parameter: None + // Return: None + void Wait() const override; +``` + +## DeviceContext +`CustomContext` is derived from `DeviceContext`; please refer to [device_context.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/device_context.h) + +```c++ + // No-Parameter constructor + DeviceContext(); + + // Copy constructor + DeviceContext(const DeviceContext&); + + // Move constructor + DeviceContext(DeviceContext&&); + + // Move assignment operator + DeviceContext& operator=(DeviceContext&&); + + // Destructor + virtual ~DeviceContext(); + + // Set device allocator + // Parameter: Allocator pointer + // Return: None + void SetAllocator(const Allocator*); + + // Set host allocator + // Parameter: Allocator pointer + // Return: None + void SetHostAllocator(const Allocator*); + + // Set zero-size allocator + // Parameter: Allocator pointer + // Return: None + void SetZeroAllocator(const Allocator*); + + // Get Allocator + // Parameter: None + // Return: Allocator object + const Allocator& GetAllocator() const; + + // Get Host allocator + // Parameter: None + // Return: Allocator object + const Allocator& GetHostAllocator() const; + + // Get zero-size allocator + // Parameter: None + // Return: Allocator object + const Allocator& GetZeroAllocator() const; + + // Allocate the device memory for Tensor + // Parameter: TensorBase pointer + // dtype - DataType variable + // requested_size - size_t variable with the default value of 0 + // Return: data pointer - void* 
pointer + void* Alloc(TensorBase*, DataType dtype, size_t requested_size = 0) const; + + // Allocate device memory for Tensor + // Template Parameter: T - data type + // Parameter: TensorBase pointer + // requested_size - size_t variable, 0 by default + // Return: data pointer - T* pointer + template <typename T> + T* Alloc(TensorBase* tensor, size_t requested_size = 0) const; + + // Allocate host memory for Tensor + // Parameter: TensorBase pointer + // dtype - DataType variable + // requested_size - size_t variable, 0 by default + // Return: data pointer - void* pointer + void* HostAlloc(TensorBase* tensor, + DataType dtype, + size_t requested_size = 0) const; + + // Allocate host memory for Tensor + // Template Parameter: T - data type + // Parameter: TensorBase pointer + // requested_size - size_t variable, 0 by default + // Return: data pointer - T* pointer + template <typename T> + T* HostAlloc(TensorBase* tensor, size_t requested_size = 0) const; + + // Get the place of the context; to be implemented by subclasses + // Parameter: None + // Return: place - Place object + virtual const Place& GetPlace() const = 0; + + // Wait for the completion of operations on the stream; to be implemented by subclasses + // Parameter: None + // Return: None + virtual void Wait() const {} + + // Set the random number generator + // Parameter: Generator pointer + // Return: None + void SetGenerator(Generator*); + + // Get the random number generator + // Parameter: None + // Return: Generator pointer + Generator* GetGenerator() const; + + // Set the Host random number generator + // Parameter: Generator pointer + // Return: None + void SetHostGenerator(Generator*); + + // Get the Host random number generator + // Parameter: None + // Return: Generator pointer + Generator* GetHostGenerator() const; + +``` + +## Relevant Information + +- `Place` and `CustomPlace`: please refer to [place.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/place.h) +- `Allocation` and `Allocator`: please 
refer to [allocator.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/allocator.h) +- `TensorBase`: please refer to [tensor_base.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/tensor_base.h) +- `DataType`: please refer to [data_type.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/data_type.h) +- `Generator`: please refer to [generator.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/generator.h) diff --git a/docs/dev_guides/custom_device_docs/custom_kernel_docs/cpp_api_en.rst b/docs/dev_guides/custom_device_docs/custom_kernel_docs/cpp_api_en.rst new file mode 100644 index 00000000000..0e91dae3fa7 --- /dev/null +++ b/docs/dev_guides/custom_device_docs/custom_kernel_docs/cpp_api_en.rst @@ -0,0 +1,20 @@ +########################## +Kernel Implementation APIs +########################## + +The implementation of a custom kernel function mainly depends on two parts: 1. APIs released by PaddlePaddle, including the context API, the tensor API, and the exception API; 2. APIs of the device encapsulation library. The C++ APIs of PaddlePaddle are released through header files. + + +- `Context API <./context_api_en.html>`_ : about the C++ API of the device context +- `Tensor API <./tensor_api_en.html>`_ : about the C++ API of Tensor +- `Exception API <./exception_api_en.html>`_ : about the C++ API of exception handling + + +Note: PaddlePaddle has many C++ APIs. Only these three are introduced here; for related classes, links to the corresponding files are provided for developers. + +.. 
toctree:: + :hidden: + + context_api_en.md + tensor_api_en.md + exception_api_en.md diff --git a/docs/dev_guides/custom_device_docs/custom_kernel_docs/exception_api_en.md b/docs/dev_guides/custom_device_docs/custom_kernel_docs/exception_api_en.md new file mode 100644 index 00000000000..c1ded39d75c --- /dev/null +++ b/docs/dev_guides/custom_device_docs/custom_kernel_docs/exception_api_en.md @@ -0,0 +1,62 @@ +# Exception API + + +## PADDLE_ENFORCE + +How to use: + +```c++ + PADDLE_ENFORCE_{TYPE}(cond_a, // Condition A + cond_b, // Condition B, optional based on the TYPE + phi::errors::{ERR_TYPE}("{ERR_MSG}")); +``` + +The condition that is checked depends on `TYPE`: + +| Exception Macro | Required Condition | On Failure | +|---|---|---| +| PADDLE_ENFORCE_EQ | cond_a == cond_b | Raise ERR_TYPE exception and report ERR_MSG | +| PADDLE_ENFORCE_NE | cond_a != cond_b | Raise ERR_TYPE exception and report ERR_MSG | +| PADDLE_ENFORCE_GT | cond_a > cond_b | Raise ERR_TYPE exception and report ERR_MSG | +| PADDLE_ENFORCE_GE | cond_a >= cond_b | Raise ERR_TYPE exception and report ERR_MSG | +| PADDLE_ENFORCE_LT | cond_a < cond_b | Raise ERR_TYPE exception and report ERR_MSG | +| PADDLE_ENFORCE_LE | cond_a <= cond_b | Raise ERR_TYPE exception and report ERR_MSG | +| PADDLE_ENFORCE_NOT_NULL | cond_a != nullptr | Raise ERR_TYPE exception and report ERR_MSG | + +`ERR_TYPE` supports: + +| Type | +|---| +| InvalidArgument | +| NotFound | +| OutOfRange | +| AlreadyExists | +| ResourceExhausted | +| PreconditionNotMet | +| PermissionDenied | +| ExecutionTimeout | +| Unimplemented | +| Unavailable | +| Fatal | +| External | + +`ERR_MSG` is a C-style format string and supports variable-length arguments. + +Example: + +```c++ +// If num_col_dims >= 2 && num_col_dims <= src.size() is not true, report the InvalidArgument exception. 
+// Print relevant tips +PADDLE_ENFORCE_EQ( + (num_col_dims >= 2 && num_col_dims <= src.size()), + true, + phi::errors::InvalidArgument("The num_col_dims should be inside [2, %d] " + "in flatten_to_3d, but received %d.", + src.size(), + num_col_dims)); +``` + +## Relevant Information + +- `PADDLE_ENFORCE`: please refer to [enforce.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/enforce.h) +- `errors`: please refer to [errors.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/errors.h) diff --git a/docs/dev_guides/custom_device_docs/custom_kernel_docs/kernel_declare_en.md b/docs/dev_guides/custom_device_docs/custom_kernel_docs/kernel_declare_en.md new file mode 100644 index 00000000000..c262e5af953 --- /dev/null +++ b/docs/dev_guides/custom_device_docs/custom_kernel_docs/kernel_declare_en.md @@ -0,0 +1,83 @@ +# Kernel Function Declaration + +PaddlePaddle releases kernel declarations through header files, and the declarations are the same inside and outside the framework. + +A custom kernel must be written against a specific kernel function declaration. The header files are under `include/paddle/phi/kernels/`. + +The format of the declaration is as follows: + +```c++ +template <typename T, typename Context> +void KernelNameKernel(const Context& dev_ctx, + InputTensor(s), + Attribute(s), + OutTensor(s)); +``` + +Conventions: + +1. Template Parameter: The format is fixed. The first template parameter is the data type `T`, and the second is `Context`. +2. Return: always `void`. +3. Naming: camel case, kernel name + "Kernel", such as `SoftmaxKernel` +4. Parameter: the Context parameter, InputTensor, Attribute, and OutTensor, in that order: +- Context Parameter: of type `const Context&`. + - `CustomContext` corresponds to custom kernels. 
You can refer to [custom_context.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/backends/custom/custom_context.h) +- InputTensor:Number >=0,and the types include: + - `const DenseTensor&` Please refer to [dense_tensor.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/dense_tensor.h) + - `const SelectedRows&` Please refer to [selected_rows.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/selected_rows.h) + - `const SparseCooTensor&` Please refer to [sparse_coo_tensor.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/sparse_coo_tensor.h) + - `const SparseCsrTensor&` Please refer to [sparse_csr_tensor.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/sparse_csr_tensor.h) + - `const std::vector&` + - `const std::vector&` + - `const std::vector&` +- Attribute:Number >=0,and the types include: + - `bool` + - `float` + - `double` + - `int` + - `int64_t` + - `phi::dtype::float16` Please refer to [float16.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/float16.h) + - `const Scalar&` Please refer to [scalar.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/scalar.h) + - `DataType` Please refer to [data_type.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/data_type.h) + - `DataLayout` Please refer to [layout.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/layout.h) + - `Place` Please refer to [place.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/place.h) + - `const std::vector&` + - `const ScalarArray&` Please refer to [scalar_array.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/scalar_array.h) + - `const std::vector&` + - `const std::string&` + - `const std::vector&` + - `const std::vector&` + - `const std::vector&` + - `const std::vector&` +- OutTensor:Number >0,and the types include: + - `DenseTensor*` + - `SelectedRows*` + 
- `SparseCooTensor*` + - `SparseCsrTensor*` + - `std::vector` + - `std::vector` + - `std::vector` + +For example, the kernel function of `softmax` is declared in `softmax_kernel.h`: + +```c++ +// Softmax +// Template Parameter: T - data type +// Context - the device context +// Parameter: dev_ctx - object of the Context +// x - DenseTensor object +// axis - int type +// dtype - DataType type +// out - DenseTensor pointer +// Return: None +template <typename T, typename Context> +void SoftmaxKernel(const Context& dev_ctx, + const DenseTensor& x, + int axis, + DataType dtype, + DenseTensor* out); +``` + +> Note: +> 1. The kernel function declaration is the basis for registering a custom kernel and for the framework to invoke it. It is released by the framework and must be followed. +> 2. Kernel function declarations do not always map one-to-one to header files. You can find the declaration you need by searching for the name of the function. diff --git a/docs/dev_guides/custom_device_docs/custom_kernel_docs/register_api_en.md b/docs/dev_guides/custom_device_docs/custom_kernel_docs/register_api_en.md new file mode 100644 index 00000000000..1ab121aa42d --- /dev/null +++ b/docs/dev_guides/custom_device_docs/custom_kernel_docs/register_api_en.md @@ -0,0 +1,62 @@ +# Kernel Registration API + +The registration macro of PaddlePaddle registers a custom kernel so that it can be called by the PaddlePaddle framework. + +The registration macro must be placed in the global namespace. + +For the basic format of the registration macro, please refer to [kernel_registry.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/kernel_registry.h) + +```c++ +/** PD_REGISTER_PLUGIN_KERNEL + * + * Used to register kernels for plug-in backends. + * Support user-defined backend such as 'Ascend910'. 
+ */ +PD_REGISTER_PLUGIN_KERNEL(kernel_name, backend, layout, meta_kernel_fn, ...) {} +``` + +Explanation: + +- Name of the macro: `PD_REGISTER_PLUGIN_KERNEL` +- First parameter: kernel_name, which is the same inside and outside the framework. You can refer to the registration name of the same kernel function for CPU, such as `softmax`. +- Second parameter: backend, which can be customized. But its name must be the same as that of the custom runtime, such as `Ascend910`. +- Third parameter: layout, an enumeration value of `DataLayout`. For the setting, please refer to [layout.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/layout.h) +- Fourth parameter: meta_kernel_fn, the name of the kernel function without its template parameters, such as `my_namespace::SoftmaxKernel`. +- Variable-length data type parameters: basic C++ data types or types defined by PaddlePaddle, like `phi::dtype::float16`, `phi::dtype::bfloat16`, `phi::dtype::complex`. You can refer to [data_type.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/data_type.h) +- End: the function body. You can configure the kernel in it if necessary. If not, keep `{}`. + +>Explanation: the declaration corresponding to the trailing function body: +>```c++ +>// Kernel Parameter Definition +>// Parameter: kernel_key - KernelKey object +>// kernel - Kernel pointer +>// Return: None +>void __PD_KERNEL_args_def_FN_##kernel_name##_##backend##_##layout( +> const ::phi::KernelKey& kernel_key, ::phi::Kernel* kernel); +>``` +> You can use the parameters `kernel_key` and `kernel` in the function body to customize the kernel during its registration. 
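The way a trailing `{}` becomes the body of a generated function can be illustrated with a small, self-contained sketch. This is only a simplified stand-in and not Paddle's implementation: `MockKernelKey`, `MockKernel`, `MockRegistrar`, `Registry`, and `MOCK_REGISTER_PLUGIN_KERNEL` are hypothetical names invented for this illustration, and the real macro in `kernel_registry.h` does considerably more (for example, it handles the kernel function and the data type list).

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for phi::KernelKey and phi::Kernel (NOT the real types).
struct MockKernelKey {
  std::string backend;
  std::string layout;
};
struct MockKernel {
  std::string output_dtype = "same_as_input";
};

using ArgsDefFn = void (*)(const MockKernelKey&, MockKernel*);

// Toy registry: maps "kernel_name/backend" to the configured kernel.
std::map<std::string, MockKernel>& Registry() {
  static std::map<std::string, MockKernel> registry;
  return registry;
}

// Runs the user-supplied body once during static initialization, then stores the kernel.
struct MockRegistrar {
  MockRegistrar(const char* name, const char* backend, const char* layout,
                ArgsDefFn args_def_fn) {
    MockKernelKey key{backend, layout};
    MockKernel kernel;
    args_def_fn(key, &kernel);  // this call executes the trailing { ... } block
    Registry()[std::string(name) + "/" + backend] = kernel;
  }
};

// The macro declares the args-def function, registers it, and ends with the
// function header, so the braces written after the macro become its body.
#define MOCK_REGISTER_PLUGIN_KERNEL(kernel_name, backend, layout)            \
  static void mock_args_def_##kernel_name##_##backend##_##layout(            \
      const MockKernelKey& kernel_key, MockKernel* kernel);                  \
  static MockRegistrar mock_registrar_##kernel_name##_##backend##_##layout(  \
      #kernel_name, #backend, #layout,                                       \
      &mock_args_def_##kernel_name##_##backend##_##layout);                  \
  void mock_args_def_##kernel_name##_##backend##_##layout(                   \
      const MockKernelKey& kernel_key, MockKernel* kernel)

// Empty body: the kernel keeps its default configuration.
MOCK_REGISTER_PLUGIN_KERNEL(softmax, CustomCPU, ALL_LAYOUT) {}

// Non-empty body: customize the kernel through the `kernel` parameter.
MOCK_REGISTER_PLUGIN_KERNEL(argmax, CustomCPU, ALL_LAYOUT) {
  kernel->output_dtype = "int64";
}
```

In this sketch the registrar objects are constructed before `main` runs, so by then `Registry()` holds both entries; the `argmax` entry carries the customized `output_dtype`, while `softmax` keeps the default.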
+ +Take the registration of the CustomCPU backend kernel of `softmax` as an example: + +```c++ +// The registration of the CustomCPU backend kernel of `softmax` +// Global namespace +// Parameter: softmax - Kernel name +// CustomCPU - Backend name +// ALL_LAYOUT - Storage layout +// custom_cpu::SoftmaxKernel - Kernel function name +// float - name of the data type +// double - name of the data type +// phi::dtype::float16 - name of the data type +PD_REGISTER_PLUGIN_KERNEL(softmax, + CustomCPU, + ALL_LAYOUT, + custom_cpu::SoftmaxKernel, + float, + double, + phi::dtype::float16) {} +``` + +> Note: +> 1. When the backend is accessed through the custom runtime, the backend parameter must be the same as the name of that runtime. +> 2. Unless the function body at the end of the registration macro is actually needed, keep it empty. You can refer to other backends within the PaddlePaddle framework. diff --git a/docs/dev_guides/custom_device_docs/custom_kernel_docs/tensor_api_en.md b/docs/dev_guides/custom_device_docs/custom_kernel_docs/tensor_api_en.md new file mode 100644 index 00000000000..0cebe64df32 --- /dev/null +++ b/docs/dev_guides/custom_device_docs/custom_kernel_docs/tensor_api_en.md @@ -0,0 +1,190 @@ +# Tensor APIs + +PaddlePaddle provides several kinds of tensors, all derived from the base class `TensorBase`. This section takes the commonly used `DenseTensor` as an example. For `TensorBase` and the other tensors, please refer to the links at the end of this document. + +## DenseTensor + +All elements of a `DenseTensor` are stored in contiguous memory. For details, refer to [dense_tensor.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/dense_tensor.h).
+ +```c++ + // Construct the DenseTensor and allocate memory + // Parameter: a - Allocator pointer + // meta - DenseTensorMeta object + // Return: None + DenseTensor(Allocator* a, const DenseTensorMeta& meta); + + // Construct the DenseTensor and allocate memory + // Parameter: a - Allocator pointer + // meta - DenseTensorMeta rvalue object + // Return: None + DenseTensor(Allocator* a, DenseTensorMeta&& meta); + + // Construct the DenseTensor from an existing allocation + // Parameter: holder - shared pointer of Allocation + // meta - DenseTensorMeta object + // Return: None + DenseTensor(const std::shared_ptr<phi::Allocation>& holder, + const DenseTensorMeta& meta); + + // Move Constructor + // Parameter: other - DenseTensor rvalue object + // Return: None + DenseTensor(DenseTensor&& other) = default; + + // Copy Constructor + // Parameter: other - DenseTensor object + // Return: None + DenseTensor(const DenseTensor& other); + + // Assignment + // Parameter: other - DenseTensor object + // Return: DenseTensor object + DenseTensor& operator=(const DenseTensor& other); + + // Move Assignment + // Parameter: other - DenseTensor rvalue object + // Return: DenseTensor object + DenseTensor& operator=(DenseTensor&& other); + + // No-Parameter Constructor + DenseTensor(); + + // Destructor + virtual ~DenseTensor() = default; + + // Get the type name, static function + // Parameter: None + // Return: C-string pointer + static const char* name(); + + // Acquire the number of elements of the tensor + // Parameter: None + // Return: int64_t value + int64_t numel() const override; + + // Acquire the dims of the tensor + // Parameter: None + // Return: DDim object + const DDim& dims() const noexcept override; + + // Acquire the lod of the tensor + // Parameter: None + // Return: LoD object + const LoD& lod() const noexcept; + + // Acquire the data type of the tensor + // Parameter: None + // Return: DataType value + DataType dtype() const noexcept override; + + // Acquire
the memory layout of the tensor + // Parameter: None + // Return: DataLayout value + DataLayout layout() const noexcept override; + + // Acquire the place of the tensor + // Parameter: None + // Return: Place object + const Place& place() const override; + + // Acquire the meta of the tensor + // Parameter: None + // Return: DenseTensorMeta object + const DenseTensorMeta& meta() const noexcept; + + // Set the meta of the tensor + // Parameter: meta - DenseTensorMeta rvalue object + // Return: None + void set_meta(DenseTensorMeta&& meta); + + // Set the meta of the tensor + // Parameter: meta - DenseTensorMeta object + // Return: None + void set_meta(const DenseTensorMeta& meta); + + // Check whether the meta of the tensor is valid + // Parameter: None + // Return: bool value + bool valid() const noexcept override; + + // Check whether the tensor is initialized + // Parameter: None + // Return: bool value + bool initialized() const override; + + // Allocate memory for the tensor + // Parameter: allocator - Allocator pointer + // dtype - DataType value + // requested_size - size_t value + // Return: void* pointer + void* AllocateFrom(Allocator* allocator, + DataType dtype, + size_t requested_size = 0) override; + + // Check whether memory is shared with another tensor + // Parameter: b - DenseTensor object + // Return: bool value + bool IsSharedWith(const DenseTensor& b) const; + + // Modify the dims of the tensor and allocate memory + // Parameter: dims - DDim object + // Return: None + void ResizeAndAllocate(const DDim& dims); + + // Modify the dims of the tensor + // Parameter: dims - DDim object + // Return: DenseTensor object + DenseTensor& Resize(const DDim& dims); + + // Reset the LoD of the tensor + // Parameter: lod - LoD object + // Return: None + void ResetLoD(const LoD& lod); + + // Acquire the memory size of the tensor + // Parameter: None + // Return: size_t value + size_t
capacity() const; + + // Acquire the read-only data pointer of the tensor + // Template parameter: T - data type + // Parameter: None + // Return: const T data pointer + template <typename T> + const T* data() const; + + // Acquire the read-only data pointer of the tensor + // Parameter: None + // Return: const void data pointer + const void* data() const; + + // Acquire the mutable data pointer of the tensor + // Template parameter: T - data type + // Parameter: None + // Return: mutable T data pointer + template <typename T> + T* data(); + + // Acquire the mutable data pointer of the tensor + // Parameter: None + // Return: mutable void data pointer + void* data(); +``` + +## Other Tensors + +- `TensorBase`: Please refer to [tensor_base.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/tensor_base.h) +- `SelectedRows`: Please refer to [selected_rows.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/selected_rows.h) +- `SparseCooTensor`: Please refer to [sparse_coo_tensor.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/sparse_coo_tensor.h) +- `SparseCsrTensor`: Please refer to [sparse_csr_tensor.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/sparse_csr_tensor.h) + + +## Relevant Information + +- `Allocation` and `Allocator`: Please refer to [allocator.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/allocator.h) +- `DenseTensorMeta`: Please refer to [tensor_meta.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/tensor_meta.h) +- `DDim`: Please refer to [ddim.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/ddim.h) +- `LoD`: Please refer to [lod_utils.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/core/lod_utils.h) +- `DataType`: Please refer to [data_type.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/data_type.h) +- `DataLayout`: Please refer to
[layout.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/layout.h) +- `Place`: Please refer to [place.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/common/place.h) diff --git a/docs/dev_guides/custom_device_docs/custom_kernel_en.rst b/docs/dev_guides/custom_device_docs/custom_kernel_en.rst new file mode 100644 index 00000000000..f1dabbb80d6 --- /dev/null +++ b/docs/dev_guides/custom_device_docs/custom_kernel_en.rst @@ -0,0 +1,19 @@ +#################### +Custom Kernel +#################### + +A custom kernel is the implementation of the kernel function (kernel for short) that corresponds to an operator. The PaddlePaddle framework supports custom kernels for external devices registered through the custom runtime, so that kernels can be compiled, registered, and automatically loaded independently of the framework. +The implementation of a custom kernel is based on the public kernel declarations, the public C++ API, and the registration macro of PaddlePaddle. + + +- `Kernel function declaration <./custom_kernel_docs/kernel_declare_cn.html>`_ : to introduce the kernel declarations of PaddlePaddle +- `Kernel implementation API <./custom_kernel_docs/cpp_api_cn.html>`_ : to introduce the C++ API required in the implementation of the custom kernel. +- `Kernel registration API <./custom_kernel_docs/register_api_cn.html>`_ : to introduce the registration macro of the custom kernel. + + +.. toctree:: + :hidden: + + custom_kernel_docs/kernel_declare_cn.md + custom_kernel_docs/cpp_api_cn.rst + custom_kernel_docs/register_api_cn.md diff --git a/docs/dev_guides/custom_device_docs/custom_runtime_en.rst b/docs/dev_guides/custom_device_docs/custom_runtime_en.rst new file mode 100644 index 00000000000..9f4779c7e4c --- /dev/null +++ b/docs/dev_guides/custom_device_docs/custom_runtime_en.rst @@ -0,0 +1,144 @@ +############## +Custom Runtime +############## + +Custom Runtime offers a new method to register the runtime of new devices via plug-ins.
DeviceManager manages PaddlePaddle devices and the Runtime/Driver API. It provides a uniform API for the framework to invoke device capabilities, offers a series of APIs to register a Custom Runtime, and ensures binary compatibility through a C API. The APIs can be found in `device_ext.h `_ . Developers can add a custom runtime to PaddlePaddle simply by implementing these APIs. + +- `Data type <./runtime_data_type_cn.html>`_ : to introduce definitions of data types of custom runtime. +- `Device API <./device_api_cn.html>`_ : to introduce definitions and functions of Device APIs. +- `Memory API <./memory_api_cn.html>`_ : to introduce definitions and functions of Memory APIs. +- `Stream API <./stream_api_cn.html>`_ : to introduce definitions and functions of Stream APIs. +- `Event API <./event_api_cn.html>`_ : to introduce definitions and functions of Event APIs. + + +Device APIs +############ + ++------------------------+----------------------------------------+ +| API | Function | ++========================+========================================+ +| initialize | To initialize the device backend | ++------------------------+----------------------------------------+ +| finalize | To de-initialize the device backend | ++------------------------+----------------------------------------+ +| init_device | To initialize the designated device | ++------------------------+----------------------------------------+ +| deinit_device | To de-initialize the designated device | ++------------------------+----------------------------------------+ +| set_device | To set the current device | ++------------------------+----------------------------------------+ +| get_device | To get the current device | ++------------------------+----------------------------------------+ +| synchronize_device | To synchronize the designated device | ++------------------------+----------------------------------------+ +| get_device_count | To count available devices |
++------------------------+----------------------------------------+ +| get_device_list | To get the list of available devices | ++------------------------+----------------------------------------+ +| get_compute_capability | To get computing capability of devices | ++------------------------+----------------------------------------+ +| get_runtime_version | To get the runtime version | ++------------------------+----------------------------------------+ +| get_driver_version | To get the driver version | ++------------------------+----------------------------------------+ + + +Memory APIs +############ + ++---------------------------+-------------------------------------------------------------------+ +| API | Function | ++===========================+===================================================================+ +| device_memory_allocate | To allocate the device memory | ++---------------------------+-------------------------------------------------------------------+ +| device_memory_deallocate | To deallocate the device memory | ++---------------------------+-------------------------------------------------------------------+ +| host_memory_allocate | To allocate pinned host memory | ++---------------------------+-------------------------------------------------------------------+ +| host_memory_deallocate | To deallocate pinned host memory | ++---------------------------+-------------------------------------------------------------------+ +| unified_memory_allocate | To allocate unified memory | ++---------------------------+-------------------------------------------------------------------+ +| unified_memory_deallocate | To deallocate unified memory | ++---------------------------+-------------------------------------------------------------------+ +| memory_copy_h2d | To copy memory synchronously from host to device | ++---------------------------+-------------------------------------------------------------------+ +| memory_copy_d2h | To copy
memory synchronously from device to host | ++---------------------------+-------------------------------------------------------------------+ +| memory_copy_d2d | To copy memory synchronously within the device | ++---------------------------+-------------------------------------------------------------------+ +| memory_copy_p2p | To copy memory synchronously between devices | ++---------------------------+-------------------------------------------------------------------+ +| async_memory_copy_h2d | To copy memory asynchronously from host to device | ++---------------------------+-------------------------------------------------------------------+ +| async_memory_copy_d2h | To copy memory asynchronously from device to host | ++---------------------------+-------------------------------------------------------------------+ +| async_memory_copy_d2d | To copy memory asynchronously within the device | ++---------------------------+-------------------------------------------------------------------+ +| async_memory_copy_p2p | To copy memory asynchronously between devices | ++---------------------------+-------------------------------------------------------------------+ +| device_memory_set | To fill the device memory | ++---------------------------+-------------------------------------------------------------------+ +| device_memory_stats | To measure device memory utilization | ++---------------------------+-------------------------------------------------------------------+ +| device_min_chunk_size | To check the minimum size of device memory chunks | ++---------------------------+-------------------------------------------------------------------+ +| device_max_chunk_size | To check the maximum size of device memory chunks | ++---------------------------+-------------------------------------------------------------------+ +| device_max_alloc_size | To check the maximum size of allocatable device memory |
++---------------------------+-------------------------------------------------------------------+ +| device_extra_padding_size | To check the extra padding size of device memory | ++---------------------------+-------------------------------------------------------------------+ +| device_init_alloc_size | To check the size of allocated device memory after initialization | ++---------------------------+-------------------------------------------------------------------+ +| device_realloc_size | To check the size of reallocated device memory | ++---------------------------+-------------------------------------------------------------------+ + + +Stream APIs +############ + ++---------------------+-------------------------------------------------------------------+ +| API | Function | ++=====================+===================================================================+ +| create_stream | To create a stream object | ++---------------------+-------------------------------------------------------------------+ +| destroy_stream | To destroy a stream object | ++---------------------+-------------------------------------------------------------------+ +| query_stream | To query whether all the tasks on the stream are done | ++---------------------+-------------------------------------------------------------------+ +| synchronize_stream | To synchronize the stream and wait for the completion of all tasks| ++---------------------+-------------------------------------------------------------------+ +| stream_add_callback | To add a host callback to the stream | ++---------------------+-------------------------------------------------------------------+ +| stream_wait_event | To wait for the completion of an event on the stream | ++---------------------+-------------------------------------------------------------------+ + + +Event APIs +############ + ++-------------------+---------------------------------------------------------+ +| API | Function |
++===================+=========================================================+ +| create_event | To create an event | ++-------------------+---------------------------------------------------------+ +| destroy_event | To destroy an event | ++-------------------+---------------------------------------------------------+ +| record_event | To record an event on the stream | ++-------------------+---------------------------------------------------------+ +| query_event | To query whether the event is done | ++-------------------+---------------------------------------------------------+ +| synchronize_event | To synchronize the event and wait for its completion | ++-------------------+---------------------------------------------------------+ + + +.. toctree:: + :hidden: + + runtime_data_type_cn.md + device_api_cn.md + memory_api_cn.md + stream_api_cn.md + event_api_cn.md + diff --git a/docs/dev_guides/custom_device_docs/device_api_en.md b/docs/dev_guides/custom_device_docs/device_api_en.md new file mode 100644 index 00000000000..8ee2f32416a --- /dev/null +++ b/docs/dev_guides/custom_device_docs/device_api_en.md @@ -0,0 +1,185 @@ +# Device APIs + +## initialize 【optional】 + +### Definition + +```c++ +C_Status (*initialize)() +``` + +### Description + +It initializes the device backend, such as the runtime or the driver. It is the first API to be invoked during device registration. If the API is not implemented, it will not be invoked. + +## finalize 【optional】 + +### Definition + +```c++ +C_Status (*finalize)() +``` + +### Description + +It deinitializes the device backend, for example when the runtime or the driver exits. It is the last API to be invoked during exit. If it is not implemented, it will not be invoked.
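As an illustration of the "optional" behaviour described for `initialize` and `finalize`, the following is a minimal sketch and not the framework's actual code. `C_Status` here is a simplified stand-in for the type in `device_ext.h`, and `BackendHooks`, `MyInitialize`, `MyFinalize`, and `CallIfImplemented` are hypothetical names introduced only for this example.

```cpp
// Simplified stand-in for the status type declared in device_ext.h
// (assumption: only two values are needed for this sketch).
enum C_Status { C_SUCCESS = 0, C_FAILED };

// Hypothetical table holding the two optional backend-level hooks.
struct BackendHooks {
  C_Status (*initialize)();
  C_Status (*finalize)();
};

// Hypothetical plug-in state: counts how many times the backend is live.
static int g_live_backends = 0;

C_Status MyInitialize() {
  ++g_live_backends;
  return C_SUCCESS;
}

C_Status MyFinalize() {
  --g_live_backends;
  return C_SUCCESS;
}

// Mimics the documented rule: a hook that is not implemented (null)
// is simply not invoked. Treating that case as success is an assumption.
C_Status CallIfImplemented(C_Status (*hook)()) {
  return hook != nullptr ? hook() : C_SUCCESS;
}
```

A plug-in that only needs start-up work would fill in `initialize` and leave `finalize` as `nullptr`; the caller can then go through `CallIfImplemented` for both without special cases.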
+ +## init_device 【optional】 + +### Definition + +```c++ +C_Status (*init_device)(const C_Device device) +``` + +### Description + +It initializes the designated device. During plug-in registration, it is invoked for every available device. If not implemented, the API will not be invoked. It is invoked only after initialize. + +### Parameter + +device - the device to be initialized + +## deinit_device 【optional】 + +### Definition + +```c++ +C_Status (*deinit_device)(const C_Device device) +``` + +### Description + +It finalizes the designated device and releases its resources. During exit, it is invoked for all devices. If not implemented, it will not be invoked. It is invoked before finalize. + +### Parameter + +device - the device to be finalized + +## set_device 【required】 + +### Definition + +```c++ +C_Status (*set_device)(const C_Device device) +``` + +### Description + +It sets the current device, on which subsequent tasks are executed. + +### Parameter + +device - the device to be set + +## get_device 【required】 + +### Definition + +```c++ +C_Status (*get_device)(const C_Device device) +``` + +### Description + +It acquires the current device. + +### Parameter + +device - stores the current device + +## synchronize_device 【required】 + +### Definition + +```c++ +C_Status (*synchronize_device)(const C_Device device) +``` + +### Description + +It synchronizes the device and waits for the completion of tasks on the device. + +### Parameter + +device - the device to be synchronized + +## get_device_count 【required】 + +### Definition + +```c++ +C_Status (*get_device_count)(size_t* count) +``` + +### Description + +It counts available devices. + +### Parameter + +count - stores the number of available devices + +## get_device_list 【required】 + +### Definition + +```c++ +C_Status (*get_device_list)(size_t* devices) +``` + +### Description + +It acquires the list of IDs of all currently available devices.
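The pair `get_device_count` / `get_device_list` can be pictured with the following minimal sketch. It assumes a hypothetical backend that always exposes two devices numbered 0 and 1, and it assumes the caller passes a buffer large enough for the reported count. `C_Status` is a simplified stand-in for the type in `device_ext.h`; `kDeviceCount`, `GetDeviceCount`, and `GetDeviceList` are illustrative names.

```cpp
#include <cstddef>

// Simplified stand-in for the status type declared in device_ext.h.
enum C_Status { C_SUCCESS = 0, C_FAILED };

// Hypothetical backend with a fixed number of devices.
constexpr std::size_t kDeviceCount = 2;

// Stores the number of available devices in *count.
C_Status GetDeviceCount(std::size_t* count) {
  if (count == nullptr) return C_FAILED;
  *count = kDeviceCount;
  return C_SUCCESS;
}

// Stores the device IDs 0..kDeviceCount-1 in the caller's buffer.
// Assumption: the buffer has room for at least kDeviceCount entries.
C_Status GetDeviceList(std::size_t* devices) {
  if (devices == nullptr) return C_FAILED;
  for (std::size_t i = 0; i < kDeviceCount; ++i) {
    devices[i] = i;
  }
  return C_SUCCESS;
}
```

A caller would typically query the count first, size its buffer accordingly, and only then request the list.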
+ +### Parameter + +devices - stores the IDs of the available devices + +## get_compute_capability 【required】 + +### Definition + +```c++ +C_Status (*get_compute_capability)(size_t* compute_capability) +``` + +### Description + +It gets the computing capability of the device. + +### Parameter + +compute_capability - stores the computing capability of the device + +## get_runtime_version 【required】 + +### Definition + +```c++ +C_Status (*get_runtime_version)(size_t* version) +``` + +### Description + +It acquires the runtime version. + +### Parameter + +version - stores the runtime version + +## get_driver_version 【required】 + +### Definition + +```c++ +C_Status (*get_driver_version)(size_t* version) +``` + +### Description + +It gets the driver version. + +### Parameter + +version - stores the driver version diff --git a/docs/dev_guides/custom_device_docs/event_api_en.md b/docs/dev_guides/custom_device_docs/event_api_en.md new file mode 100644 index 00000000000..887bedb04b2 --- /dev/null +++ b/docs/dev_guides/custom_device_docs/event_api_en.md @@ -0,0 +1,93 @@ +# Event APIs + +## create_event 【required】 + +### Definition + +```c++ +C_Status (*create_event)(const C_Device device, C_Event* event) +``` + +### Description + +It creates an event, which is used to synchronize tasks of different streams within the framework. When the device does not support asynchronous execution, an empty implementation of the API is required. + +### Parameter + +device - the device to be used + +event - stores the created event + +## destroy_event 【required】 + +### Definition + +```c++ +C_Status (*destroy_event)(const C_Device device, C_Event event) +``` + +### Description + +It destroys an event. When the device does not support asynchronous execution, an empty implementation of the API is required.
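For a device without asynchronous execution, the "empty implementation" mentioned above could look roughly like this sketch. The type definitions are simplified stand-ins for those in `device_ext.h` (in particular, `C_Event` is modelled as a plain `void*` here, which is an assumption), and the function names are illustrative.

```cpp
// Simplified stand-ins for the types declared in device_ext.h.
enum C_Status { C_SUCCESS = 0, C_FAILED };
struct C_Device_st { int id; };
using C_Device = C_Device_st*;
using C_Event = void*;  // assumption: opaque handle modelled as void*

// Empty create_event: no real event object exists, so hand back a null
// handle and report success.
C_Status CreateEvent(const C_Device device, C_Event* event) {
  (void)device;  // unused on a purely synchronous device
  if (event != nullptr) *event = nullptr;
  return C_SUCCESS;
}

// Empty destroy_event: nothing was allocated, so nothing is released.
C_Status DestroyEvent(const C_Device device, C_Event event) {
  (void)device;
  (void)event;
  return C_SUCCESS;
}
```

The point of the sketch is only that required callbacks must exist and return a success status even when they have nothing to do.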
+ +### Parameter + +device - the device to be used + +event - the event to be destroyed + +## record_event 【required】 + +### Definition + +```c++ +C_Status (*record_event)(const C_Device device, C_Stream stream, C_Event event) +``` + +### Description + +It records the event on the stream. When the device does not support asynchronous execution, an empty implementation of the API is required. + +### Parameter + +device - the device to be used + +stream - the stream where the event is recorded + +event - the recorded event + +## query_event 【optional】 + +### Definition + +```c++ +C_Status (*query_event)(const C_Device device, C_Event event) +``` + +### Description + +It queries whether the event is complete. If not implemented, PaddlePaddle will use synchronize_event instead. + +### Parameter + +device - the device to be used + +event - the event to be queried + +## synchronize_event 【required】 + +### Definition + +```c++ +C_Status (*synchronize_event)(const C_Device device, C_Event event) +``` + +### Description + +It synchronizes the event and waits for its completion. When the device does not support asynchronous execution, an empty implementation of the API is required. + +### Parameter + +device - the device to be used + +event - the event to be synchronized diff --git a/docs/dev_guides/custom_device_docs/index_en.rst b/docs/dev_guides/custom_device_docs/index_en.rst new file mode 100644 index 00000000000..30a1bc77e03 --- /dev/null +++ b/docs/dev_guides/custom_device_docs/index_en.rst @@ -0,0 +1,19 @@ +########################## +Custom Device Access Guide +########################## + +Custom device access decouples the framework from the device and makes it possible to extend the device backends of PaddlePaddle via plug-ins. In this way, developers can build a plug-in for PaddlePaddle simply by implementing the standard APIs and compiling them into a dynamic-link library, without modifying the code of PaddlePaddle.
This makes it easier to develop hardware backends for PaddlePaddle. + +Custom device access consists of the custom runtime and the custom kernel. With the two modules, users can connect new custom devices to PaddlePaddle according to their own needs. + +- `Custom Runtime <./custom_runtime_en.html>`_ : Introduction of custom runtime of the PaddlePaddle framework +- `Custom Kernel <./custom_kernel_en.html>`_ : Introduction of custom kernel of the PaddlePaddle framework +- `Example of Device Access <./custom_kernel_en.html>`_ : To demonstrate how to connect new custom devices to PaddlePaddle + +.. toctree:: + :hidden: + + + custom_runtime_en.rst + custom_kernel_en.rst + custom_device_example_en.md diff --git a/docs/dev_guides/custom_device_docs/memory_api_en.md b/docs/dev_guides/custom_device_docs/memory_api_en.md new file mode 100644 index 00000000000..876d0cba3aa --- /dev/null +++ b/docs/dev_guides/custom_device_docs/memory_api_en.md @@ -0,0 +1,459 @@ +# Memory APIs + +## device_memory_allocate 【required】 + +### Definition + +```c++ +C_Status (*device_memory_allocate)(const C_Device device, void** ptr, size_t size) +``` + +### Description + +It allocates the device memory. + +### Parameter + +device - the device to be used + +ptr - the address of the allocated device memory + +size - the size of the device memory to be allocated (in bytes) + +## device_memory_deallocate 【required】 + +### Definition + +```c++ +C_Status (*device_memory_deallocate)(const C_Device device, void* ptr, size_t size) +``` + +### Description + +It deallocates the device memory. + +### Parameter + +device - the device to be used + +ptr - the address of the device memory to be deallocated + +size - the size of the device memory to be deallocated (in bytes) + +## host_memory_allocate 【optional】 + +### Definition + +```c++ +C_Status (*host_memory_allocate)(const C_Device device, void** ptr, size_t size) +``` + +### Description + +It allocates pinned host memory.
+ +### Parameter + +device - the device to be used + +ptr - the address of the allocated host memory + +size - the size of memory to be allocated (in bytes) + +## host_memory_deallocate 【optional】 + +### Definition + +```c++ +C_Status (*host_memory_deallocate)(const C_Device device, void* ptr, size_t size) +``` + +### Description + +It deallocates the pinned host memory. + +### Parameter + +device - the device to be used + +ptr - the address of the host memory to be deallocated + +size - the size of memory to be deallocated (in bytes) + +## unified_memory_allocate 【optional】 + +### Definition + +```c++ +C_Status (*unified_memory_allocate)(const C_Device device, void** ptr, size_t size) +``` + +### Description + +It allocates unified memory. + +### Parameter + +device - the device to be used + +ptr - the address of the allocated unified memory + +size - the size of memory to be allocated (in bytes) + +## unified_memory_deallocate 【optional】 + +### Definition + +```c++ +C_Status (*unified_memory_deallocate)(const C_Device device, void** ptr, size_t size) +``` + +### Description + +It deallocates unified memory. + +### Parameter + +device - the device to be used + +ptr - the address of the unified memory to be deallocated + +size - the size of memory to be deallocated (in bytes) + +## memory_copy_h2d 【required】 + +### Definition + +```c++ +C_Status (*memory_copy_h2d)(const C_Device device, void* dst, const void* src, size_t size) +``` + +### Description + +It synchronously copies memory from the host to the device. + +### Parameter + +device - the device to be used + +dst - the address of the destination device memory + +src - the address of the source host memory + +size - the size of memory to be copied (in bytes) + +## memory_copy_d2h 【required】 + +### Definition + +```c++ +C_Status (*memory_copy_d2h)(const C_Device device, void* dst, const void* src, size_t size) +``` + +### Description + +It synchronously copies memory from the device to the host.
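A minimal sketch of a synchronous device-to-host copy, assuming a CPU-like backend whose "device memory" is ordinary host memory, so that `std::memcpy` is sufficient. The types are simplified stand-ins for those in `device_ext.h`, and `MemCpyD2H` is an illustrative name; a real accelerator would call its vendor copy routine instead.

```cpp
#include <cstddef>
#include <cstring>

// Simplified stand-ins for the types declared in device_ext.h.
enum C_Status { C_SUCCESS = 0, C_FAILED };
struct C_Device_st { int id; };
using C_Device = C_Device_st*;

// Synchronous copy: returns only after `size` bytes have been copied.
// Assumption: device memory is plain host memory in this sketch.
C_Status MemCpyD2H(const C_Device device,
                   void* dst,
                   const void* src,
                   std::size_t size) {
  (void)device;  // a single address space needs no device switch
  if (size == 0) return C_SUCCESS;
  if (dst == nullptr || src == nullptr) return C_FAILED;
  std::memcpy(dst, src, size);
  return C_SUCCESS;
}
```

Under the same assumption, `memory_copy_h2d` and `memory_copy_d2d` would have the same body, since all three copies stay within one address space.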
+ +### Parameter + +device - the device to be used + +dst - the address of the destination host memory + +src - the address of the source device memory + +size - the size of memory to be copied (in bytes) + +## memory_copy_d2d 【required】 + +### Definition + +```c++ +C_Status (*memory_copy_d2d)(const C_Device device, void* dst, const void* src, size_t size) +``` + +### Description + +It synchronously copies memory within the device. + +### Parameter + +device - the device to be used + +dst - the address of the destination device memory + +src - the address of the source device memory + +size - the size of memory to be copied (in bytes) + +## memory_copy_p2p 【optional】 + +### Definition + +```c++ +C_Status (*memory_copy_p2p)(const C_Device dst_device, const C_Device src_device, void* dst, const void* src, size_t size) +``` + +### Description + +It synchronously copies memory between devices. + +### Parameter + +dst_device - the destination device + +src_device - the source device + +dst - the address of the destination device memory + +src - the address of the source device memory + +size - the size of memory to be copied (in bytes) + +## async_memory_copy_h2d 【optional】 + +### Definition + +```c++ +C_Status (*async_memory_copy_h2d)(const C_Device device, C_Stream stream, void* dst, const void* src, size_t size) +``` + +### Description + +It asynchronously copies memory from the host to the device. If it is not implemented, PaddlePaddle will replace it with a synchronous API. + +### Parameter + +device - the device to be used + +stream - the stream on which the copy is executed
+ +dst - the address of the destination device memory + +src - the address of the source host memory + +size - the size of memory to be copied (in bytes) + +## async_memory_copy_d2h 【optional】 + +### Definition + +```c++ +C_Status (*async_memory_copy_d2h)(const C_Device device, C_Stream stream, void* dst, const void* src, size_t size) +``` + +### Description + +It asynchronously copies memory from the device to the host. If it is not implemented, PaddlePaddle will replace it with a synchronous API. + +### Parameter + +device - the device to be used + +stream - the stream on which the copy is executed + +dst - the address of the destination host memory + +src - the address of the source device memory + +size - the size of memory to be copied (in bytes) + +## async_memory_copy_d2d 【optional】 + +### Definition + +```c++ +C_Status (*async_memory_copy_d2d)(const C_Device device, C_Stream stream, void* dst, const void* src, size_t size) +``` + +### Description + +It asynchronously copies memory within the device. If it is not implemented, PaddlePaddle will replace it with a synchronous API. + +### Parameter + +device - the device to be used + +stream - the stream to be used + +dst - the address of the destination device memory + +src - the address of the source device memory + +size - the size of memory to be copied (in bytes) + +## async_memory_copy_p2p 【optional】 + +### Definition + +```c++ +C_Status (*async_memory_copy_p2p)(const C_Device dst_device, const C_Device src_device, C_Stream stream, void* dst, const void* src, size_t size) +``` + +### Description + +It asynchronously copies memory between devices. If it is not implemented, PaddlePaddle will replace it with a synchronous API.
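The fallback described for the optional asynchronous copy APIs can be pictured with the sketch below. It is not PaddlePaddle's dispatch code: `CopyMaybeAsync`, `SyncCopy`, and the two function-pointer aliases are hypothetical, the types are simplified stand-ins for those in `device_ext.h`, and for brevity the single-device signature (as in the h2d, d2h, and d2d variants) is used rather than the two-device p2p one.

```cpp
#include <cstddef>
#include <cstring>

// Simplified stand-ins for the types declared in device_ext.h.
enum C_Status { C_SUCCESS = 0, C_FAILED };
struct C_Device_st { int id; };
using C_Device = C_Device_st*;
using C_Stream = void*;  // assumption: opaque handle modelled as void*

using SyncCopyFn = C_Status (*)(const C_Device, void*, const void*,
                                std::size_t);
using AsyncCopyFn = C_Status (*)(const C_Device, C_Stream, void*,
                                 const void*, std::size_t);

// Hypothetical synchronous copy for a CPU-like device.
C_Status SyncCopy(const C_Device device, void* dst, const void* src,
                  std::size_t size) {
  (void)device;
  if (size > 0) std::memcpy(dst, src, size);
  return C_SUCCESS;
}

// Uses the asynchronous callback when the plug-in provides one and
// falls back to the synchronous callback otherwise.
C_Status CopyMaybeAsync(AsyncCopyFn async_fn, SyncCopyFn sync_fn,
                        const C_Device device, C_Stream stream,
                        void* dst, const void* src, std::size_t size) {
  if (async_fn != nullptr) {
    return async_fn(device, stream, dst, src, size);
  }
  return sync_fn(device, dst, src, size);
}
```

With this shape, a plug-in that leaves the asynchronous entry unset still gets correct, if slower, behaviour, which is why those entries can be optional while the synchronous ones are required.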
+ +### Parameter + +dst_device - the destination device + +src_device - the source device + +stream - the stream to be used + +dst - the address of the destination device memory + +src - the address of the source device memory + +size - the size of the memory to be copied (in bytes) + +## device_memory_set 【optional】 + +### Definition + +```c++ +C_Status (*device_memory_set)(const C_Device device, void* ptr, unsigned char value, size_t size) +``` + +### Description + +It fills a region of device memory with the given value. If it is not implemented, PaddlePaddle will use memory_copy_h2d instead. + +### Parameter + +device - the device to be used + +ptr - the start address of the memory to be filled + +value - the value to fill with + +size - the size of the memory to be filled (in bytes) + +## device_memory_stats 【required】 + +### Definition + +```c++ +C_Status (*device_memory_stats)(const C_Device device, size_t* total_memory, size_t* free_memory) +``` + +### Description + +It reports the memory usage of the device. + +### Parameter + +device - the device to be used + +total_memory - total memory (in bytes) + +free_memory - free memory (in bytes) + +## device_min_chunk_size 【required】 + +### Definition + +```c++ +C_Status (*device_min_chunk_size)(C_Device device, size_t* size) +``` + +### Description + +It queries the minimum size of device memory chunks (in bytes). To avoid calling the device API frequently to allocate and free memory, PaddlePaddle manages device memory itself. A memory request is first served from the managed memory. For a request of "size" bytes, the allocated size is size + extra_padding_size, aligned to min_chunk_size, the minimum size of memory chunks. 
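The sizing rule in the description above can be sketched in a few lines. This is only an illustration: `AlignUp` and `AllocatedSize` are hypothetical helper names, not part of device_ext.h, and the sketch assumes that "aligned" means rounding up to the next multiple of min_chunk_size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a PaddlePaddle API): rounds `value` up to the
// next multiple of `alignment`.
inline std::size_t AlignUp(std::size_t value, std::size_t alignment) {
  assert(alignment > 0);
  return (value + alignment - 1) / alignment * alignment;
}

// Size reserved for a request of `size` bytes under the assumed rule:
// add the extra padding first, then align to the minimum chunk size.
inline std::size_t AllocatedSize(std::size_t size,
                                 std::size_t extra_padding_size,
                                 std::size_t min_chunk_size) {
  return AlignUp(size + extra_padding_size, min_chunk_size);
}
```

Under these assumptions, a 1000-byte request with 32 bytes of extra padding and a 256-byte minimum chunk would reserve 1280 bytes.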
+ +### Parameter + +device - the device to be used + +size - the size of the minimum chunk (in bytes) + +## device_max_chunk_size 【optional】 + +### Definition + +```c++ +C_Status (*device_max_chunk_size)(C_Device device, size_t* size) +``` + +### Description + +It queries the maximum size of device memory chunks (in bytes). A request no larger than this size is served from the memory managed by PaddlePaddle. Otherwise, the device API will be invoked for allocation. If this API is not implemented, the maximum chunk size defaults to device_max_alloc_size, the maximum size of allocatable device memory. + +### Parameter + +device - the device to be used + +size - the size of the maximum chunk (in bytes) + +## device_max_alloc_size 【optional】 + +### Definition + +```c++ +C_Status (*device_max_alloc_size)(C_Device device, size_t* size) +``` + +### Description + +It queries the maximum size (in bytes) of allocatable device memory. If it is not implemented, the size will be equal to the currently available memory. + +### Parameter + +device - the device to be used + +size - the maximum size of allocatable memory (in bytes) + +## device_extra_padding_size 【optional】 + +### Definition + +```c++ +C_Status (*device_extra_padding_size)(C_Device device, size_t* size) +``` + +### Description + +It queries the extra padding size of device memory. If it is not implemented, the size will be set to 0 by default. To avoid calling the device API frequently to allocate and free memory, PaddlePaddle manages device memory itself. A memory request is first served from the managed memory. For a request of "size" bytes, the allocated size is size + extra_padding_size, aligned to min_chunk_size, the minimum size of memory chunks. 
+ +### Parameter + +device - the device to be used + +size - the extra padding size (in bytes) + +## device_init_alloc_size 【optional】 + +### Definition + +```c++ +C_Status (*device_init_alloc_size)(const C_Device device, size_t* size) +``` + +### Description + +It queries the size of the device memory (in bytes) allocated after initialization. If it is not implemented, the size will be equal to device_max_alloc_size, the maximum size of allocatable device memory. + +### Parameter + +device - the device to be used + +size - the size of the first allocated memory (in bytes) + +## device_realloc_size 【optional】 + +### Definition + +```c++ +C_Status (*device_realloc_size)(const C_Device device, size_t* size) +``` + +### Description + +It queries the size of reallocated device memory (in bytes). If it is not implemented, the size will be equal to device_max_alloc_size, the maximum size of allocatable device memory. + +### Parameter + +device - the device to be used + +size - the size of reallocated memory (in bytes) diff --git a/docs/dev_guides/custom_device_docs/runtime_data_type_en.md b/docs/dev_guides/custom_device_docs/runtime_data_type_en.md new file mode 100644 index 00000000000..d9c1e708385 --- /dev/null +++ b/docs/dev_guides/custom_device_docs/runtime_data_type_en.md @@ -0,0 +1,131 @@ +# Data Type + +## C_Status + +### Definition + +```c++ +typedef enum { + C_SUCCESS = 0, + C_WARNING, + C_FAILED, + C_ERROR, + C_INTERNAL_ERROR +} C_Status; +``` + +### Description + +C_SUCCESS - Returned when the function executes successfully. + +C_WARNING - Returned when the function does not behave as expected. For example, the asynchronous API is actually synchronous. + +C_FAILED - Resources run out or the request fails. + +C_ERROR - Parameter error, incorrect usage, or not initialized. 
+ +C_INTERNAL_ERROR - Plug-in internal error + +## C_Device + +### Definition + +```c++ +typedef struct C_Device_st { int id; } * C_Device; +``` + +### Description + +It describes a device. + +## C_Stream + +### Definition + +```c++ +typedef struct C_Stream_st* C_Stream; +``` + +### Description + +It describes a stream, which is used to execute asynchronous tasks within the framework. In the stream, tasks are executed in order. + +## C_Event + +### Definition + +```c++ +typedef struct C_Event_st* C_Event; +``` + +### Description + +It describes an event, which is used to synchronize tasks from different streams within the framework. + +## C_Callback + +### Definition + +```c++ +typedef void (*C_Callback)(C_Device device, + C_Stream stream, + void* user_data, + C_Status* status); +``` + +### Description + +It is the callback function offered by the host and has four parameters: device, stream, user data, and returned value. + +## CustomRuntimeParams + +### Definition + +```c++ +struct CustomRuntimeParams { + size_t size; + C_DeviceInterface* interface; + CustomRuntimeVersion version; + char* device_type; + char* sub_device_type; + char reserved[32]; +}; +``` + +### Description + +They are function parameters of InitPlugin. + +size - the size of CustomRuntimeParams. The size of the framework and the plug-in may be different. You need to first check the size of the plug-in and ensure that memory access does not cross the boundary. It is feasible to use the macro of PADDLE_CUSTOM_RUNTIME_CHECK_VERSION in the check. + +interface - the device callback interface. It is necessary for the plug-in to implement essential APIs and fill the parameter in to finish registration. + +version - the custom runtime version defined in the device_ext.h, which is used to check the version compatibility by the framework. 
+ +device_type - the name of the device type, used by the framework to distinguish devices and exposed to the user layer to specify the hardware backend, such as "CustomCPU". + +sub_device_type - the name of the sub-device type, used to interpret the plug-in version, such as "V1.0". + +## CustomRuntimeVersion + +### Definition + +```c++ +struct CustomRuntimeVersion { + size_t major, minor, patch; +}; +``` + +### Description + +It is the custom runtime version used by the plug-in. The framework uses it to check version compatibility, and it can be filled in by the PADDLE_CUSTOM_RUNTIME_CHECK_VERSION macro. + +## C_DeviceInterface + +### Definition + +For detailed definitions of the types of C_DeviceInterface, please refer to [device_ext.h](https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/phi/backends/device_ext.h). + +### Description + +It collects the custom runtime callback APIs. diff --git a/docs/dev_guides/custom_device_docs/stream_api_en.md b/docs/dev_guides/custom_device_docs/stream_api_en.md new file mode 100644 index 00000000000..d0be594acdf --- /dev/null +++ b/docs/dev_guides/custom_device_docs/stream_api_en.md @@ -0,0 +1,115 @@ +# Stream APIs + +## create_stream 【required】 + +### Definition + +```c++ +C_Status (*create_stream)(const C_Device device, C_Stream* stream) +``` + +### Description + +It creates a stream, which is used to execute asynchronous tasks within the framework. In the stream, tasks are executed in order. When the device does not support asynchronous execution, the API is required to be implemented with an empty method. + +### Parameter + +device - the device to be used + +stream - the created stream + +## destroy_stream 【required】 + +### Definition + +```c++ +C_Status (*destroy_stream)(const C_Device device, C_Stream stream) +``` + +### Description + +It destroys a stream. When the device does not support asynchronous execution, the API needs to be implemented with an empty method. 
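For a device without asynchronous execution, the "empty method" mentioned for create_stream and destroy_stream might look like the sketch below. The function names `CreateStream` and `DestroyStream` are hypothetical, and returning C_SUCCESS from an empty implementation is an assumption; only the signatures and type definitions are taken from this document.

```cpp
#include <cassert>

// Type definitions copied from the Data Type section so that this sketch
// compiles on its own. A real plug-in would instead include
// "paddle/phi/backends/device_ext.h".
typedef enum {
  C_SUCCESS = 0,
  C_WARNING,
  C_FAILED,
  C_ERROR,
  C_INTERNAL_ERROR
} C_Status;
typedef struct C_Device_st { int id; } * C_Device;
typedef struct C_Stream_st* C_Stream;

// Hypothetical no-op implementation: a synchronous device has no real
// stream object, so a null handle is handed back to the framework.
C_Status CreateStream(const C_Device device, C_Stream* stream) {
  (void)device;
  assert(stream != nullptr);
  *stream = nullptr;
  return C_SUCCESS;
}

// Hypothetical no-op implementation: nothing was allocated, so there is
// nothing to release.
C_Status DestroyStream(const C_Device device, C_Stream stream) {
  (void)device;
  (void)stream;
  return C_SUCCESS;
}
```

In InitPlugin, such functions would presumably be assigned to the create_stream and destroy_stream members of params->interface.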
+ +### Parameter + +device - the device to be used + +stream - the stream required to be deallocated + +## query_stream 【optional】 + +### Definition + +```c++ +C_Status (*query_stream)(const C_Device device, C_Stream stream) +``` + +### Description + +It queries whether the tasks on the stream are done. If not implemented, it will be replaced with synchronize_stream by PaddlePaddle. + +### Parameter + +device - the device to be used + +stream - the stream required to be queried. + +## synchronize_stream 【required】 + +### Definition + +```c++ +C_Status (*synchronize_stream)(const C_Device device, C_Stream stream) +``` + +### Description + +It synchronizes the stream and waits for the completion of all tasks on the stream. When the device does not support asynchronous execution, the API is required to be implemented with an empty method. + +### Parameter + +device - the device to be used + +stream - the stream needed to be synchronized + +## stream_add_callback 【optional】 + +### Definition + +```c++ +C_Status (*stream_add_callback)(const C_Device device, C_Stream stream, C_Callback callback, void* user_data) +``` + +### Description + +It adds a host callback function to the stream. + +### Parameter + +device - the device to be used + +stream - the stream where the callback function is added + +callback - the callback function + +user_data - parameters of the function + +## stream_wait_event 【required】 + +### Definition + +```c++ +C_Status (*stream_wait_event)(const C_Device device, C_Stream stream, C_Event event) +``` + +### Description + +It waits for the completion of an event on the stream. When the device does not support asynchronous execution, the API is required to be implemented with an empty method. 
+ +### Parameter + +device - the device to be used + +stream - the stream waited for + +event - the event waited for diff --git a/docs/dev_guides/index_en.rst b/docs/dev_guides/index_en.rst index 22108b5b757..f67b4811c92 100644 --- a/docs/dev_guides/index_en.rst +++ b/docs/dev_guides/index_en.rst @@ -16,3 +16,4 @@ Similarly, if you feel that this document is missing, or that the description is local_dev_guide_en.md submit_pr_guide_en.md code_review_en.md + custom_device_docs/index_en.rst diff --git a/docs/index_en.rst b/docs/index_en.rst index 21bd05d332d..6d864cbd80d 100644 --- a/docs/index_en.rst +++ b/docs/index_en.rst @@ -7,4 +7,5 @@ install/index_en.rst guides/index_en.rst api/index_en.rst + dev_guides/index_en.rst release_note_en.md From 78315378696fedf8c35ce3bac931ec5f6a44de20 Mon Sep 17 00:00:00 2001 From: Chen Long <1300851984@qq.com> Date: Fri, 29 Apr 2022 17:31:19 +0800 Subject: [PATCH 2/5] Update index_en.rst --- docs/dev_guides/index_en.rst | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/docs/dev_guides/index_en.rst b/docs/dev_guides/index_en.rst index f67b4811c92..df53d1d78cd 100644 --- a/docs/dev_guides/index_en.rst +++ b/docs/dev_guides/index_en.rst @@ -1,19 +1,17 @@ -######## +############################# Contribution Guidelines -######## +############################# We very much welcome you to participate in the construction of the paddle. The following content is dedicated to explaining all the ways to join the paddle, and try to help you contribute to the paddle smoothly. Similarly, if you feel that this document is missing, or that the description is unclear, we also welcome you to contribute to this series of documents. - `Overview <./Overview_en.html>`_ : Contribution guidelines overview. +- `custom_device_docs <./custom_device_docs/index_en.html>`_ : Contribution guidelines overview. .. 
toctree:: :hidden: Overview_en.md - local_dev_guide_en.md - submit_pr_guide_en.md - code_review_en.md custom_device_docs/index_en.rst From 56a90b77ac6da230cca3282dc8e8fd8b213e1c0a Mon Sep 17 00:00:00 2001 From: Chen Long <1300851984@qq.com> Date: Fri, 29 Apr 2022 18:30:24 +0800 Subject: [PATCH 3/5] Update index_en.rst --- docs/dev_guides/index_en.rst | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/dev_guides/index_en.rst b/docs/dev_guides/index_en.rst index df53d1d78cd..6bc18c7ae8f 100644 --- a/docs/dev_guides/index_en.rst +++ b/docs/dev_guides/index_en.rst @@ -6,12 +6,10 @@ We very much welcome you to participate in the construction of the paddle. The f Similarly, if you feel that this document is missing, or that the description is unclear, we also welcome you to contribute to this series of documents. -- `Overview <./Overview_en.html>`_ : Contribution guidelines overview. - `custom_device_docs <./custom_device_docs/index_en.html>`_ : Contribution guidelines overview. .. toctree:: :hidden: - Overview_en.md custom_device_docs/index_en.rst From 97686dd37e9cd854ab58920609e6250a2277a04d Mon Sep 17 00:00:00 2001 From: zhangkeliang Date: Thu, 5 May 2022 07:36:24 +0000 Subject: [PATCH 4/5] Fix _cn to _en, and ##### to short --- .../custom_device_example_en.md | 2 +- .../custom_kernel_docs/cpp_api_en.rst | 4 ++-- .../custom_device_docs/custom_kernel_en.rst | 12 +++++----- .../custom_device_docs/custom_runtime_en.rst | 24 +++++++++---------- .../custom_device_docs/index_en.rst | 6 ++--- 5 files changed, 24 insertions(+), 24 deletions(-) diff --git a/docs/dev_guides/custom_device_docs/custom_device_example_en.md b/docs/dev_guides/custom_device_docs/custom_device_example_en.md index 6331865486e..a1d35cbd39e 100644 --- a/docs/dev_guides/custom_device_docs/custom_device_example_en.md +++ b/docs/dev_guides/custom_device_docs/custom_device_example_en.md @@ -59,7 +59,7 @@ Then, the plug-in should fill in its basic information and version number, which - 
params->device_type : the appellation of the device backend. If there is another plug-in with the same name, the runtime will not be registered. - params->sub_device_type : the appellation of the sub-type of the device backend -Finally, some callback APIs in params->interface should be filled by the plug-in (At least the required APIs should be implemented, or the runtime will not be registered otherwise). Thus, the custom runtime can be initialized. For details of the APIS, please refer to [Custom Runtime Document](./custom_runtime_cn.html)。 +Finally, some callback APIs in params->interface should be filled by the plug-in (At least the required APIs should be implemented, or the runtime will not be registered otherwise). Thus, the custom runtime can be initialized. For details of the APIS, please refer to [Custom Runtime Document](./custom_runtime_en.html)。 ```c++ static size_t global_total_mem_size = 1 * 1024 * 1024 * 1024UL; diff --git a/docs/dev_guides/custom_device_docs/custom_kernel_docs/cpp_api_en.rst b/docs/dev_guides/custom_device_docs/custom_kernel_docs/cpp_api_en.rst index 0e91dae3fa7..55bfaf8e652 100644 --- a/docs/dev_guides/custom_device_docs/custom_kernel_docs/cpp_api_en.rst +++ b/docs/dev_guides/custom_device_docs/custom_kernel_docs/cpp_api_en.rst @@ -1,6 +1,6 @@ -############# +############################# Kernel Implementation APIs -############# +############################# The custom kernel-function implementation mainly depends on two parts: 1.APIs released by PaddlePaddle, including the context API, the tensor API, and the exception API; 2. APIs of the device encapsulation library. And the C++ API of PaddlePaddle has been released by the header file. 
diff --git a/docs/dev_guides/custom_device_docs/custom_kernel_en.rst b/docs/dev_guides/custom_device_docs/custom_kernel_en.rst index f1dabbb80d6..5a64ac842f2 100644 --- a/docs/dev_guides/custom_device_docs/custom_kernel_en.rst +++ b/docs/dev_guides/custom_device_docs/custom_kernel_en.rst @@ -6,14 +6,14 @@ The custom kernel is the implementation of corresponding operators of the kernel The implementation of the custom kernel is based on the public kernel statement of PaddlePaddle, and public C++ API and register macro. -- `Kernel function statement <./custom_kernel_docs/kernel_declare_cn.html>`_ : to introduce the kernel statement of PaddlePaddle -- `Kernel implementation API <./custom_kernel_docs/cpp_api_cn.html>`_ : to introduce the C++ API required in the implementation of the custom function. -- `Kernel register API <./custom_kernel_docs/register_api_cn.html>`_ : to introduce the register macro of the custom kernel. +- `Kernel function statement <./custom_kernel_docs/kernel_declare_en.html>`_ : to introduce the kernel statement of PaddlePaddle +- `Kernel implementation API <./custom_kernel_docs/cpp_api_en.html>`_ : to introduce the C++ API required in the implementation of the custom function. +- `Kernel register API <./custom_kernel_docs/register_api_en.html>`_ : to introduce the register macro of the custom kernel. .. 
toctree:: :hidden: - custom_kernel_docs/kernel_declare_cn.md - custom_kernel_docs/cpp_api_cn.rst - custom_kernel_docs/register_api_cn.md + custom_kernel_docs/kernel_declare_en.md + custom_kernel_docs/cpp_api_en.rst + custom_kernel_docs/register_api_en.md diff --git a/docs/dev_guides/custom_device_docs/custom_runtime_en.rst b/docs/dev_guides/custom_device_docs/custom_runtime_en.rst index 9f4779c7e4c..c036565be6d 100644 --- a/docs/dev_guides/custom_device_docs/custom_runtime_en.rst +++ b/docs/dev_guides/custom_device_docs/custom_runtime_en.rst @@ -1,14 +1,14 @@ -############# +############################# Custom Runtime -############# +############################# Custom Runtime offers a new method to register the runtime of new devices via plug-ins. Responsible for the management of PaddlePaddle devices and Runtime/Driver API, DeviceManager provides a uniform API for the framework to invoke device capabilities, offers a series of APIs to register Custom Runtime, and ensure that the binary system is compatible through C API. The APIs can be found in `device_ext.h `_ . Developers can add custom runtime for PaddlePaddle only by implementing these APIs. -- `Data type <./runtime_data_type_cn.html>`_ : to introduce definitions of data types of custom runtime. -- `Device API <./device_api_cn.html>`_ : to introduce definitions and functions of Device APIs. -- `Memory API <./memory_api_cn.html>`_ : to introduce definitions and functions of Memory APIs. -- `Stream API <./stream_api_cn.html>`_ : to introduce definitions and functions of Stream APIs. -- `Event API <./event_api_cn.html>`_ : to introduce definitions and functions of Event APIs. +- `Data type <./runtime_data_type_en.html>`_ : to introduce definitions of data types of custom runtime. +- `Device API <./device_api_en.html>`_ : to introduce definitions and functions of Device APIs. +- `Memory API <./memory_api_en.html>`_ : to introduce definitions and functions of Memory APIs. 
+- `Stream API <./stream_api_en.html>`_ : to introduce definitions and functions of Stream APIs. +- `Event API <./event_api_en.html>`_ : to introduce definitions and functions of Event APIs. Device APIs @@ -136,9 +136,9 @@ Event APIs .. toctree:: :hidden: - runtime_data_type_cn.md - device_api_cn.md - memory_api_cn.md - stream_api_cn.md - event_api_cn.md + runtime_data_type_en.md + device_api_en.md + memory_api_en.md + stream_api_en.md + event_api_en.md diff --git a/docs/dev_guides/custom_device_docs/index_en.rst b/docs/dev_guides/custom_device_docs/index_en.rst index 30a1bc77e03..52e1efc53b5 100644 --- a/docs/dev_guides/custom_device_docs/index_en.rst +++ b/docs/dev_guides/custom_device_docs/index_en.rst @@ -1,6 +1,6 @@ -#################### +############################# Custom Device Access Guide -#################### +############################# The custom device access decouples the framework from the device and makes it available to extend the backend of the PaddlePaddle device via plug-ins. In this way, developers can make a plug-in for PaddlePaddle only by implementing the standard API and compiling it into a dynamic-link library, instead of by modifying the code of PaddlePaddle. Now it is easier to develop hardware backends for PaddlePaddle. @@ -8,7 +8,7 @@ The custom device access is composed of custom runtime and custom kernel. With t - `Custom Runtime <./custom_runtime_en.html>`_ : Introduction of custom runtime of the PaddlePaddle framework - `Custom Kernel <./custom_kernel_en.html>`_ : Introduction of custom kernel of the PaddlePaddle framework -- `Example of Device Access <./custom_kernel_en.html>`_ : To demonstrate how to connect new custom devices to PaddlePaddle +- `Example of Device Access <./custom_device_example_en.html>`_ : To demonstrate how to connect new custom devices to PaddlePaddle .. 
toctree:: :hidden: From e32f790dc74165ea1044c380e0c2ac2357a30a21 Mon Sep 17 00:00:00 2001 From: zhangkeliang Date: Thu, 5 May 2022 12:53:14 +0000 Subject: [PATCH 5/5] optimieze --- .../custom_device_docs/custom_device_example_cn.md | 1 - .../custom_device_docs/custom_device_example_en.md | 5 ++--- .../custom_kernel_docs/register_api_en.md | 2 +- docs/dev_guides/custom_device_docs/index_en.rst | 8 ++++---- 4 files changed, 7 insertions(+), 9 deletions(-) diff --git a/docs/dev_guides/custom_device_docs/custom_device_example_cn.md b/docs/dev_guides/custom_device_docs/custom_device_example_cn.md index 72e79a74f58..9d064bd3702 100644 --- a/docs/dev_guides/custom_device_docs/custom_device_example_cn.md +++ b/docs/dev_guides/custom_device_docs/custom_device_example_cn.md @@ -5,7 +5,6 @@ > 注意: > - 请确保已经正确安装了[飞桨develop](https://github.com/PaddlePaddle/Paddle)最新版本 > - 当前仅支持 `Linux`平台,示例中使用X86_64平台 -> - 支持飞桨已通过头文件开放函数式声明的Kernel自定义编码与注册 ## 第一步:实现自定义 Runtime diff --git a/docs/dev_guides/custom_device_docs/custom_device_example_en.md b/docs/dev_guides/custom_device_docs/custom_device_example_en.md index a1d35cbd39e..864d2b036aa 100644 --- a/docs/dev_guides/custom_device_docs/custom_device_example_en.md +++ b/docs/dev_guides/custom_device_docs/custom_device_example_en.md @@ -1,11 +1,10 @@ -# Example of Device Access +# CustomDevice Example -This section will talk about how to implement a CustomDevice plug-in and add a new device backend for PaddlePaddle. How to compile, package, install, and use the backend will also be introduced. +In this section we will walk through the steps required to extend a fake hardware backend for PaddlePaddle by implementing a fake device named CustomCPU. > Note: > - Please make sure that you have correctly installed the latest version of [Paddle develop](https://github.com/PaddlePaddle/Paddle). > - Only `Linux` is supported -> - PaddlePaddle can have the custom kernel code and registration of open functional statements in heder files. 
## Step One: Implement Custom Runtime diff --git a/docs/dev_guides/custom_device_docs/custom_kernel_docs/register_api_en.md b/docs/dev_guides/custom_device_docs/custom_kernel_docs/register_api_en.md index 1ab121aa42d..163f1b91964 100644 --- a/docs/dev_guides/custom_device_docs/custom_kernel_docs/register_api_en.md +++ b/docs/dev_guides/custom_device_docs/custom_kernel_docs/register_api_en.md @@ -58,5 +58,5 @@ PD_REGISTER_PLUGIN_KERNEL(softmax, ``` > Note: -> 1. When the backend is accessed through the custom runtime, the backend parameter must be the same as its name. +> 1. When the backend is registered through the custom runtime, the backend parameter must be the same as its name. > 2. Except the requirement of the end function body of the registration macro,keep the empty function body. You can refer to other backends within the PaddlePaddle framework. diff --git a/docs/dev_guides/custom_device_docs/index_en.rst b/docs/dev_guides/custom_device_docs/index_en.rst index 52e1efc53b5..49fca8ddf80 100644 --- a/docs/dev_guides/custom_device_docs/index_en.rst +++ b/docs/dev_guides/custom_device_docs/index_en.rst @@ -1,14 +1,14 @@ ############################# -Custom Device Access Guide +Custom Device Support ############################# -The custom device access decouples the framework from the device and makes it available to extend the backend of the PaddlePaddle device via plug-ins. In this way, developers can make a plug-in for PaddlePaddle only by implementing the standard API and compiling it into a dynamic-link library, instead of by modifying the code of PaddlePaddle. Now it is easier to develop hardware backends for PaddlePaddle. +The custom device function decouples the framework from the device and makes it available to extend the backend of the PaddlePaddle device via plug-ins. 
In this way, developers can make a plug-in for PaddlePaddle only by implementing the standard API and compiling it into a dynamic-link library, instead of by modifying the code of PaddlePaddle. Now it is easier to develop hardware backends for PaddlePaddle. -The custom device access is composed of custom runtime and custom kernel. With the two modules, users can connect new custom devices to PaddlePaddle according to their own needs. +The custom device function is composed of custom runtime and custom kernel. With the two modules, users can connect new custom devices to PaddlePaddle according to their own needs. - `Custom Runtime <./custom_runtime_en.html>`_ : Introduction of custom runtime of the PaddlePaddle framework - `Custom Kernel <./custom_kernel_en.html>`_ : Introduction of custom kernel of the PaddlePaddle framework -- `Example of Device Access <./custom_device_example_en.html>`_ : To demonstrate how to connect new custom devices to PaddlePaddle +- `CustomDevice Example <./custom_device_example_en.html>`_ : A tutorial on adding a new custom device to PaddlePaddle .. toctree:: :hidden: