This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Commit cb8475e

Add contributing guidelines for third party and custom C++ operators (#1742)
* Add contributing guidelines for third party libraries and custom C++ operators
* Fix formatting
* Fixing PR comments
* Resolve PR comment
1 parent 814aa7e commit cb8475e

1 file changed: +54 -0 lines changed


CONTRIBUTING.md

Lines changed: 54 additions & 0 deletions
@@ -48,6 +48,60 @@ python run-clang-format.py \
where `$CLANG_FORMAT` denotes the path to the downloaded binary.

## Adding Third Party Libraries

The following steps outline how to add a third party library to torchtext. We assume that the third party library has
correctly set up its `CMakeLists.txt` file so that other libraries can take a dependency on it.

1. Add the third party library as a submodule. Here is a great
   [tutorial](https://www.atlassian.com/git/tutorials/git-submodule) on working with submodules in git.
   - Navigate to the `third_party/` folder and run `git submodule add <repo-URL>`
   - Verify that the newly added module is present in the
     [`.gitmodules`](https://github.com/pytorch/text/blob/main/.gitmodules) file
2. Update
   [`third_party/CMakeLists.txt`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/third_party/CMakeLists.txt#L8)
   to add the following line: `add_subdirectory(<name-of-submodule-folder> EXCLUDE_FROM_ALL)`
3. (Optional) If any of the files within the `csrc/` folder make use of the newly added third party library:
   - Add the new submodule folder to
     [`LIBTORCHTEXT_INCLUDE_DIRS`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L24)
     and to
     [`EXTENSION_INCLUDE_DIRS`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L119)
   - Add the target names defined by the third party library's `CMakeLists.txt` file to
     [`LIBTORCHTEXT_LINK_LIBRARIES`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L33)
   - Note that third party libraries are linked statically with torchtext
4. Verify that the torchtext build works by running `python setup.py develop`
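
The two verification points in the steps above (the submodule entry in `.gitmodules` and the `add_subdirectory` line in `third_party/CMakeLists.txt`) can be checked with a small script. The following is a minimal sketch, not part of torchtext: the helper names `submodule_paths` and `has_add_subdirectory`, the library name `mylib`, and the sample file contents are all hypothetical and used only for illustration.

```python
import re
from typing import List


def submodule_paths(gitmodules_text: str) -> List[str]:
    """Return every `path = ...` value found in the text of a .gitmodules file."""
    return re.findall(r"^\s*path\s*=\s*(\S+)\s*$", gitmodules_text, flags=re.MULTILINE)


def has_add_subdirectory(cmake_text: str, folder: str) -> bool:
    """Check for a line of the form `add_subdirectory(<folder> EXCLUDE_FROM_ALL)`."""
    pattern = rf"^\s*add_subdirectory\(\s*{re.escape(folder)}\s+EXCLUDE_FROM_ALL\s*\)"
    return re.search(pattern, cmake_text, flags=re.MULTILINE) is not None


# Hypothetical file contents (assumption: a library called "mylib").
SAMPLE_GITMODULES = '''[submodule "third_party/mylib"]
\tpath = third_party/mylib
\turl = https://example.com/mylib.git
'''
SAMPLE_CMAKE = "add_subdirectory(mylib EXCLUDE_FROM_ALL)\n"

if __name__ == "__main__":
    print(submodule_paths(SAMPLE_GITMODULES))
    print(has_add_subdirectory(SAMPLE_CMAKE, "mylib"))
```

In a real checkout you would read the text from `.gitmodules` and `third_party/CMakeLists.txt` instead of the sample strings. This only confirms that the entries exist; running the build in step 4 remains the actual test that the library is wired in correctly.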

## Adding a Custom C++ Operator

Custom C++ operators can be implemented and registered in torchtext for several reasons, including making an existing
Python component more efficient and working around the limitations of multithreading in Python (due to
the Global Interpreter Lock). These custom kernels (or “ops”) can be embedded into a TorchScripted model and can be
executed both in Python and, in their serialized form, directly in C++. You can learn more in this
[tutorial on writing custom C++ operators](https://pytorch.org/tutorials/advanced/torch_script_custom_ops.html).

Steps to register an operator:

1. Add the new custom operator to the [`torchtext/csrc`](https://github.com/pytorch/text/tree/main/torchtext/csrc)
   folder. This entails writing the header and the source file for the custom op.
2. Add the new source files to the
   [`LIBTORCHTEXT_SOURCES`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L11)
   list.
3. Register the operators with torchbind and pybind.
   - Torchbind registration happens in the
     [`register_torchbindings.cpp`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/register_torchbindings.cpp#L14)
     file.
   - Pybind registration happens in the
     [`register_pybindings.cpp`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/register_pybindings.cpp#L34)
     file.
4. Write a Python wrapper class that is responsible for exposing the torchbind/pybind registered operators via Python.
   You can find some examples of this in the
   [`torchtext/transforms.py`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/transforms.py#L274)
   file.
5. Write a unit test that tests the functionality of the operator through the Python wrapper class. You can find some
   examples in the
   [`test/test_transforms.py`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/test/test_transforms.py#L317)
   file.
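
To give a feel for steps 4 and 5, here is a minimal sketch of the wrapper-plus-test pattern. It is an illustration under stated assumptions rather than torchtext code: `WhitespaceSplit` and `_stand_in_split_op` are hypothetical names, and a pure-Python function stands in for the registered C++ operator so that the snippet runs without a compiled extension. In torchtext, the wrapper would hold the torchbind/pybind-registered object instead.

```python
import unittest
from typing import Callable, List


def _stand_in_split_op(text: str) -> List[str]:
    """Pure-Python stand-in for a registered C++ operator (hypothetical)."""
    return text.split()


class WhitespaceSplit:
    """Hypothetical wrapper that exposes a bound operator as a Python callable."""

    def __init__(self, op: Callable[[str], List[str]] = _stand_in_split_op) -> None:
        # Assumption: a real wrapper would store the torchbind/pybind object here.
        self._op = op

    def __call__(self, text: str) -> List[str]:
        # Delegate all work to the underlying operator.
        return self._op(text)


class TestWhitespaceSplit(unittest.TestCase):
    """Exercises the operator only through the Python wrapper, as step 5 suggests."""

    def test_splits_on_whitespace(self) -> None:
        transform = WhitespaceSplit()
        self.assertEqual(transform("hello custom op"), ["hello", "custom", "op"])


if __name__ == "__main__":
    unittest.main()
```

Testing through the wrapper rather than the raw binding keeps the test focused on the interface users actually call. The linked `torchtext/transforms.py` and `test/test_transforms.py` files show how the project itself does this.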

## Contributor License Agreement ("CLA")

In order to accept your pull request, we need you to submit a CLA. You only need to do this once to work on any of
