This repository was archived by the owner on Sep 10, 2025. It is now read-only.
Merged
54 changes: 54 additions & 0 deletions CONTRIBUTING.md
@@ -48,6 +48,60 @@ python run-clang-format.py \

where `$CLANG_FORMAT` denotes the path to the downloaded binary.

## Adding Third Party Libraries

The following steps outline how to add a third party library to torchtext. We assume that the third party library's
`CMakeLists.txt` file is set up correctly so that other libraries can take a dependency on it.

1. Add the third party library as a submodule. Here is a great
   [tutorial](https://www.atlassian.com/git/tutorials/git-submodule) on working with submodules in git.
   - Navigate to the `third_party/` folder and run `git submodule add <repo-URL>`.
   - Verify that the newly added module is present in the
     [`.gitmodules`](https://github.com/pytorch/text/blob/main/.gitmodules) file.
2. Update
   [`third_party/CMakeLists.txt`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/third_party/CMakeLists.txt#L8)
   to add the following line: `add_subdirectory(<name-of-submodule-folder> EXCLUDE_FROM_ALL)`
3. (Optional) If any of the files within the `csrc/` folder make use of the newly added third party library:
   - Add the new submodule folder to
     [`LIBTORCHTEXT_INCLUDE_DIRS`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L24)
     and to
     [`EXTENSION_INCLUDE_DIRS`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L119).
   - Add the target name defined by the third party library's `CMakeLists.txt` file to
     [`LIBTORCHTEXT_LINK_LIBRARIES`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L33).
   - Note that third party libraries are linked statically with torchtext.
4. Verify that the torchtext build works by running `python setup.py develop`.
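
The two verification points in steps 1 and 2 can also be checked with a small script before running the full build.
The following is a minimal sketch, not part of torchtext: the helper names `submodule_paths` and
`has_add_subdirectory`, the library name `examplelib`, and its URL are hypothetical and used only for illustration.
It relies only on the Python standard library and assumes the usual `.gitmodules` layout written by
`git submodule add`.

```python
import configparser
import re
from typing import List


def submodule_paths(gitmodules_text: str) -> List[str]:
    """Return the `path` value of every submodule declared in .gitmodules text."""
    # Keys in .gitmodules are indented; strip each line so configparser does
    # not treat an indented key as a continuation of the previous value.
    cleaned = "\n".join(line.strip() for line in gitmodules_text.splitlines())
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cleaned)
    return [
        parser[section]["path"]
        for section in parser.sections()
        if parser.has_option(section, "path")
    ]


def has_add_subdirectory(cmake_text: str, folder: str) -> bool:
    """Check for an uncommented `add_subdirectory(<folder> EXCLUDE_FROM_ALL)` line."""
    pattern = re.compile(
        r"^\s*add_subdirectory\(\s*" + re.escape(folder) + r"\s+EXCLUDE_FROM_ALL\s*\)",
        re.MULTILINE,
    )
    return pattern.search(cmake_text) is not None


# Hypothetical file contents, standing in for the real files in the repository.
SAMPLE_GITMODULES = """
[submodule "third_party/examplelib"]
    path = third_party/examplelib
    url = https://example.com/examplelib.git
"""
SAMPLE_CMAKE = "add_subdirectory(examplelib EXCLUDE_FROM_ALL)\n"

print(submodule_paths(SAMPLE_GITMODULES))
print(has_add_subdirectory(SAMPLE_CMAKE, "examplelib"))
```

In a real checkout, the two sample strings would be replaced by the contents of `.gitmodules` and
`third_party/CMakeLists.txt`. A passing check does not replace step 4; the build remains the authoritative test.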

## Adding a Custom C++ Operator

Custom C++ operators can be implemented and registered in torchtext for several reasons, including making an existing
Python component more efficient and working around the limitations of multithreading in Python (due to
the Global Interpreter Lock). These custom kernels (or “ops”) can be embedded into a TorchScripted model and can be
executed both in Python and, in their serialized form, directly in C++. You can learn more in this
[tutorial on writing custom C++ operators](https://pytorch.org/tutorials/advanced/torch_script_custom_ops.html).

Steps to register an operator:

1. Add the new custom operator to the [`torchtext/csrc`](https://github.com/pytorch/text/tree/main/torchtext/csrc)
   folder. This entails writing the header and the source file for the custom op.
2. Add the new source files to the
   [`LIBTORCHTEXT_SOURCES`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L11)
   list.
3. Register the operators with torchbind and pybind.
   - Torchbind registration happens in the
     [`register_torchbindings.cpp`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/register_torchbindings.cpp#L14)
     file.
   - Pybind registration happens in the
     [`register_pybindings.cpp`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/register_pybindings.cpp#L34)
     file.
4. Write a Python wrapper class that exposes the torchbind/pybind registered operators in Python.
   You can find some examples of this in the
   [`torchtext/transforms.py`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/transforms.py#L274)
   file.
5. Write a unit test that exercises the operator through the Python wrapper class. You can find some
   examples in the
   [`test/test_transforms.py`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/test/test_transforms.py#L317)
   file.

## Contributor License Agreement ("CLA")

In order to accept your pull request, we need you to submit a CLA. You only need to do this once to work on any of