@@ -48,6 +48,60 @@ python run-clang-format.py \
48 48
49 49 where `$CLANG_FORMAT` denotes the path to the downloaded binary.
50 50
51+ ## Adding Third Party Libraries
52+
53+ The following steps outline how to add a third party library to torchtext. We assume that the library has
54+ correctly set up its `CMakeLists.txt` file so that other projects can take a dependency on it.
55+
56+ 1. Add the third party library as a submodule. Here is a great
57+    [tutorial](https://www.atlassian.com/git/tutorials/git-submodule) on working with submodules in Git.
58+    - Navigate to the `third_party/` folder and run `git submodule add <repo-URL>`
59+    - Verify that the newly added module is present in the
60+      [`.gitmodules`](https://github.com/pytorch/text/blob/main/.gitmodules) file
61+ 2. Update
62+    [`third_party/CMakeLists.txt`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/third_party/CMakeLists.txt#L8)
63+    to add the following line: `add_subdirectory(<name-of-submodule-folder> EXCLUDE_FROM_ALL)`
64+ 3. (Optional) If any of the files within the `csrc/` folder make use of the newly added third party library, then:
65+    - Add the new submodule folder to
66+      [`LIBTORCHTEXT_INCLUDE_DIRS`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L24)
67+      and to
68+      [`EXTENSION_INCLUDE_DIRS`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L119)
69+    - Add the target name(s) defined by the third party library's `CMakeLists.txt` file to
70+      [`LIBTORCHTEXT_LINK_LIBRARIES`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L33)
71+    - Note that third party libraries are linked statically with torchtext
72+ 4. Verify the torchtext build works by running `python setup.py develop`
73+
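The `.gitmodules` check in step 1 can also be scripted. The sketch below is only an illustration and is not part of torchtext: the helper name `submodule_paths`, the sample file content, and the `third_party/mylib` entry with its URL are all assumptions made for the example.

```python
import re
from typing import List

# Matches lines such as "path = third_party/mylib" inside a .gitmodules file.
_PATH_LINE = re.compile(r"^\s*path\s*=\s*(\S+)\s*$")


def submodule_paths(gitmodules_text: str) -> List[str]:
    """Return the `path` value of every submodule entry, in file order."""
    paths = []
    for line in gitmodules_text.splitlines():
        match = _PATH_LINE.match(line)
        if match:
            paths.append(match.group(1))
    return paths


# Hypothetical .gitmodules content after adding a submodule under third_party/.
SAMPLE = """\
[submodule "third_party/mylib"]
\tpath = third_party/mylib
\turl = https://example.com/mylib.git
"""

if __name__ == "__main__":
    # In a real checkout, read the file instead: open(".gitmodules").read()
    print("third_party/mylib" in submodule_paths(SAMPLE))
```

Running `git submodule status` from the repository root gives similar information on the command line.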
74+ ## Adding a Custom C++ Operator
75+
76+ Custom C++ operators can be implemented and registered in torchtext for several reasons, including making an existing
77+ Python component more efficient and working around the limits on multithreading in Python (due to
78+ the Global Interpreter Lock). These custom kernels (or “ops”) can be embedded into a TorchScripted model and
79+ executed both in Python and, in their serialized form, directly in C++. You can learn more in this
80+ [tutorial on writing custom C++ operators](https://pytorch.org/tutorials/advanced/torch_script_custom_ops.html).
81+
82+ Steps to register an operator:
83+
84+ 1. Add the new custom operator to the [`torchtext/csrc`](https://github.com/pytorch/text/tree/main/torchtext/csrc)
85+    folder. This entails writing the header and the source file for the custom op.
86+ 2. Add the new source files to the
87+    [`LIBTORCHTEXT_SOURCES`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/CMakeLists.txt#L11)
88+    list.
89+ 3. Register the operators with torchbind and pybind.
90+    - Torchbind registration happens in the
91+      [`register_torchbindings.cpp`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/register_torchbindings.cpp#L14)
92+      file.
93+    - Pybind registration happens in the
94+      [`register_pybindings.cpp`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/csrc/register_pybindings.cpp#L34)
95+      file.
96+ 4. Write a Python wrapper class that is responsible for exposing the torchbind/pybind registered operators via Python.
97+    You can find some examples of this in the
98+    [`torchtext/transforms.py`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/torchtext/transforms.py#L274)
99+    file.
100+ 5. Write a unit test that tests the functionality of the operator through the Python wrapper class. You can find some
101+    examples in the
102+    [`test/test_transforms.py`](https://github.com/pytorch/text/blob/70fc1040ee40faf129604557107cc59fd51c4fe2/test/test_transforms.py#L317)
103+    file.
104+
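Steps 4 and 5 can be sketched roughly as follows. To keep the example runnable without a torchtext build, a pure-Python function stands in for the registered C++ kernel. The names `WhitespaceSplit`, `_reference_kernel`, and `TestWhitespaceSplit` are hypothetical and do not exist in torchtext. A real wrapper in `torchtext/transforms.py` would typically subclass `torch.nn.Module` and call the torchbind/pybind-registered operator instead of a Python function.

```python
import unittest
from typing import Callable, List


def _reference_kernel(text: str) -> List[str]:
    """Pure-Python stand-in for a C++ op (assumption: the op splits on whitespace)."""
    return text.split()


class WhitespaceSplit:
    """Hypothetical wrapper that exposes a kernel through a Python callable."""

    def __init__(self, kernel: Callable[[str], List[str]] = _reference_kernel) -> None:
        # In a real wrapper this would be the torchbind/pybind-registered operator.
        self._kernel = kernel

    def __call__(self, text: str) -> List[str]:
        return self._kernel(text)


class TestWhitespaceSplit(unittest.TestCase):
    """Exercises the operator only through the wrapper, as step 5 suggests."""

    def test_splits_on_whitespace(self) -> None:
        self.assertEqual(WhitespaceSplit()("hello  custom\top"), ["hello", "custom", "op"])

    def test_empty_input(self) -> None:
        self.assertEqual(WhitespaceSplit()(""), [])
```

Saved to a file, the test case can be run with `python -m unittest <file>` or with pytest. Testing through the wrapper rather than the kernel directly means the test also covers the binding layer once the stand-in is replaced by the real op.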
51 105 ## Contributor License Agreement ("CLA")
52 106
53 107 In order to accept your pull request, we need you to submit a CLA. You only need to do this once to work on any of