Add libtorchtext cpp example #1817
|
@parmeet one thing I wanted to double check with you is whether to reuse the |
mthrok
left a comment
gpt2_bpe_vocab.bpe and gpt2_bpe_encoder.json seem to be a second copy of the same assets already added in https://github.com/pytorch/text/tree/main/test/asset
This kind of duplication is one of the reasons why I suggested not checking in such huge assets. #1462 (comment)
|
I think we could instead just ask users to download the artifacts, with instructions in the README? I agree with @mthrok that we should avoid checking in artifacts.
|
Thanks for the feedback @mthrok and @parmeet. I just removed the assets and added instructions on how to download them.
parmeet
left a comment
Thanks @Nayef211 for adding this example application, LGTM!!
|
Hi @Nayef211 , I was wondering if we need a CMakeLists.txt in |
@JiedaokouWangguan thanks for calling out this issue. I think it should be resolved by #1908. Would you be able to check out the PR locally and see if you can get the example working?
|
Thanks for the quick fix! Let me give it a try.
Reference Issue #1644
Description
- Adds an example that uses the libtorchtext and libtorch libraries in a C++ application
- The example uses GPT2BPETokenizer and shows that the tokenization results are consistent across Python and C++

The example is inspired by the libtorchaudio examples from the torchaudio repo and the tokenizer example from @mreso