This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Commit 579c519

Authored by parmeet, committed by facebook-github-bot
Import torchtext #1410 0930843
Summary: Import latest from github
Reviewed By: Nayef211
Differential Revision: D31745899
fbshipit-source-id: e4ac5c337bcbd1a8809544add7679dd3da242999
1 parent ae13bc6 commit 579c519

27 files changed: 202 additions, 1329 deletions
.circleci/config.yml

Lines changed: 3 additions & 3 deletions
@@ -83,7 +83,7 @@ smoke_test_common: &smoke_test_common
 jobs:
   circleci_consistency:
     docker:
-      - image: circleci/python:3.8
+      - image: cimg/python:3.8
     steps:
       - checkout
       - run:
@@ -234,7 +234,7 @@ jobs:
   # Requires org-member context
   binary_wheel_upload:
     docker:
-      - image: circleci/python:3.8
+      - image: cimg/python:3.8
     steps:
       - attach_workspace:
           at: ~/workspace
@@ -497,7 +497,7 @@ jobs:
             - v1-windows-dataset-vector-{{ checksum ".cachekey" }}
             - v1-windows-dataset-{{ checksum ".cachekey" }}
 
-
+
       - run:
           name: Run tests
           # Downloading embedding vector takes long time.

.circleci/config.yml.in

Lines changed: 3 additions & 3 deletions
@@ -83,7 +83,7 @@ smoke_test_common: &smoke_test_common
 jobs:
   circleci_consistency:
     docker:
-      - image: circleci/python:3.8
+      - image: cimg/python:3.8
     steps:
       - checkout
       - run:
@@ -234,7 +234,7 @@ jobs:
   # Requires org-member context
   binary_wheel_upload:
     docker:
-      - image: circleci/python:3.8
+      - image: cimg/python:3.8
     steps:
       - attach_workspace:
           at: ~/workspace
@@ -497,7 +497,7 @@ jobs:
             - v1-windows-dataset-vector-{{ checksum ".cachekey" }}
             - v1-windows-dataset-{{ checksum ".cachekey" }}
           {% endraw %}
-
+
       - run:
           name: Run tests
           # Downloading embedding vector takes long time.

.circleci/unittest/linux/scripts/install.sh

Lines changed: 3 additions & 0 deletions
@@ -13,6 +13,9 @@ conda activate ./env
 printf "* Installing PyTorch\n"
 conda install -y -c "pytorch-${UPLOAD_CHANNEL}" ${CONDA_CHANNEL_FLAGS} pytorch cpuonly
 
+printf "Installing torchdata from source\n"
+pip install git+https://github.com/pytorch/data.git
+
 printf "* Installing torchtext\n"
 git submodule update --init --recursive
 python setup.py develop

.circleci/unittest/windows/scripts/install.sh

Lines changed: 3 additions & 0 deletions
@@ -18,6 +18,9 @@ conda activate ./env
 printf "* Installing PyTorch\n"
 conda install -y -c "pytorch-${UPLOAD_CHANNEL}" ${CONDA_CHANNEL_FLAGS} pytorch cpuonly
 
+printf "Installing torchdata from source\n"
+pip install git+https://github.com/pytorch/data.git
+
 printf "* Installing torchtext\n"
 git submodule update --init --recursive
 "$root_dir/packaging/vc_env_helper.bat" python setup.py develop
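
Reviewer note: both install scripts now pull torchdata from source before building torchtext. A minimal sketch of a post-install sanity check follows. It is not part of this commit. The helper name ``is_installed`` is hypothetical, and ``torchdata`` is assumed to be the import name that the pytorch/data package provides.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the import system can locate a top-level module."""
    # find_spec returns None when no module with that name is found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: pytorch/data installs a module importable as "torchdata".
    print("torchdata available:", is_installed("torchdata"))
```

A check like this could run right after the ``pip install`` step so that a failed source install is reported before the torchtext build starts.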

README.rst

Lines changed: 22 additions & 21 deletions
@@ -1,7 +1,7 @@
 .. image:: https://circleci.com/gh/pytorch/text.svg?style=svg
     :target: https://circleci.com/gh/pytorch/text
 
-.. image:: https://codecov.io/gh/pytorch/text/branch/master/graph/badge.svg
+.. image:: https://codecov.io/gh/pytorch/text/branch/main/graph/badge.svg
     :target: https://codecov.io/gh/pytorch/text
 
 .. image:: https://img.shields.io/badge/dynamic/json.svg?label=docs&url=https%3A%2F%2Fpypi.org%2Fpypi%2Ftorchtext%2Fjson&query=%24.info.version&colorB=brightgreen&prefix=v
@@ -12,13 +12,13 @@ torchtext
 
 This repository consists of:
 
-* `torchtext.datasets <https://github.com/pytorch/text/tree/master/torchtext/datasets>`_: The raw text iterators for common NLP datasets
-* `torchtext.data <https://github.com/pytorch/text/tree/master/torchtext/data>`_: Some basic NLP building blocks (tokenizers, metrics, functionals etc.)
-* `torchtext.nn <https://github.com/pytorch/text/tree/master/torchtext/nn>`_: NLP related modules
-* `torchtext.vocab <https://github.com/pytorch/text/tree/master/torchtext/vocab.py>`_: Vocab and Vectors related classes and factory functions
-* `examples <https://github.com/pytorch/text/tree/master/examples>`_: Example NLP workflows with PyTorch and torchtext library.
+* `torchtext.datasets <https://github.com/pytorch/text/tree/main/torchtext/datasets>`_: The raw text iterators for common NLP datasets
+* `torchtext.data <https://github.com/pytorch/text/tree/main/torchtext/data>`_: Some basic NLP building blocks (tokenizers, metrics, functionals etc.)
+* `torchtext.nn <https://github.com/pytorch/text/tree/main/torchtext/nn>`_: NLP related modules
+* `torchtext.vocab <https://github.com/pytorch/text/tree/main/torchtext/vocab.py>`_: Vocab and Vectors related classes and factory functions
+* `examples <https://github.com/pytorch/text/tree/main/examples>`_: Example NLP workflows with PyTorch and torchtext library.
 
-Note: The legacy code discussed in `torchtext v0.7.0 release note <https://github.com/pytorch/text/releases/tag/v0.7.0-rc3>`_ has been retired to `torchtext.legacy <https://github.com/pytorch/text/tree/master/torchtext/legacy>`_ folder. Those legacy code will not be maintained by the development team, and we plan to fully remove them in the future release. See `torchtext.legacy <https://github.com/pytorch/text/tree/master/torchtext/legacy>`_ folder for more details.
+Note: The legacy code discussed in `torchtext v0.7.0 release note <https://github.com/pytorch/text/releases/tag/v0.7.0-rc3>`_ has been retired to `torchtext.legacy <https://github.com/pytorch/text/tree/main/torchtext/legacy>`_ folder. Those legacy code will not be maintained by the development team, and we plan to fully remove them in the future release. See `torchtext.legacy <https://github.com/pytorch/text/tree/main/torchtext/legacy>`_ folder for more details.
 
 Installation
 ============
@@ -29,14 +29,15 @@ We recommend Anaconda as a Python package management system. Please refer to `py
     :header: "PyTorch version", "torchtext version", "Supported Python version"
     :widths: 10, 10, 10
 
-    nightly build, master, 3.6+
-    1.9, 0.10, 3.6+
-    1.8, 0.9, 3.6+
-    1.7, 0.8, 3.6+
-    1.6, 0.7, 3.6+
-    1.5, 0.6, 3.5+
-    1.4, 0.5, "2.7, 3.5+"
-    0.4 and below, 0.2.3, "2.7, 3.5+"
+    nightly build, main, ">=3.6, <=3.9"
+    1.9, 0.10, ">=3.6, <=3.9"
+    1.8, 0.9, ">=3.6, <=3.9"
+    1.7.1, 0.8.1, ">=3.6, <=3.9"
+    1.7, 0.8, ">=3.6, <=3.8"
+    1.6, 0.7, ">=3.6, <=3.8"
+    1.5, 0.6, ">=3.5, <=3.8"
+    1.4, 0.5, "2.7, >=3.5, <=3.8"
+    0.4 and below, 0.2.3, "2.7, >=3.5, <=3.8"
 
 Using conda::
 
@@ -82,7 +83,7 @@ To build torchtext from source, you need ``git``, ``CMake`` and C++11 compiler s
 **Note**
 
 When building from source, make sure that you have the same C++ compiler as the one used to build PyTorch. A simple way is to build PyTorch from source and use the same environment to build torchtext.
-If you are using the nightly build of PyTorch, checkout the environment it was built with `conda (here) <https://github.com/pytorch/builder/tree/master/conda>`_ and `pip (here) <https://github.com/pytorch/builder/tree/master/manywheel>`_.
+If you are using the nightly build of PyTorch, checkout the environment it was built with `conda (here) <https://github.com/pytorch/builder/tree/main/conda>`_ and `pip (here) <https://github.com/pytorch/builder/tree/main/manywheel>`_.
 
 Documentation
 =============
@@ -130,8 +131,8 @@ To get started with torchtext, users may refer to the following tutorials availa
 
 We have re-written several building blocks under ``torchtext.experimental``:
 
-* `Transforms <https://github.com/pytorch/text/blob/master/torchtext/experimental/transforms.py>`_: some basic data processing building blocks
-* `Vectors <https://github.com/pytorch/text/blob/master/torchtext/experimental/vectors.py>`_: the vectors to convert tokens into tensors.
+* `Transforms <https://github.com/pytorch/text/blob/main/torchtext/experimental/transforms.py>`_: some basic data processing building blocks
+* `Vectors <https://github.com/pytorch/text/blob/main/torchtext/experimental/vectors.py>`_: the vectors to convert tokens into tensors.
 
 These prototype building blocks in the experimental folder are available in the nightly release only. The nightly packages are accessible via Pip and Conda for Windows, Mac, and Linux. For example, Linux users can install the nightly wheels with the following command::
 
@@ -142,7 +143,7 @@ For more detailed instructions, please refer to `Install PyTorch <https://pytorc
 [BC Breaking] Legacy
 ====================
 
-In the v0.9.0 release, we moved the following legacy code to `torchtext.legacy <https://github.com/pytorch/text/tree/master/torchtext/legacy>`_. This is part of the work to revamp the torchtext library and the motivation has been discussed in `Issue #664 <https://github.com/pytorch/text/issues/664>`_:
+In the v0.9.0 release, we moved the following legacy code to `torchtext.legacy <https://github.com/pytorch/text/tree/main/torchtext/legacy>`_. This is part of the work to revamp the torchtext library and the motivation has been discussed in `Issue #664 <https://github.com/pytorch/text/issues/664>`_:
 
 * ``torchtext.legacy.data.field``
 * ``torchtext.legacy.data.batch``
@@ -151,9 +152,9 @@ In the v0.9.0 release, we moved the following legacy code to `torchtext.legacy <
 * ``torchtext.legacy.data.pipeline``
 * ``torchtext.legacy.datasets``
 
-We have a `migration tutorial <https://colab.research.google.com/github/pytorch/text/blob/master/examples/legacy_tutorial/migration_tutorial.ipynb>`_ to help users switch to the torchtext datasets in ``v0.9.0`` release. For the users who still want the legacy components, they can add ``legacy`` to the import path.
+We have a `migration tutorial <https://colab.research.google.com/github/pytorch/text/blob/main/examples/legacy_tutorial/migration_tutorial.ipynb>`_ to help users switch to the torchtext datasets in ``v0.9.0`` release. For the users who still want the legacy components, they can add ``legacy`` to the import path.
 
-In the v0.10.0 release, we retire the Vocab class to `torchtext.legacy <https://github.com/pytorch/text/tree/master/torchtext/legacy>`_. Users can still access the legacy Vocab via ``torchtext.legacy.vocab``. This class has been replaced by a Vocab module that is backed by efficient C++ implementation and provides common functional APIs for NLP workflows.
+In the v0.10.0 release, we retire the Vocab class to `torchtext.legacy <https://github.com/pytorch/text/tree/main/torchtext/legacy>`_. Users can still access the legacy Vocab via ``torchtext.legacy.vocab``. This class has been replaced by a Vocab module that is backed by efficient C++ implementation and provides common functional APIs for NLP workflows.
 
 Disclaimer on Datasets
 ======================
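
Reviewer note: the README hunk above replaces open-ended Python requirements such as ``3.6+`` with bounded ranges. As a rough illustration of how the new table reads, the sketch below encodes the added rows as data and checks a Python version against them. It is not part of this commit. The names ``COMPATIBILITY`` and ``supports_python`` are hypothetical, and the ranges are copied by hand from the ``+`` lines of the table.

```python
# Rows from the "+" side of the version compatibility table.
# torchtext version -> (PyTorch version, supported Python ranges).
# Each range is an inclusive (lowest, highest) pair of (major, minor) tuples.
COMPATIBILITY = {
    "main": ("nightly build", [((3, 6), (3, 9))]),
    "0.10": ("1.9", [((3, 6), (3, 9))]),
    "0.9": ("1.8", [((3, 6), (3, 9))]),
    "0.8.1": ("1.7.1", [((3, 6), (3, 9))]),
    "0.8": ("1.7", [((3, 6), (3, 8))]),
    "0.7": ("1.6", [((3, 6), (3, 8))]),
    "0.6": ("1.5", [((3, 5), (3, 8))]),
    "0.5": ("1.4", [((2, 7), (2, 7)), ((3, 5), (3, 8))]),
    "0.2.3": ("0.4 and below", [((2, 7), (2, 7)), ((3, 5), (3, 8))]),
}


def supports_python(torchtext_version, python_version):
    """Return True if the table lists python_version for torchtext_version.

    python_version is a (major, minor) tuple, e.g. sys.version_info[:2].
    """
    _, ranges = COMPATIBILITY[torchtext_version]
    # Tuples compare element by element, so (3, 6) <= (3, 9) holds.
    return any(low <= python_version <= high for low, high in ranges)


if __name__ == "__main__":
    import sys

    print(supports_python("0.10", tuple(sys.version_info[:2])))
```

With these rows, Python 3.9 is accepted for torchtext 0.8.1 but not for 0.8, which is the distinction the newly added ``1.7.1, 0.8.1`` row introduces.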

examples/BERT/README.md

Lines changed: 0 additions & 143 deletions
This file was deleted.
