
Conversation

@dbonner
Contributor

@dbonner dbonner commented Feb 26, 2021

What does this PR do?

The name Batch is no longer located under torchtext.data
Batch is now located under torchtext.legacy.data

--Error message--

File "/home/daniel/py38/lib/python3.8/site-packages/pytorch_lightning/utilities/apply_func.py", line 28, in <module>                                                      
    from torchtext.data import Batch                                                  
ImportError: cannot import name 'Batch' from 'torchtext.data' (/home/daniel/py38/lib/python3.8/site-packages/torchtext/data/__init__.py)

You can fix this by changing line 28 of pytorch_lightning/utilities/apply_func.py to:
from torchtext.legacy.data import Batch
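
A minimal sketch of a backwards-compatible import (assuming a plain try/except fallback; the change that eventually landed uses an explicit version/module check, see the later comments):

    # Sketch: prefer the new legacy namespace, fall back to the old location.
    try:
        from torchtext.legacy.data import Batch  # torchtext >= 0.9.0
    except ImportError:
        from torchtext.data import Batch  # torchtext < 0.9.0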

Closes #6168
Closes #6165

Did you have fun?

Yes :)

@codecov

codecov bot commented Feb 26, 2021

Codecov Report

Merging #6211 (2e21e4c) into master (e7298b5) will decrease coverage by 2%.
The diff coverage is 75%.

@@           Coverage Diff           @@
##           master   #6211    +/-   ##
=======================================
- Coverage      93%     91%    -2%     
=======================================
  Files         159     159            
  Lines       11381   11384     +3     
=======================================
- Hits        10626   10362   -264     
- Misses        755    1022   +267     

@dbonner
Contributor Author

dbonner commented Feb 26, 2021

After my third commit, the code imports from either torchtext.data or torchtext.legacy.data, depending on the installed torchtext version.

@rohitgr7 rohitgr7 linked an issue Feb 26, 2021 that may be closed by this pull request
@dbonner
Contributor Author

dbonner commented Feb 26, 2021

@rohitgr7 No problem. My fourth commit uses distutils.version.LooseVersion
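
A sketch of that version-based guard (assuming torchtext exposes __version__ and that Batch moved to the legacy namespace in torchtext 0.9.0):

    from distutils.version import LooseVersion

    import torchtext

    # Pick the import location based on the installed torchtext version.
    if LooseVersion(torchtext.__version__) >= LooseVersion("0.9.0"):
        from torchtext.legacy.data import Batch
    else:
        from torchtext.data import Batch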

@Borda Borda added bug Something isn't working ready PRs ready to be merged labels Feb 26, 2021
@Borda
Collaborator

Borda commented Feb 26, 2021

@dbonner could you pls restore the checklist in the PR description and mark what is done/missing?

@dbonner
Contributor Author

dbonner commented Feb 26, 2021

Hi @Borda
The failed checks are:

  1. Docs check-make.docs
  2. CircleCI-TPU Tests-build-Docs (counted as 2 errors)

The errors are the same (below).
I don't understand how they are related to my PR.
Also, the modules listed below do exist so I'm not sure why the build fails to find them:

Failed to import 'pytorch_lightning.profiler': no module named pytorch_lightning.profiler
Warning, treated as error:
[autosummary] failed to import 'pytorch_lightning.callbacks.BackboneFinetuning': no module named pytorch_lightning.callbacks.BackboneFinetuning
make: *** [Makefile:19: html] Error 2
Error: Process completed with exit code 2.

The CircleCI error ends with:
Makefile:19: recipe for target 'html' failed
make: *** [html] Error 2
Exited with code exit status 2
CircleCI received exit code 2

Please let me know if I should be changing any code to fix these errors.
Much appreciated,
Dan

@rohitgr7
Contributor

@dbonner can you try rebasing your PR with origin/master?

The name Batch is no longer located under torchtext.data
--Error message--
File "/home/daniel/py38/lib/python3.8/site-packages/pytorch_lightning/utilities/apply_func.py", line 25, in <module>                                                      
    from torchtext.data import Batch                                                  
ImportError: cannot import name 'Batch' from 'torchtext.data' (/home/daniel/py38/lib/python3.8/site-packages/torchtext/data/__init__.py)
You can fix this by changing line 28 to:
    from torchtext.legacy.data import Batch
@dbonner
Contributor Author

dbonner commented Feb 26, 2021

Actually, when I try to import pytorch_lightning.callbacks.BackboneFinetuning, it fails. But I didn't think that was because of my patch.
ci/circle-ci build-docs:
failed to import 'pytorch_lightning.callbacks.BackboneFinetuning': no module named pytorch_lightning.callbacks.BackboneFinetuning
Makefile:19: recipe for target 'html' failed
make: *** [html] Error 2

@dbonner
Contributor Author

dbonner commented Feb 26, 2021

Needed to use:
if _module_available("torchtext.legacy.data"):
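
A sketch of that availability-based guard (assuming _module_available is importable from pytorch_lightning.utilities.imports):

    from pytorch_lightning.utilities.imports import _module_available

    # Use the legacy namespace when it exists (torchtext >= 0.9.0),
    # otherwise fall back to the old import location.
    if _module_available("torchtext.legacy.data"):
        from torchtext.legacy.data import Batch
    else:
        from torchtext.data import Batch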

@dbonner
Contributor Author

dbonner commented Feb 26, 2021

Only CodeCov/project is failing now. I don't know why.

Contributor

@awaelchli awaelchli left a comment


clean!

@rohitgr7 rohitgr7 merged commit ee5032a into Lightning-AI:master Feb 26, 2021
@dbonner
Contributor Author

dbonner commented Feb 26, 2021

Thanks everyone :)

@Borda
Collaborator

Borda commented Feb 26, 2021

Only CodeCov/project is failing now. I don't know why.

most likely your changes were in lines that are not covered by tests...

@rohitgr7
Contributor

Only CodeCov/project is failing now. I don't know why.

most likely your changes were in lines that are not covered by tests...

Codecov has been failing a lot recently, even when the changes are covered by tests. Sometimes rerunning the same commits makes it pass.

@tchaton tchaton added this to the 1.2.x milestone Mar 2, 2021
kaushikb11 pushed a commit to kaushikb11/pytorch-lightning that referenced this pull request Mar 2, 2021
…6211)

* Update apply_func.py

The name Batch is no longer located under torchtext.data
--Error message--
File "/home/daniel/py38/lib/python3.8/site-packages/pytorch_lightning/utilities/apply_func.py", line 25, in <module>                                                      
    from torchtext.data import Batch                                                  
ImportError: cannot import name 'Batch' from 'torchtext.data' (/home/daniel/py38/lib/python3.8/site-packages/torchtext/data/__init__.py)
You can fix this by changing line 28 to:
    from torchtext.legacy.data import Batch

* Update apply_func.py

* Update apply_func.py

* Update apply_func.py

* Update apply_func.py

* Update apply_func.py
@Borda Borda mentioned this pull request Mar 2, 2021
KoichiYasuoka added a commit to KoichiYasuoka/NLP-Cube that referenced this pull request Aug 12, 2021
tiberiu44 pushed a commit to adobe/NLP-Cube that referenced this pull request Aug 15, 2021
tiberiu44 added a commit to adobe/NLP-Cube that referenced this pull request Aug 27, 2021
* Partial update

* Bugfix

* API update

* Bugfixing and API

* Bugfix

* Fix long words OOM by skipping sentences

* bugfixing and api update

* Added language flavour

* Added early stopping condition

* Corrected naming

* Corrected permissions

* Bugfix

* Added GPU support at runtime

* Wrong config package

* Refactoring

* refactoring

* add lightning to dependencies

* Dummy test

* Dummy test

* Tweak

* Tweak

* Update test

* Test

* Finished loading for UD CONLL-U format

* Working on tagger

* Work on tagger

* tagger training

* tagger training

* tagger training

* Sync

* Sync

* Sync

* Sync

* Tagger working

* Better weight for aux loss

* Better weight for aux loss

* Added save and printing for tagger and shared options class

* Multilanguage evaluation

* Saving multiple models

* Updated ignore list

* Added XLM-Roberta support

* Using custom ro model

* Score update

* Bugfixing

* Code refactor

* Refactor

* Added option to load external config

* Added option to select LM-model from CLI or config

* added option to overwrite config lm from CLI

* Bugfix

* Working on parser

* Sync work on parser

* Parser working

* Removed load limit

* Bugfix in evaluation

* Added bi-affine attention

* Added experimental ChuLiuEdmonds tree decoding

* Better config for parser and bugfix

* Added residuals to tagging

* Model update

* Switched to AdamW optimizer

* Working on tokenizer

* Working on tokenizer

* Training working - validation to do

* Bugfix in language id

* Working on tokenization validation

* Tokenizer working

* YAML update

* Bug in LMHelper

* Tagger is working

* Tokenizer is working

* bfix

* bfix

* Bugfix for bugfix :)

* Sync

* Tokenizer worker

* Tagger working

* Trainer updates

* Trainer process now working

* Added .DS_Store

* Added datasets for Compound Word Expander and Lemmatizer

* Added collate function for lemma+compound

* Added training and validation step

* Updated config for Lemmatizer

* Minor fixes

* Removed duplicate entries from lemma and cwe

* Added training support for lemmatizer

* Removed debug directives

* Lemmatizer in testing phase

* removed unused line

* Bugfix in Lemma dataset

* Corrected validation issue with gs labels being sent to the forward method and removed loss computation during testing

* Lemmatizier training done

* Compound word expander ready

* Sync

* Added support for FastText, Transformers and Languasito LM models

* Added multi-lm support for tokenizer

* Added support for multiword tokens

* Sync

* Bugfix in evaluation

* Added Languasito as a subpackage

* Added path to local Languasito

* Bugfixing all around

* Removed debug printing

* Bugfix for no-space languages that actually contain spaces :)

* Bugfix for no-space languages that actually contain spaces :)

* Fixed GPU support

* Biaffine transform for LAS and relative head location (RHL) for UAS

* Bugfix

* Tweaks

* moved rhl to lower layer

* Added configurable option for RHL

* Safenet for spaces in languages that should use no spaces

* Better defaults

* Sync

* Cleanup parser

* Bilinear xpos and attrs

* Added Biaffine module from Stanza

* Tagger with reduced number of parameters:

* Parser with conditional attrs

* Working on tokenizer runtime

* Tokenizer process 90% done

* Added runtime for parser, tokenizer and tagger

* Added quick test for runtime

* Test for e2e

* Added support for multiple word embeddings at the same time

* Bugfix

* Added multiple word representations for tokenizer

* moved mask_concat to utils.py

* Added XPOS prediction to pipeline

* Bugfix in tokenizer shifted word embeddings

* Using Languasito tokenizer for HF tokenization

* Bugfix

* Bugfixing

* Bugfixing

* Bugfix

* Runtime fixing

* Sync

* Added spa for FT and Languasito

* Added spa for FT and Languasito

* Minor tweaks

* Added configuration for RNN layers

* Bugfix for spa

* HF runtime fix

* Mixed test fasttext+transformer

* Added word reconstruction and MHA

* Sync

* Bugfix

* bugfix

* Added masked attention

* Sync

* Added test for runtime

* Bugfix in mask values

* Updated test

* Added full mask dropout

* Added resume option

* Removed useless printouts

* Removed useless printouts

* Switched to eval at runtime

* multiprocessing added

* Added full mask dropout for word decoder

* Bugfix

* Residual

* Added lexical-contextual cosine loss

* Removed full mask dropout from WordDecoder

* Bugfix

* Training script generation update

* Added residual

* Updated languasito to pickle tokenized lines

* Updated languasito to pickle tokenized lines

* Updated languasito to pickle tokenized lines

* Not training for seq len > max_seq_len

* Added seq limmits for collates

* Passing seq limits from collate to tokenizer

* Skipping complex parsing

* Working on word decomposer

* Model update

* Sync

* Bugfix

* Bugfix

* Bugfix

* Using all reprs

* Dropped immediate context

* Multi train script added

* Changed gpu parameter type to string, for multiple gpus int failed

* Updated pytorch_lightning callback method to work with newer version

* Updated pytorch_lightning callback method to work with newer version

* Transparently pass PL args from the command line; skip over empty compound word datasets

* Fix typo

* Refactoring and on the way to working API

* API load working

* Partial _call_ working

* Partial _call_ working

* Added partly working api and refactored everything back to cube/. Compound not working yet and tokenizer needs retraining.

* api is working

* Fixing api

* Updated readme

* Update Readme to include flavours

* Device support

* api update

* Updated package

* Tweak + results

* Clarification

* Test update

* Update

* Sync

* Update README

* Bugfixing

* Bugfix and api update

* Fixed compound

* Evaluation update

* Bugfix

* Package update

* Bugfix for large sentences

* Pip package update

* Corrected spanish evaluation

* Package version update

* Fixed tokenization issues on transformers

* Removed pinned memory

* Bugfix for GPU tensors

* Update package version

* Automatically detecting hidden state size

* Automatically detecting hidden state size

* Automatically detecting hidden state size

* Sync

* Evaluation update

* Package update

* Bugfix

* Bugfixing

* Package version update

* Bugfix

* Package version update

* Update evaluation for Italian

* tentative support torchtext>=0.9.0 (#127)

as mentioned in Lightning-AI/pytorch-lightning#6211 and #100

* Update package dependencies

Co-authored-by: Stefan Dumitrescu <[email protected]>
Co-authored-by: dumitrescustefan <[email protected]>
Co-authored-by: Tiberiu Boros <[email protected]>
Co-authored-by: Tiberiu Boros <[email protected]>
Co-authored-by: Koichi Yasuoka <[email protected]>
tiberiu44 added a commit to adobe/NLP-Cube that referenced this pull request Feb 17, 2023
* tentative support torchtext>=0.9.0 (#127)

as mentioned in Lightning-AI/pytorch-lightning#6211 and #100

* Update package dependencies

* Dummy word embeddings

* Update params

* Better dropout values

* Skipping long words

* Skipping long words

* dummy we -> float

* Added gradient clipping

* Update tokenizer

* Update tokenizer

* Sync

* DCWE

* Working on DCWE

---------

Co-authored-by: Stefan Dumitrescu <[email protected]>
Co-authored-by: Tiberiu Boros <[email protected]>
Co-authored-by: Koichi Yasuoka <[email protected]>