
Conversation

@dbonner
Contributor

@dbonner dbonner commented Feb 26, 2021

What does this PR do?

The name Batch is no longer located under torchtext.data
Batch is now located under torchtext.legacy.data

--Error message--

File "/home/daniel/py38/lib/python3.8/site-packages/pytorch_lightning/utilities/apply_func.py", line 28, in <module>                                                      
    from torchtext.data import Batch                                                  
ImportError: cannot import name 'Batch' from 'torchtext.data' (/home/daniel/py38/lib/python3.8/site-packages/torchtext/data/__init__.py)

You can fix this by changing line 28 of pytorch_lightning/utilities/apply_func.py to:
from torchtext.legacy.data import Batch
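
A minimal sketch of a backwards-compatible import (assuming a plain try/except fallback; the change that eventually landed uses an explicit version/module check, see the later comments):

    # Sketch: prefer the new legacy namespace, fall back to the old location.
    try:
        from torchtext.legacy.data import Batch  # torchtext >= 0.9.0
    except ImportError:
        from torchtext.data import Batch  # torchtext < 0.9.0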

Closes #6168
Closes #6165

Did you have fun?

Yes :)

@codecov

codecov bot commented Feb 26, 2021

Codecov Report

Merging #6211 (2e21e4c) into master (e7298b5) will decrease coverage by 2%.
The diff coverage is 75%.

@@           Coverage Diff           @@
##           master   #6211    +/-   ##
=======================================
- Coverage      93%     91%    -2%     
=======================================
  Files         159     159            
  Lines       11381   11384     +3     
=======================================
- Hits        10626   10362   -264     
- Misses        755    1022   +267     

@dbonner
Contributor Author

dbonner commented Feb 26, 2021

After my third commit, the code imports from either torchtext.data or torchtext.legacy.data, depending on the installed torchtext version.

@rohitgr7 rohitgr7 linked an issue Feb 26, 2021 that may be closed by this pull request
@dbonner
Contributor Author

dbonner commented Feb 26, 2021

@rohitgr7 No problem. My fourth commit uses distutils.version.LooseVersion
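
A sketch of that version-based guard (assuming torchtext exposes __version__ and that Batch moved to the legacy namespace in torchtext 0.9.0):

    from distutils.version import LooseVersion

    import torchtext

    # Pick the import location based on the installed torchtext version.
    if LooseVersion(torchtext.__version__) >= LooseVersion("0.9.0"):
        from torchtext.legacy.data import Batch
    else:
        from torchtext.data import Batch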

@Borda Borda added bug Something isn't working ready PRs ready to be merged labels Feb 26, 2021
@Borda
Collaborator

Borda commented Feb 26, 2021

@dbonner could you pls restore the checklist in the PR description and mark what is done/missing?

@dbonner
Contributor Author

dbonner commented Feb 26, 2021

Hi @Borda
The failed checks are:

  1. Docs check-make.docs
  2. CircleCI-TPU Tests-build-Docs (counted as 2 errors)

The errors are the same (below).
I don't understand how they are related to my PR.
Also, the modules listed below do exist so I'm not sure why the build fails to find them:

Failed to import 'pytorch_lightning.profiler': no module named pytorch_lightning.profiler
Warning, treated as error:
[autosummary] failed to import 'pytorch_lightning.callbacks.BackboneFinetuning': no module named pytorch_lightning.callbacks.BackboneFinetuning
make: *** [Makefile:19: html] Error 2
Error: Process completed with exit code 2.

The CircleCI error ends with:
Makefile:19: recipe for target 'html' failed
make: *** [html] Error 2
Exited with code exit status 2
CircleCI received exit code 2

Please let me know if I should be changing any code to fix these errors.
Much appreciated,
Dan

@rohitgr7
Contributor

@dbonner can you try rebasing your PR with origin/master?

The name Batch is no longer located under torchtext.data
--Error message--
File "/home/daniel/py38/lib/python3.8/site-packages/pytorch_lightning/utilities/apply_func.py", line 25, in <module>                                                      
    from torchtext.data import Batch                                                  
ImportError: cannot import name 'Batch' from 'torchtext.data' (/home/daniel/py38/lib/python3.8/site-packages/torchtext/data/__init__.py)
You can fix this by changing line 28 to:
    from torchtext.legacy.data import Batch
@dbonner
Contributor Author

dbonner commented Feb 26, 2021

Actually, when I try to import pytorch_lightning.callbacks.BackboneFinetuning, it fails. But I didn't think that was because of my patch.
ci/circle-ci build-docs:
failed to import 'pytorch_lightning.callbacks.BackboneFinetuning': no module named pytorch_lightning.callbacks.BackboneFinetuning
Makefile:19: recipe for target 'html' failed
make: *** [html] Error 2

@dbonner
Contributor Author

dbonner commented Feb 26, 2021

Needed to use:
if _module_available("torchtext.legacy.data"):
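
A sketch of that availability-based guard (assuming _module_available is importable from pytorch_lightning.utilities.imports):

    from pytorch_lightning.utilities.imports import _module_available

    # Use the legacy namespace when it exists (torchtext >= 0.9.0),
    # otherwise fall back to the old import location.
    if _module_available("torchtext.legacy.data"):
        from torchtext.legacy.data import Batch
    else:
        from torchtext.data import Batch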

@dbonner
Contributor Author

dbonner commented Feb 26, 2021

Only CodeCov/project is failing now. I don't know why.

Contributor

@awaelchli awaelchli left a comment


clean!

@rohitgr7 rohitgr7 merged commit ee5032a into Lightning-AI:master Feb 26, 2021
@dbonner
Contributor Author

dbonner commented Feb 26, 2021

Thanks everyone :)

@Borda
Collaborator

Borda commented Feb 26, 2021

Only CodeCov/project is failing now. I don't know why.

most likely your changes were in lines that are not covered by tests...

@rohitgr7
Contributor

Only CodeCov/project is failing now. I don't know why.

most likely your changes were in lines that are not covered by tests...

Codecov has been failing a lot recently, even when the changes are covered by tests. Sometimes rerunning the same commits makes it pass.

@tchaton tchaton added this to the 1.2.x milestone Mar 2, 2021
kaushikb11 pushed a commit to kaushikb11/pytorch-lightning that referenced this pull request Mar 2, 2021
…6211)

* Update apply_func.py

The name Batch is no longer located under torchtext.data
--Error message--
File "/home/daniel/py38/lib/python3.8/site-packages/pytorch_lightning/utilities/apply_func.py", line 25, in <module>                                                      
    from torchtext.data import Batch                                                  
ImportError: cannot import name 'Batch' from 'torchtext.data' (/home/daniel/py38/lib/python3.8/site-packages/torchtext/data/__init__.py)
You can fix this by changing line 28 to:
    from torchtext.legacy.data import Batch

* Update apply_func.py

* Update apply_func.py

* Update apply_func.py

* Update apply_func.py

* Update apply_func.py
@Borda Borda mentioned this pull request Mar 2, 2021
KoichiYasuoka added a commit to KoichiYasuoka/NLP-Cube that referenced this pull request Aug 12, 2021
tiberiu44 pushed a commit to adobe/NLP-Cube that referenced this pull request Aug 15, 2021
tiberiu44 added a commit to adobe/NLP-Cube that referenced this pull request Aug 27, 2021
* Partial update

* Bugfix

* API update

* Bugfixing and API

* Bugfix

* Fix long words OOM by skipping sentences

* bugfixing and api update

* Added language flavour

* Added early stopping condition

* Corrected naming

* Corrected permissions

* Bugfix

* Added GPU support at runtime

* Wrong config package

* Refactoring

* refactoring

* add lightning to dependencies

* Dummy test

* Dummy test

* Tweak

* Tweak

* Update test

* Test

* Finished loading for UD CONLL-U format

* Working on tagger

* Work on tagger

* tagger training

* tagger training

* tagger training

* Sync

* Sync

* Sync

* Sync

* Tagger working

* Better weight for aux loss

* Better weight for aux loss

* Added save and printing for tagger and shared options class

* Multilanguage evaluation

* Saving multiple models

* Updated ignore list

* Added XLM-Roberta support

* Using custom ro model

* Score update

* Bugfixing

* Code refactor

* Refactor

* Added option to load external config

* Added option to select LM-model from CLI or config

* added option to overwrite config lm from CLI

* Bugfix

* Working on parser

* Sync work on parser

* Parser working

* Removed load limit

* Bugfix in evaluation

* Added bi-affine attention

* Added experimental ChuLiuEdmonds tree decoding

* Better config for parser and bugfix

* Added residuals to tagging

* Model update

* Switched to AdamW optimizer

* Working on tokenizer

* Working on tokenizer

* Training working - validation to do

* Bugfix in language id

* Working on tokenization validation

* Tokenizer working

* YAML update

* Bug in LMHelper

* Tagger is working

* Tokenizer is working

* bfix

* bfix

* Bugfix for bugfix :)

* Sync

* Tokenizer worker

* Tagger working

* Trainer updates

* Trainer process now working

* Added .DS_Store

* Added datasets for Compound Word Expander and Lemmatizer

* Added collate function for lemma+compound

* Added training and validation step

* Updated config for Lemmatizer

* Minor fixes

* Removed duplicate entries from lemma and cwe

* Added training support for lemmatizer

* Removed debug directives

* Lemmatizer in testing phase

* removed unused line

* Bugfix in Lemma dataset

* Corrected validation issue with gs labels being sent to the forward method and removed loss computation during testing

* Lemmatizier training done

* Compound word expander ready

* Sync

* Added support for FastText, Transformers and Languasito LM models

* Added multi-lm support for tokenizer

* Added support for multiword tokens

* Sync

* Bugfix in evaluation

* Added Languasito as a subpackage

* Added path to local Languasito

* Bugfixing all around

* Removed debug printing

* Bugfix for no-space languages that actually contain spaces :)

* Bugfix for no-space languages that actually contain spaces :)

* Fixed GPU support

* Biaffine transform for LAS and relative head location (RHL) for UAS

* Bugfix

* Tweaks

* moved rhl to lower layer

* Added configurable option for RHL

* Safenet for spaces in languages that should use no spaces

* Better defaults

* Sync

* Cleanup parser

* Bilinear xpos and attrs

* Added Biaffine module from Stanza

* Tagger with reduced number of parameters:

* Parser with conditional attrs

* Working on tokenizer runtime

* Tokenizer process 90% done

* Added runtime for parser, tokenizer and tagger

* Added quick test for runtime

* Test for e2e

* Added support for multiple word embeddings at the same time

* Bugfix

* Added multiple word representations for tokenizer

* moved mask_concat to utils.py

* Added XPOS prediction to pipeline

* Bugfix in tokenizer shifted word embeddings

* Using Languasito tokenizer for HF tokenization

* Bugfix

* Bugfixing

* Bugfixing

* Bugfix

* Runtime fixing

* Sync

* Added spa for FT and Languasito

* Added spa for FT and Languasito

* Minor tweaks

* Added configuration for RNN layers

* Bugfix for spa

* HF runtime fix

* Mixed test fasttext+transformer

* Added word reconstruction and MHA

* Sync

* Bugfix

* bugfix

* Added masked attention

* Sync

* Added test for runtime

* Bugfix in mask values

* Updated test

* Added full mask dropout

* Added resume option

* Removed useless printouts

* Removed useless printouts

* Switched to eval at runtime

* multiprocessing added

* Added full mask dropout for word decoder

* Bugfix

* Residual

* Added lexical-contextual cosine loss

* Removed full mask dropout from WordDecoder

* Bugfix

* Training script generation update

* Added residual

* Updated languasito to pickle tokenized lines

* Updated languasito to pickle tokenized lines

* Updated languasito to pickle tokenized lines

* Not training for seq len > max_seq_len

* Added seq limmits for collates

* Passing seq limits from collate to tokenizer

* Skipping complex parsing

* Working on word decomposer

* Model update

* Sync

* Bugfix

* Bugfix

* Bugfix

* Using all reprs

* Dropped immediate context

* Multi train script added

* Changed gpu parameter type to string, for multiple gpus int failed

* Updated pytorch_lightning callback method to work with newer version

* Updated pytorch_lightning callback method to work with newer version

* Transparently pass PL args from the command line; skip over empty compound word datasets

* Fix typo

* Refactoring and on the way to working API

* API load working

* Partial _call_ working

* Partial _call_ working

* Added partly working api and refactored everything back to cube/. Compound not working yet and tokenizer needs retraining.

* api is working

* Fixing api

* Updated readme

* Update Readme to include flavours

* Device support

* api update

* Updated package

* Tweak + results

* Clarification

* Test update

* Update

* Sync

* Update README

* Bugfixing

* Bugfix and api update

* Fixed compound

* Evaluation update

* Bugfix

* Package update

* Bugfix for large sentences

* Pip package update

* Corrected spanish evaluation

* Package version update

* Fixed tokenization issues on transformers

* Removed pinned memory

* Bugfix for GPU tensors

* Update package version

* Automatically detecting hidden state size

* Automatically detecting hidden state size

* Automatically detecting hidden state size

* Sync

* Evaluation update

* Package update

* Bugfix

* Bugfixing

* Package version update

* Bugfix

* Package version update

* Update evaluation for Italian

* tentative support torchtext>=0.9.0 (#127)

as mentioned in Lightning-AI/pytorch-lightning#6211 and #100

* Update package dependencies

Co-authored-by: Stefan Dumitrescu <[email protected]>
Co-authored-by: dumitrescustefan <[email protected]>
Co-authored-by: Tiberiu Boros <[email protected]>
Co-authored-by: Tiberiu Boros <[email protected]>
Co-authored-by: Koichi Yasuoka <[email protected]>
tiberiu44 added a commit to adobe/NLP-Cube that referenced this pull request Feb 17, 2023
* tentative support torchtext>=0.9.0 (#127)

as mentioned in Lightning-AI/pytorch-lightning#6211 and #100

* Update package dependencies

* Dummy word embeddings

* Update params

* Better dropout values

* Skipping long words

* Skipping long words

* dummy we -> float

* Added gradient clipping

* Update tokenizer

* Update tokenizer

* Sync

* DCWE

* Working on DCWE

---------

Co-authored-by: Stefan Dumitrescu <[email protected]>
Co-authored-by: Tiberiu Boros <[email protected]>
Co-authored-by: Koichi Yasuoka <[email protected]>