torchtext.legacy.datasets.IWSLT is unusable due to outdated URL

## 🐛 Bug

The IWSLT dataset's URL was updated some time in late 2020, as mentioned in #1091. When torchtext v0.9.0 updated the IWSLT datasets to use the new dataset URL on Google Drive (see #1115), the corresponding `torchtext.legacy.datasets.IWSLT` dataset was not updated to the new URL.

Consequently, using `torchtext.legacy.datasets.IWSLT` causes torchtext to download an HTML page with a 404 message instead of the actual dataset. Opening that file as a tar archive then fails with `OSError: Not a gzipped file`, followed by `ReadError: not a gzip file`.

**To Reproduce**

Code
```python
from torchtext.legacy import data, datasets
f = data.Field()
datasets.IWSLT.splits(exts=('.de', '.en'), fields=(f, f))
```

Output
```
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
/usr/lib/python3.7/tarfile.py in gzopen(cls, name, mode, fileobj, compresslevel, **kwargs)
   1645         try:
-> 1646             t = cls.taropen(name, mode, fileobj, **kwargs)
   1647         except OSError:

------⬍ 12 frames------
OSError: Not a gzipped file (b'<!')

During handling of the above exception, another exception occurred:

ReadError                                 Traceback (most recent call last)
/usr/lib/python3.7/tarfile.py in gzopen(cls, name, mode, fileobj, compresslevel, **kwargs)
   1648             fileobj.close()
   1649             if mode == 'r':
-> 1650                 raise ReadError("not a gzip file")
   1651             raise
   1652         except:

ReadError: not a gzip file
```
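
The `b'<!'` in the `OSError` is the first two bytes of the downloaded file, which look like the start of an HTML document rather than the gzip magic number (`\x1f\x8b`). A minimal sketch for checking a downloaded file by hand is shown here. The helper name `looks_like_gzip` and the file names are hypothetical, and the demo writes its own throwaway files rather than touching the real IWSLT download:

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


# Demonstration with two throwaway files (hypothetical names, not the real archive).
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "real.tgz")
    bad = os.path.join(tmp, "error_page.tgz")

    # A genuine gzip stream.
    with gzip.open(good, "wb") as fh:
        fh.write(b"payload")

    # An HTML error page saved under an archive name, as in this bug.
    with open(bad, "wb") as fh:
        fh.write(b"<!DOCTYPE html><html>404</html>")

    good_is_gzip = looks_like_gzip(good)
    bad_is_gzip = looks_like_gzip(bad)

print(good_is_gzip, bad_is_gzip)
```

Running the same check against the file that torchtext cached locally should indicate whether it is an archive or a saved error page.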

Note that switching to an earlier version of torchtext (e.g., v0.9 or v0.8) does not help, because it does not resolve the underlying third-party URL issue.

## Environment Info
* torchtext v0.10.0
* System: tested on Google Colab and local machine, details unimportant
