This repository was archived by the owner on Sep 10, 2025. It is now read-only.

Description
🐛 Bug
from typing import Iterable, Iterator, List
from torchtext.datasets import IWSLT2017
from torchtext.vocab import build_vocab_from_iterator

special_symbols = ['<unk>', '<pad>', '<bos>', '<eos>']

def yield_tokens(data_iter: Iterable) -> Iterator[List[str]]:
    # token_transform is the tokenizer, defined elsewhere (not shown here)
    for data_sample in data_iter:
        yield token_transform(data_sample)

train_iter = IWSLT2017(split="train")
build_vocab_from_iterator(yield_tokens(train_iter), min_freq=1, specials=special_symbols, special_first=True)
When I run the code above, it fails with the following error:
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://drive.google.com/uc?id=12ycYSzLIG253AFN35Y6qoyf9wtkOjakp
I think the download link for the dataset is broken.
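
To check whether the link itself is the problem, independently of torchtext, the URL from the error message can be requested directly. The following is a minimal sketch using only the Python standard library; `http_status` is a hypothetical helper written for this report, not part of torchtext, and the `opener` parameter is only there so the function can be exercised without network access.

```python
import urllib.error
import urllib.request

# URL copied verbatim from the error message in this report.
DATA_URL = "https://drive.google.com/uc?id=12ycYSzLIG253AFN35Y6qoyf9wtkOjakp"


def http_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for `url` (hypothetical helper, not a torchtext API)."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the status code is on the exception.
        return err.code
```

Calling `http_status(DATA_URL)` and getting 404 back would suggest the file is no longer available at that address, rather than the failure being in the vocabulary-building code.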