Make BERT benchmark code more robust #1871
Merged
Updated the benchmark code to run on a pre-defined number of samples and batch size. Running on a larger number of samples gives more robust statistics because 1) we show more variable-length samples to the tokenizer, and 2) we run over a larger number of batches instead of just one, as is currently the case.
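For reference, a minimal sketch of the benchmarking setup described above (not the actual PR code; the vocab path, placeholder corpus, and printed labels are assumptions for illustration):

```python
import timeit

from torchtext.transforms import BERTTokenizer
from transformers import BertTokenizer, BertTokenizerFast

VOCAB_PATH = "bert-base-uncased-vocab.txt"  # assumed local vocab file
NUM_SAMPLES = 100000
BATCH_SIZE = 100

# Placeholder corpus; the real benchmark feeds variable-length text samples.
samples = ["some variable length sentence"] * NUM_SAMPLES
batches = [samples[i:i + BATCH_SIZE] for i in range(0, NUM_SAMPLES, BATCH_SIZE)]

tt_tokenizer = BERTTokenizer(vocab_path=VOCAB_PATH)
hf_slow = BertTokenizer.from_pretrained("bert-base-uncased")
hf_fast = BertTokenizerFast.from_pretrained("bert-base-uncased")

# Non-batched: one operator call per sample.
print("TorchText:", timeit.timeit(lambda: [tt_tokenizer(s) for s in samples], number=1))
print("HF (slow):", timeit.timeit(lambda: [hf_slow.tokenize(s) for s in samples], number=1))
print("HF (fast):", timeit.timeit(lambda: [hf_fast.tokenize(s) for s in samples], number=1))

# Batched: one operator call per batch, amortizing per-call overhead.
print("TorchText:", timeit.timeit(lambda: [tt_tokenizer(b) for b in batches], number=1))
print("HF (fast):", timeit.timeit(lambda: [hf_fast(b) for b in batches], number=1))
```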
Benchmark results

Number of samples: 100000

**Non-batched input**

| Tokenizer | Time (s) |
| --- | --- |
| TorchText BERT Tokenizer | 1.7653241670000002 |
| HF BERT Tokenizer (slow) | 27.455106365 |
| HF BERT Tokenizer (fast) | 5.351107693000003 |

**Batched input**

| Batch size | TorchText BERT Tokenizer (s) | HF BERT Tokenizer (fast) (s) |
| --- | --- | --- |
| 50 | 1.376252063 | 1.5889374279999995 |
| 100 | 1.3049638119999996 | 1.4069846630000002 |
| 200 | 1.275028583 | 1.2769447180000002 |
| 400 | 1.3523340929999996 | 1.2558808729999997 |
Apparently, for the HF tokenizer the operator call is slower than torchtext's, but the backend implementation is faster. This is why, once the batch size is increased past a certain point, the HF tokenizer (fast) becomes more performant than torchtext's implementation.
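A toy cost model makes the crossover easy to see. With total time = per-call overhead × number of calls + per-item cost × number of items, a tokenizer with higher call overhead but a faster backend wins once batches are large enough. The numbers below are purely hypothetical, chosen only to illustrate the shape of the trade-off:

```python
# Hypothetical cost model (made-up constants, not measured values).
def total_time(per_call_overhead, per_item_cost, num_items, batch_size):
    num_calls = num_items / batch_size
    return per_call_overhead * num_calls + per_item_cost * num_items

for bs in (50, 100, 200, 400):
    # torchtext-like: cheap operator call, slower per-item backend
    tt = total_time(2e-4, 1.2e-5, 100000, bs)
    # HF-fast-like: expensive operator call, faster per-item backend
    hf = total_time(5e-4, 1.0e-5, 100000, bs)
    print(f"batch={bs}: torchtext~{tt:.3f}s, hf~{hf:.3f}s")
```

With these constants the HF-like tokenizer overtakes the torchtext-like one between batch sizes 100 and 200, qualitatively matching the measured trend above.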