⚡️ Speed up function encoded_tokens_len by 39% in PR #231 (remove-tiktoken)
#236
⚡️ This pull request contains optimizations for PR #231
If you approve this dependent PR, these changes will be merged into the original PR branch `remove-tiktoken`.
📄 39% (0.39x) speedup for `encoded_tokens_len` in `codeflash/code_utils/code_utils.py`
⏱️ Runtime: 42.8 microseconds → 30.8 microseconds (best of 277 runs)
⚡️ This change will improve the performance of the following benchmarks:
📝 Explanation and details
Here is an optimized version of your code.
The arithmetic itself is already cheap: `len()` on a Python string is a constant-time lookup, and the multiplication and `int()` conversion are fast. The remaining overhead comes from the floating-point operations in `len(s) * 0.3`, so we can switch to pure integer arithmetic: multiplying by 0.3 and truncating is equivalent to multiplying by 3 and integer-dividing by 10. Here's the optimized code.
This avoids the floating-point multiplication and the `int()` cast, and is slightly faster. All comments and signatures are preserved.
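Below is a minimal sketch of the before/after shapes this explanation describes. It assumes `encoded_tokens_len` takes a single string and returns an approximate token count; the actual function in `codeflash/code_utils/code_utils.py` may have a different signature or surrounding code.

```python
# Hypothetical sketch of the optimization described above; the real
# encoded_tokens_len may differ in signature and docstring.

def encoded_tokens_len_original(s: str) -> int:
    # Original heuristic: float multiply, then truncate with int().
    return int(len(s) * 0.3)


def encoded_tokens_len(s: str) -> int:
    # Optimized heuristic: pure integer arithmetic, no float round-trip.
    return len(s) * 3 // 10
```

For an 11-character input both versions return 3 (`33 // 10` vs. `int(3.3)`), but the optimized one skips the float multiply and the `int()` cast entirely.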
✅ Correctness verification report:
🌀 Generated Regression Tests Details
To edit these changes, run `git checkout codeflash/optimize-pr231-2025-05-21T01.49.34` and push.