Commit 5512a04

Update output_formatters.py to use gpt-4o's tokenizer
1 parent d36b3a0 commit 5512a04

1 file changed: +1 -1 lines changed

src/gitingest/output_formatters.py

Lines changed: 1 addition & 1 deletion
@@ -171,7 +171,7 @@ def _format_token_count(text: str) -> Optional[str]:
         The formatted number of tokens as a string (e.g., '1.2k', '1.2M'), or `None` if an error occurs.
     """
     try:
-        encoding = tiktoken.get_encoding("cl100k_base")
+        encoding = tiktoken.get_encoding("o200k_base")  # gpt-4o, gpt-4o-mini
         total_tokens = len(encoding.encode(text, disallowed_special=()))
     except (ValueError, UnicodeEncodeError) as exc:
         print(exc)
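
The hunk shows only the middle of `_format_token_count`. For context, here is a minimal sketch of how a token count with the new encoding might be turned into the '1.2k' / '1.2M' strings described in the docstring. The names `humanize_count` and `count_tokens`, the thresholds, and the lazy import are assumptions for illustration, not the repository's actual code; only the `tiktoken.get_encoding` and `encoding.encode` calls are taken from the diff.

```python
from typing import Optional


def humanize_count(n: int) -> str:
    """Format a count compactly, e.g. 1200 -> '1.2k' (hypothetical helper)."""
    # Assumed thresholds: millions first, then thousands.
    for threshold, suffix in ((1_000_000, "M"), (1_000, "k")):
        if n >= threshold:
            return f"{n / threshold:.1f}{suffix}"
    return str(n)


def count_tokens(text: str, encoding_name: str = "o200k_base") -> Optional[int]:
    """Return the number of tokens in `text`, or None if counting fails."""
    try:
        # Imported lazily so humanize_count can be used without tiktoken installed.
        import tiktoken

        encoding = tiktoken.get_encoding(encoding_name)
        # disallowed_special=() treats special-token text as ordinary text,
        # matching the call in the diff above.
        return len(encoding.encode(text, disallowed_special=()))
    except (ImportError, ValueError, UnicodeEncodeError) as exc:
        print(exc)
        return None
```

Keeping the encoding name as a parameter means a future tokenizer change would not need another code edit. `tiktoken.get_encoding` raises `ValueError` for an encoding name it does not know, which the existing `except` clause already handles.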
