Usage is aggregated across all model calls during the run (including tool calls and handoffs). With `include_usage=True`, LiteLLM requests report token and request counts through `result.context_wrapper.usage` just like the built-in OpenAI models.
### Enabling usage with LiteLLM models
LiteLLM providers do not report usage metrics by default. When you are using [`LitellmModel`](models/litellm.md), pass `ModelSettings(include_usage=True)` to your agent so that LiteLLM responses populate `result.context_wrapper.usage`.
```python
from agents import Agent, ModelSettings, Runner
from agents.extensions.models.litellm_model import LitellmModel

agent = Agent(
    name="Assistant",
    # Substitute your own LiteLLM model identifier and API key
    model=LitellmModel(model="your-model-name", api_key="..."),
    model_settings=ModelSettings(include_usage=True),
)

result = await Runner.run(agent, "What's the weather in Tokyo?")
print(result.context_wrapper.usage.total_tokens)
```
## Accessing usage with sessions
When you use a `Session` (e.g., `SQLiteSession`), each call to `Runner.run(...)` returns usage for that specific run. Sessions maintain conversation history for context, but each run's usage is independent.
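Because each run reports its own usage, a session-wide total has to be summed by the caller. A minimal sketch of that bookkeeping, using a hypothetical stand-in `Usage` dataclass in place of the SDK's `result.context_wrapper.usage` object (both expose a `total_tokens` field), with made-up token counts:

```python
from dataclasses import dataclass


# Hypothetical stand-in for result.context_wrapper.usage; the real SDK
# object also exposes a total_tokens field.
@dataclass
class Usage:
    total_tokens: int


# Usage reported by two independent runs in the same session
# (illustrative values, not real API output).
run_usages = [Usage(total_tokens=120), Usage(total_tokens=85)]

# Each run's usage is independent, so the session-wide total is the
# caller's responsibility to accumulate.
session_total = sum(u.total_tokens for u in run_usages)
print(session_total)  # → 205
```

In practice you would append `result.context_wrapper.usage` to such a list after each `Runner.run(...)` call against the session.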