Description
Checks
- I have updated to the latest minor and patch version of Strands
- I have checked the documentation and this is not expected behavior
- I have searched ./issues and there are no duplicates of my issue
Strands Version
1.7.1
Python Version
3.13.7
Operating System
MacOS
Installation Method
pip
Steps to Reproduce
# exa_search_and_contents and system_prompt are defined elsewhere in my code
from strands import Agent
from strands.agent.conversation_manager import SummarizingConversationManager
from strands.models.openai import OpenAIModel

openai_model = OpenAIModel(
    client_args={
        "api_key": "*************************************************************",
        "base_url": "https://api.groq.com/openai/v1",
        "_strict_response_validation": False
    },
    model_id="moonshotai/kimi-k2-instruct-0905",
    params={
        "temperature": 0.7,
        "max_tokens": 8192
    }
)

# Create conversation manager with summarizing strategy
# This will summarize older messages to prevent context length exceeded errors
conversation_manager = SummarizingConversationManager(
    summary_ratio=0.3,
    preserve_recent_messages=3,
    summarization_agent=openai_model
)

# Create agent with OpenAI model, Exa tools, and conversation manager
agent = Agent(
    model=openai_model,
    tools=[exa_search_and_contents],  # Custom tool combining exa_search and exa_get_contents
    system_prompt=system_prompt,
    conversation_manager=conversation_manager
)
Expected Behavior
When the model fails to respond because the context is too long, the conversation manager is supposed to shrink the context by summarizing older messages so the request can succeed.
Actual Behavior
Some platforms, such as Groq, enforce a tokens-per-minute (TPM) limit. Once the conversation has grown too long, every request exceeds the TPM cap and raises the same exception; retrying later does not help, because the problem is the message size, not the request rate. This is exactly the situation in which SummarizingConversationManager should trigger.
However, SummarizingConversationManager does not handle it, because it only reacts to ContextWindowOverflowException, and this error surfaces as a TPM rate-limit error, not a context overflow.
ERROR:backend_strands_multi_agent_with_groq_custom_tool_updated:Error in multi-agent research workflow: Error code: 413 - {'error': {'message': 'Request too large for model moonshotai/kimi-k2-instruct-0905 in organization org_01hvwykqttfc1brn6gydxkxgre service tier on_demand on tokens per minute (TPM): Limit 250000, Requested 252062, please reduce your message size and try again. Need more tokens? Visit https://groq.com/self-serve-support/ to request higher limits.', 'type': 'tokens', 'code': 'rate_limit_exceeded'}}
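The mismatch is visible in the payload above: the manager's trigger condition looks only for a context-window overflow, while Groq reports a token-size rejection (HTTP 413, type 'tokens', code 'rate_limit_exceeded'). A minimal sketch of a broader check; the function name is hypothetical and not part of the Strands SDK:

```python
# Hypothetical helper, not Strands API: decide whether a provider error
# means "this single request is too large", which is the condition
# SummarizingConversationManager would need to react to.
def is_request_too_large(status_code: int, error: dict) -> bool:
    # Groq returns HTTP 413 when one request alone exceeds the TPM limit.
    if status_code == 413:
        return True
    # Groq also tags the payload: type 'tokens', code 'rate_limit_exceeded'.
    if error.get("type") == "tokens" and error.get("code") == "rate_limit_exceeded":
        return True
    # Classic OpenAI-style context overflow code.
    return error.get("code") == "context_length_exceeded"

# The payload from the traceback above is classified as oversized:
groq_error = {"type": "tokens", "code": "rate_limit_exceeded"}
print(is_request_too_large(413, groq_error))  # True
```

Note that a genuine burst rate limit (HTTP 429 with type 'requests') would still fall through this check, so ordinary retry logic stays intact.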
Additional Context
I also compared the anthropic.py, bedrock.py, and openai.py model providers in the SDK.
The first two do raise ContextWindowOverflowException when the context is too long. However, openai.py never raises ContextWindowOverflowException under any circumstance, so that file also needs to be fixed.
Possible Solution
- Fix openai.py to raise ContextWindowOverflowException when the provider reports that the context is too long
- For all model providers, raise the same (or a similar) exception when a single request exceeds the tokens-per-minute (TPM) limit, so that SummarizingConversationManager (summarizing_conversation_manager.py) can handle it
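The two fixes above could be prototyped by translating oversized-request errors from the provider into the exception the conversation manager already handles. A rough sketch under stated assumptions: a stand-in exception class is defined here instead of importing the SDK, and translate_provider_error is a hypothetical name, not an existing Strands function:

```python
class ContextWindowOverflowException(Exception):
    """Stand-in for the Strands SDK exception of the same name."""


def translate_provider_error(status_code: int, error: dict) -> None:
    """Re-raise oversized-request errors as ContextWindowOverflowException.

    Hypothetical helper: openai.py (and the other providers) could call
    this after catching an API error, so SummarizingConversationManager
    sees the exception type it already knows how to handle.
    """
    oversized = (
        status_code == 413  # Groq: a single request exceeds the TPM limit
        or error.get("code") == "context_length_exceeded"  # OpenAI overflow
    )
    if oversized:
        raise ContextWindowOverflowException(error.get("message", ""))
    # Anything else (e.g. a genuine 429 burst limit) propagates unchanged.


try:
    translate_provider_error(413, {"message": "Request too large", "type": "tokens"})
except ContextWindowOverflowException as e:
    print(f"summarization would trigger: {e}")
```

With this in place, the existing reduce-context path in the conversation manager would fire for both classic overflows and Groq-style 413 rejections, with no change needed on the caller's side.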
Related Issues
No response