
[BUG] SummarizingConversationManager doesn't work for OpenAI models #860

@girishp1983

Description


Checks

  • I have updated to the latest minor and patch version of Strands
  • I have checked the documentation and this is not expected behavior
  • I have searched ./issues and there are no duplicates of my issue

Strands Version

1.7.1

Python Version

3.13.7

Operating System

MacOS

Installation Method

pip

Steps to Reproduce

openai_model = OpenAIModel(
    client_args={
        "api_key": "*************************************************************",
        "base_url": "https://api.groq.com/openai/v1",
        "_strict_response_validation": False
    },
    model_id="moonshotai/kimi-k2-instruct-0905",
    params={
        "temperature": 0.7,
        "max_tokens": 8192
    }
)

# Create conversation manager with summarizing strategy
# This will summarize older messages to prevent context length exceeded errors
conversation_manager = SummarizingConversationManager(
    summary_ratio=0.3,
    preserve_recent_messages=3,
    summarization_agent=openai_model
)

# Create agent with OpenAI model, Exa tools, and conversation manager
agent = Agent(
    model=openai_model,
    tools=[exa_search_and_contents],  # Using custom tool instead of separate exa_search and exa_get_contents
    system_prompt=system_prompt,
    conversation_manager=conversation_manager
)

Expected Behavior

When the model fails to respond because of a context length issue, this conversation manager is supposed to shrink the context by summarizing older messages.

Actual Behavior

Some platforms, such as Groq, enforce a tokens-per-minute (TPM) limit. Once the conversation has grown too long, every request consistently exceeds that limit, so every call fails with an exception; retrying later does not help. In such cases SummarizingConversationManager should trigger.

However, SummarizingConversationManager never does, because it only reacts to ContextWindowOverflowException; this failure is a TPM (rate-limit) error, not a context-window overflow.
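To illustrate the trigger path, here is a minimal stand-in (not the actual SDK code; only the ContextWindowOverflowException name mirrors Strands, everything else is hypothetical): summarization runs only when that specific exception is caught, so a provider rate-limit error sails past untouched.

```python
# Minimal stand-in showing why a TPM error is never handled: only
# ContextWindowOverflowException triggers context reduction; any other
# exception (e.g. Groq's HTTP 413 TPM error) propagates unchanged.

class ContextWindowOverflowException(Exception):
    """Stand-in for the Strands SDK exception."""

class RateLimitError(Exception):
    """Stand-in for the provider's HTTP 413 TPM error."""

def run_with_context_management(call_model, reduce_context):
    try:
        return call_model()
    except ContextWindowOverflowException:
        reduce_context()     # summarization kicks in
        return call_model()  # retried with a smaller context
    # RateLimitError is NOT caught here, so it bubbles up to the caller.

events = []

def flaky_model():
    # First call overflows the context window, second call succeeds.
    if not events:
        events.append("overflow")
        raise ContextWindowOverflowException("context too long")
    return "ok"

result = run_with_context_management(flaky_model, lambda: events.append("summarized"))
print(result)  # -> ok
print(events)  # -> ['overflow', 'summarized']
```

A call that raises RateLimitError instead would escape `run_with_context_management` entirely, which matches the traceback below.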

ERROR:backend_strands_multi_agent_with_groq_custom_tool_updated:Error in multi-agent research workflow: Error code: 413 - {'error': {'message': 'Request too large for model moonshotai/kimi-k2-instruct-0905 in organization org_01hvwykqttfc1brn6gydxkxgre service tier on_demand on tokens per minute (TPM): Limit 250000, Requested 252062, please reduce your message size and try again. Need more tokens? Visit https://groq.com/self-serve-support/ to request higher limits.', 'type': 'tokens', 'code': 'rate_limit_exceeded'}}

Additional Context

I also compared the anthropic.py, bedrock.py, and openai.py files in the SDK.

The first two do raise ContextWindowOverflowException when the context is too long. However, openai.py never raises ContextWindowOverflowException under any circumstance, so that file also needs to be fixed.
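A sketch of what the missing translation layer could look like (the `context_length_exceeded` error code is a real OpenAI API detail, but the wrapper and the stand-in error class are hypothetical, so the snippet runs without the openai package installed):

```python
# Hypothetical error-translation layer for an OpenAI-compatible provider,
# mirroring what anthropic.py and bedrock.py already do.

class ContextWindowOverflowException(Exception):
    """Stand-in for the Strands SDK exception."""

class BadRequestError(Exception):
    """Stand-in for openai.BadRequestError; carries the API error code."""
    def __init__(self, message, code):
        super().__init__(message)
        self.code = code

def call_openai(raw_call):
    try:
        return raw_call()
    except BadRequestError as e:
        # OpenAI reports an oversized prompt as code "context_length_exceeded".
        if e.code == "context_length_exceeded":
            raise ContextWindowOverflowException(str(e)) from e
        raise  # all other provider errors pass through untouched

def too_long():
    raise BadRequestError("maximum context length exceeded",
                          "context_length_exceeded")

try:
    call_openai(too_long)
except ContextWindowOverflowException as e:
    print("translated:", e)  # -> translated: maximum context length exceeded
```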

Possible Solution

  1. Fix openai.py to raise ContextWindowOverflowException when the provider reports a context-length error.
  2. For all models, raise the same (or a similar) exception when the tokens-per-minute (TPM) limit is exceeded, so that SummarizingConversationManager can handle it.
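For the second point, a provider-agnostic classifier could treat a token-based 413 the same as a context overflow. The payload fields (status 413, `type: "tokens"`, `code: "rate_limit_exceeded"`) come from the Groq error above; the helper itself is hypothetical:

```python
# Hypothetical classifier: decide whether a provider error should trigger
# context reduction, based on the Groq error payload shown above.

def is_context_overflow(status_code, error_body):
    err = error_body.get("error", {})
    # Classic context-window overflow (OpenAI-style).
    if err.get("code") == "context_length_exceeded":
        return True
    # Groq-style TPM limit: HTTP 413 with a token-based rate limit.
    if (status_code == 413
            and err.get("type") == "tokens"
            and err.get("code") == "rate_limit_exceeded"):
        return True
    return False

groq_413 = {"error": {"message": "Request too large ... (TPM)",
                      "type": "tokens", "code": "rate_limit_exceeded"}}
print(is_context_overflow(413, groq_413))                                    # -> True
print(is_context_overflow(429, {"error": {"code": "rate_limit_exceeded"}}))  # -> False
```

A request-based 429 rate limit is deliberately excluded: waiting fixes it, so shrinking the context would discard history for nothing.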

Related Issues

No response

Metadata

Labels

bug (Something isn't working)
