(feat): Improve bidi event loop #10
Conversation
    Coordinates background tasks for model event processing, tool execution, and audio
    handling while providing a simple interface for agent interactions.
    class BidirectionalEventLoop:
The naming is just a suggestion and open to discussion
nit: Agent loop, not event loop
    # Synchronization primitives
    self.interrupted = False
    self.interruption_lock = asyncio.Lock()
    self.conversation_lock = asyncio.Lock()  # Race condition prevention
This lock is used when adding to the conversation history, to ensure the history is not corrupted or overwritten. For example, if multiple tools complete at the same time, the lock helps ensure that all the results are written to the conversation history.
Does the order of writing tool results matter? I guess tools will execute as they appear, and the model will continue to stream its response.
Technically speaking, we shouldn't need a lock, for the following reasons:
- Async tasks run one at a time, in a cooperative manner. Because they run one at a time, there is no concern about resource conflicts when operating in memory.
- The messages array is first updated in memory, and it can only be altered by one task at a time.
- The session managers operate in hooks, which run synchronously. If hooks could run asynchronously, I would say it is up to them to add their own locks.
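A minimal sketch of the point being debated in this thread (illustrative code, not from the PR): a single synchronous append from multiple asyncio tasks is safe without a lock, while a read-modify-write that awaits in the middle is not.

```python
import asyncio

# Illustrative sketch: asyncio tasks run cooperatively and only yield at an
# await, so an update with no await inside cannot be interleaved. A
# read-modify-write spanning an await CAN lose updates without a lock.

messages: list[dict] = []
state = {"count": 0}

async def record_result(tool_id: str) -> None:
    # Single synchronous append: no await between read and write, so no
    # other task can interleave here.
    messages.append({"toolUseId": tool_id})

async def unsafe_increment() -> None:
    # The await between read and write lets every task read the same stale
    # value: the classic lost-update hazard a lock would prevent.
    current = state["count"]
    await asyncio.sleep(0)  # yields control to other tasks
    state["count"] = current + 1

async def main() -> None:
    await asyncio.gather(*(record_result(f"tool-{i}") for i in range(10)))
    await asyncio.gather(*(unsafe_increment() for _ in range(10)))

asyncio.run(main())
print(len(messages), state["count"])  # 10 1 — all unsafe increments but one are lost
```

This is why the lock question hinges on whether the history update ever awaits between reading and writing shared state.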
    """Start background tasks for model event processing and session supervision."""
    logger.debug("Starting bidirectional event loop")
    ...
    self.background_tasks = [
self.background_tasks is initialized as an empty list in the constructor and populated in this method on purpose. This helps with error handling and also ensures that the event loop instance is fully initialized before we create tasks.
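A sketch of the two-phase pattern this comment describes (method names taken from the diff; bodies are placeholders): the constructor only declares empty state, and start() creates the tasks once the instance is fully built.

```python
import asyncio

# Two-phase initialization sketch: __init__ declares state, start() creates
# tasks. Tasks therefore never observe a half-constructed instance, and a
# failure in start() still leaves a clean, stoppable object.

class BidirectionalEventLoop:
    def __init__(self) -> None:
        self.active = False
        self.background_tasks: list[asyncio.Task] = []  # filled in start()

    async def start(self) -> None:
        self.active = True
        self.background_tasks = [
            asyncio.create_task(self._process_model_events()),
            asyncio.create_task(self._supervise_session()),
        ]

    async def _process_model_events(self) -> None:
        await asyncio.sleep(0)  # placeholder body

    async def _supervise_session(self) -> None:
        await asyncio.sleep(0)  # placeholder body

    async def stop(self) -> None:
        self.active = False
        for task in self.background_tasks:
            task.cancel()
        await asyncio.gather(*self.background_tasks, return_exceptions=True)

async def main() -> int:
    event_loop = BidirectionalEventLoop()
    await event_loop.start()
    count = len(event_loop.background_tasks)
    await event_loop.stop()
    return count

task_count = asyncio.run(main())
print(task_count)  # 2
```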
    # Thread-safe counter increment
    current_tool_number = self.tool_count + 1
    self.tool_count = current_tool_number
    print(f"\nTool #{current_tool_number}: {tool_name}")
similar to PrintingCallbackHandler in sdk-python/src/strands/handlers/callback_handler.py
    break
    ...
    # Remove completed tasks from supervision list
    tasks_to_supervise = [task for task in tasks_to_supervise if not task.done()]
In this case the only task being supervised will be the _process_model_events task during a normal connection/session, so the tasks_to_supervise list remains unchanged and no tasks are removed. However, the removal logic exists as a safety net in case the model events task completes unexpectedly (e.g., a network failure) and we need a graceful shutdown.
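A small sketch of the safety net described here (task names are illustrative): after asyncio.wait() returns, completed tasks are filtered out so the supervisor only keeps watching live tasks.

```python
import asyncio

# Pruning sketch: a supervisor waits for ANY task to complete, then drops
# finished tasks from its list so they are not awaited or cancelled again.

async def short_lived() -> str:
    return "done"  # e.g. a model-events task that exited unexpectedly

async def long_lived() -> None:
    await asyncio.sleep(10)

async def supervise() -> int:
    tasks = [asyncio.create_task(short_lived()), asyncio.create_task(long_lived())]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # Safety net: remove completed tasks from the supervision list.
    tasks = [t for t in tasks if not t.done()]
    for t in tasks:
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return len(tasks)

remaining = asyncio.run(supervise())
print(remaining)  # 1: only the long-lived task was still supervised
```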
    return event_loop
    ...
    async def stop_bidirectional_connection(event_loop: "BidirectionalEventLoop") -> None:
This is just a wrapper that calls event_loop.stop(), mainly to preserve the existing API of agent.start_bidirectional_connection and agent.stop_bidirectional_connection in the agent class.
    ...
    # Cancel all tasks
    for task in self.pending_tool_tasks.values():
        if not task.done():
Under what circumstance would a task be done but still present in pending_tool_tasks?
The check here cancels any task that is not yet done. So if a user triggers multiple tool calls and then closes the bidirectional session, we cancel all the pending tool tasks and clear them.
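A sketch of this shutdown path (structure assumed from the diff, not the actual implementation): a completed task can still sit in the dict until cleanup runs, which is exactly why the done-check exists.

```python
import asyncio

# Shutdown sketch: cancel every tool task that has not finished, await them,
# then clear the tracking dict. A task that already completed stays in the
# dict until this cleanup, so the done-check avoids cancelling it.

async def slow_tool() -> None:
    await asyncio.sleep(10)

async def fast_tool() -> str:
    return "ok"

async def shutdown() -> int:
    pending_tool_tasks = {
        "tool-1": asyncio.create_task(slow_tool()),
        "tool-2": asyncio.create_task(fast_tool()),
    }
    await asyncio.sleep(0)  # let fast_tool finish; it remains in the dict
    cancelled = 0
    for task in pending_tool_tasks.values():
        if not task.done():  # finished tasks need no cancellation
            task.cancel()
            cancelled += 1
    await asyncio.gather(*pending_tool_tasks.values(), return_exceptions=True)
    pending_tool_tasks.clear()
    return cancelled

cancelled = asyncio.run(shutdown())
print(cancelled)  # 1: only the slow tool needed cancelling
```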
    # Thread-safe counter increment
    current_tool_number = self.tool_count + 1
    self.tool_count = current_tool_number
    print(f"\nTool #{current_tool_number}: {tool_name}")
Just trying to understand why we need the print here.
I was trying to replicate and confirm that a tool was being called/used by the model, similar to how we use PrintingCallbackHandler in the existing Strands system. Another similar comment here: #10 (comment)
    }
    while self.active and tasks_to_supervise:
        # Wait for any task completion (deterministic vs polling)
        done, pending = await asyncio.wait(
Why do we need this supervisor? Can we design this with no supervision?
We could potentially use asyncio.gather(). Using asyncio.gather():
- Start all tasks with gather()
- If any task fails, gather() completes (with the default return_exceptions=False, the first exception propagates)
- This is basically supervision built into asyncio
Although we could use asyncio.gather(), the current approach has light overhead and still provides an event-based mechanism.
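A minimal sketch of what the gather()-based alternative described in this reply could look like (task bodies are illustrative, not the PR's code): the first failure propagates out of gather(), giving supervision "for free".

```python
import asyncio

# gather()-based supervision sketch: with the default return_exceptions=False,
# the first task failure propagates immediately, so the caller can react
# without a separate supervision loop.

async def model_events() -> None:
    await asyncio.sleep(0.01)
    raise ConnectionError("stream dropped")  # simulated network failure

async def audio_output() -> None:
    await asyncio.sleep(60)

async def run_session() -> str:
    try:
        await asyncio.gather(model_events(), audio_output())
        return "clean exit"
    except ConnectionError as exc:
        # gather() surfaced the failure; trigger graceful shutdown here.
        return f"supervised failure: {exc}"

result = asyncio.run(run_session())
print(result)  # supervised failure: stream dropped
```

Note that gather() does not cancel the surviving tasks when it raises; a real shutdown path still has to cancel them explicitly, which is one argument for keeping an explicit supervisor.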
    while maintaining a simple interface for agent interaction.
    Features:
    - Concurrent task management for model events and tool execution
I saw that _process_tool_execution has been removed, but it is still mentioned in the doc.
    ...
    # Tool execution tracking
    # Audio and metrics
    self.audio_output_queue = asyncio.Queue()
Since these bidirectional models can stream more than just audio data, I would avoid creating an audio specific queue. We should instead try to create a data agnostic queue.
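A sketch of the data-agnostic queue this comment suggests (the OutputEvent type and field names are assumptions, not the actual implementation): one queue carries typed events for audio, text, and anything else the model streams.

```python
import asyncio
from dataclasses import dataclass
from typing import Any

# Data-agnostic output queue sketch: instead of an audio-only queue, a single
# queue carries tagged events and consumers dispatch on the event kind.

@dataclass
class OutputEvent:
    kind: str      # e.g. "audio", "text", "metrics"
    payload: Any

async def main() -> list[str]:
    output_queue: asyncio.Queue = asyncio.Queue()
    await output_queue.put(OutputEvent("audio", b"\x00\x01"))
    await output_queue.put(OutputEvent("text", "hello"))
    kinds = []
    while not output_queue.empty():
        event = output_queue.get_nowait()
        kinds.append(event.kind)  # a consumer would dispatch on event.kind
    return kinds

kinds = asyncio.run(main())
print(kinds)  # ['audio', 'text']
```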
    tool_name = tool_use.get("name")
    tool_id = tool_use.get("toolUseId")
    ...
    # Thread-safe counter increment
Are we using threads?
    tool_use_id = tool_result.get("toolUseId")
    await self.model_session.send_tool_result(tool_use_id, tool_result)
    logger.debug("Tool result sent: %s", tool_use_id)
    elif isinstance(tool_event, ToolStreamEvent):
Just curious: how do we deal with ToolStreamEvent?
    finally:
        logger.debug("Session supervisor stopped")
    ...
    async def _execute_tool_with_strands(self, tool_use: ToolUse) -> None:
Nit: Can we just call this _execute_tool?
    ...
    # Session lifecycle coordinator functions
    async def start_bidirectional_connection(agent: "BidirectionalAgent") -> "BidirectionalEventLoop":
Nit: Can we make this a @classmethod on BidirectionalEventLoop.
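A sketch of the @classmethod factory shape this nit suggests (hypothetical, not the SDK's actual code): the module-level coordinator function becomes an async factory on the class itself.

```python
import asyncio

# Factory-classmethod sketch: construction and startup move onto the class,
# replacing the module-level start_bidirectional_connection() function.

class BidirectionalEventLoop:
    def __init__(self, agent: object) -> None:
        self.agent = agent
        self.active = False

    @classmethod
    async def start_connection(cls, agent: object) -> "BidirectionalEventLoop":
        event_loop = cls(agent)
        event_loop.active = True  # real code would create background tasks here
        return event_loop

async def main() -> bool:
    event_loop = await BidirectionalEventLoop.start_connection(agent=object())
    return event_loop.active

active = asyncio.run(main())
print(active)  # True
```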
    return event_loop
    ...
    async def stop_bidirectional_connection(event_loop: "BidirectionalEventLoop") -> None:
Nit: Do we need this method? Can users just call event_loop.stop directly?
Bidirectional Event Loop Implementation Comparison
Overview
This document compares the original bidirectional event loop implementation (bidirectional_event_loop.py) with a new class-based implementation, focusing on the architectural changes.
Original Implementation Analysis
Architecture Overview
The original implementation combines a wrapper class (BidirectionalConnection) with standalone functions for processing. The wrapper class coordinates multiple standalone processing functions that require session parameter passing.
Core Components
BidirectionalConnection Class
- Task lists: background_tasks, pending_tool_tasks
- Queues: tool_queue for tool requests, audio_output_queue for audio events
- Interruption state: interrupted flag and interruption_lock
Key Function: Central coordinator object that holds all session state and manages communication between separate processing functions.
Coordinator Functions
- start_bidirectional_connection(): Creates model session and initializes background tasks
- stop_bidirectional_connection(): Terminates session and cleans up resources
- bidirectional_event_loop_cycle(): Main supervision loop that monitors background tasks
Key Function: Interface layer that creates and manages the session wrapper, serving as the public API for the event loop system.
Background Processing Functions
- _process_model_events(): Handles incoming model events and forwards them appropriately
- _process_tool_execution(): Polls tool queue with 0.5-second timeout and executes tools
- _handle_interruption(): Cancels tool tasks and clears audio buffers
- _execute_tool_with_strands(): Integrates with Strands tool execution system
Key Function: Specialized processors that handle specific aspects of bidirectional streaming, each operating as an independent function that requires a session parameter.
Task Management
The original implementation runs three primary background tasks:
Task Architecture: Three-task system where each background process handles a specific responsibility, requiring coordination through shared session state.
Tool Execution Flow
Tool execution follows a queue-based pattern: tool requests are placed on the tool_queue and picked up by a polling background task.
Performance Impact: The queue polling introduces a 0-500ms delay before tool execution begins, creating noticeable latency in real-time interactions.
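The polling pattern and its latency can be sketched as follows (the constant comes from the document; tool payloads are illustrative): a tool enqueued just after a poll starts can wait up to the full timeout.

```python
import asyncio
from typing import Optional, Tuple

# Queue-polling sketch: the consumer blocks on the queue for up to
# TOOL_QUEUE_TIMEOUT per poll, which is the source of the 0-500ms delay.

TOOL_QUEUE_TIMEOUT = 0.5  # maximum wait per poll, per the original code

async def poll_once(tool_queue: asyncio.Queue) -> Optional[str]:
    try:
        tool_use = await asyncio.wait_for(tool_queue.get(), timeout=TOOL_QUEUE_TIMEOUT)
        return tool_use["name"]
    except asyncio.TimeoutError:
        return None  # nothing arrived within the window; loop and poll again

async def main() -> Tuple[Optional[str], Optional[str]]:
    tool_queue: asyncio.Queue = asyncio.Queue()
    first = await poll_once(tool_queue)           # empty queue: times out
    await tool_queue.put({"name": "calculator"})
    second = await poll_once(tool_queue)          # tool available: returned
    return first, second

first, second = asyncio.run(main())
print(first, second)  # None calculator
```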
Constants and Configuration
The implementation defines two key constants:
TOOL_QUEUE_TIMEOUT = 0.5: Maximum wait time for tool queue pollingSUPERVISION_INTERVAL = 0.1: Sleep interval for main coordination loopPerformance Constraint: These constants directly impact system responsiveness, with tool execution delayed by up to 500ms and task failures detected within 100ms.
New Implementation Analysis
Architecture Overview
The new implementation uses a class-based architecture where all functionality is encapsulated within the BidirectionalEventLoop class. The single-class encapsulation eliminates parameter passing overhead and provides cleaner state management.
Core Components
BidirectionalEventLoop Class
- Task lists: background_tasks and pending_tool_tasks
- interruption_lock: Atomic interruption processing
- conversation_lock: Thread-safe conversation history updates
Key Advantage: All related functionality is contained within a single class, eliminating the need for parameter passing and providing cleaner encapsulation.
Class Methods
- start(): Initializes background tasks
- stop(): Graceful shutdown and resource cleanup
- schedule_tool_execution(): Immediate tool task creation
- handle_interruption(): Atomic interruption processing
- _process_model_events(): Event stream processing
- _supervise_session(): Background task health monitoring
- _execute_tool_with_strands(): Tool execution with Strands integration
Method Design: Each method has a single responsibility and operates on class state without requiring external parameter passing.
Coordinator Functions
- start_bidirectional_connection(): Creates event loop instance
- stop_bidirectional_connection(): Delegates to event loop stop method
Interface Preservation: Same public API as the original implementation.
Task Management
The new implementation runs two background tasks:
Task Consolidation: Reduces from 3 to 2 background tasks by combining event processing with immediate tool scheduling, eliminating the separate polling task.
Tool Execution Flow
Tool execution uses immediate scheduling: schedule_tool_execution() creates an asyncio task immediately.
Performance Improvement: Eliminates queue polling entirely, reducing tool scheduling latency from 0-500ms to effectively 0ms for real-time responsiveness.
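The immediate-scheduling flow can be sketched as follows (class structure assumed from the document, not the actual code): the tool task is created as soon as the tool-use event arrives, with no queue in between.

```python
import asyncio

# Immediate-scheduling sketch: instead of enqueueing the request for a
# poller, schedule_tool_execution() creates a task that starts on the next
# event-loop tick, removing the polling delay entirely.

class EventLoopSketch:
    def __init__(self) -> None:
        self.pending_tool_tasks: dict = {}
        self.executed: list = []

    def schedule_tool_execution(self, tool_use: dict) -> None:
        task = asyncio.create_task(self._execute_tool(tool_use))
        self.pending_tool_tasks[tool_use["toolUseId"]] = task

    async def _execute_tool(self, tool_use: dict) -> None:
        self.executed.append(tool_use["name"])

async def main() -> list:
    event_loop = EventLoopSketch()
    event_loop.schedule_tool_execution({"toolUseId": "t1", "name": "calculator"})
    await asyncio.gather(*event_loop.pending_tool_tasks.values())
    return event_loop.executed

executed = asyncio.run(main())
print(executed)  # ['calculator']
```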
Supervision Mechanism
The new implementation replaces polling-based supervision with event-driven monitoring using asyncio.wait() with the FIRST_COMPLETED return condition and a 1.0-second timeout for periodic active-flag checks. This provides immediate task failure detection versus the original 100ms polling interval.
Deterministic Behavior: Event-driven supervision responds immediately to task failures rather than waiting for the next polling cycle, improving system reliability.
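The supervision mechanism described above can be sketched like this (task bodies are illustrative): asyncio.wait() wakes the supervisor the moment any task completes, while the timeout still allows periodic active-flag checks.

```python
import asyncio

# Event-driven supervision sketch: FIRST_COMPLETED wakes the supervisor on
# the first task completion; the 1.0s timeout lets it re-check the active
# flag even when no task has finished.

async def failing_task() -> None:
    await asyncio.sleep(0.01)
    raise RuntimeError("model stream failed")

async def healthy_task() -> None:
    await asyncio.sleep(60)

async def supervise() -> str:
    tasks = {asyncio.create_task(failing_task()), asyncio.create_task(healthy_task())}
    active = True
    while active and tasks:
        done, tasks = await asyncio.wait(
            tasks, timeout=1.0, return_when=asyncio.FIRST_COMPLETED
        )
        for task in done:
            if task.exception() is not None:
                active = False  # a task died: begin graceful shutdown
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return "shutdown"

outcome = asyncio.run(supervise())
print(outcome)  # shutdown
```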
Component Comparison
Session Initialization
Original Approach
New Approach
Functional Equivalence: Both approaches create the necessary background tasks and initialize session state. The new approach eliminates parameter passing overhead.
Tool Execution Scheduling
Original Approach
New Approach
Functional Equivalence: Both approaches execute tools concurrently. The new approach eliminates polling delays, reducing tool execution latency from 0-500ms to 0ms.
Model Event Processing
Original Implementation
New Implementation
Functional Equivalence: Both approaches process the same event types and forward events to the agent. The new approach schedules tools immediately instead of queuing them.
Advantage: Immediate tool scheduling eliminates the queue bottleneck that was causing delays in tool execution.
Interruption Handling
Original Implementation
New Implementation
Functional Enhancement: The new implementation adds tool execution protection. When tools are actively running, audio is cleared for responsiveness but tools continue executing to completion.
Conversation History Management
Original Implementation
New Implementation
Functional Equivalence: Both implementations maintain conversation history. The new approach adds race condition protection for concurrent tool execution scenarios.
Concurrency Safety: Thread-safe operations prevent conversation history overwriting when multiple tools execute simultaneously.
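The conversation_lock pattern referenced here can be sketched as follows (message shape assumed, not the actual SDK structures): the lock serializes history updates so concurrent tool completions cannot interleave mid-update.

```python
import asyncio

# conversation_lock sketch: the async lock makes each history update an
# atomic unit, even if a future version of the update awaits internally
# between reading and writing shared state.

conversation_history: list = []
conversation_lock = asyncio.Lock()

async def append_tool_result(tool_use_id: str, content: str) -> None:
    async with conversation_lock:
        conversation_history.append(
            {"role": "tool", "toolUseId": tool_use_id, "content": content}
        )

async def main() -> None:
    # Two tools completing "at the same time" both land in the history.
    await asyncio.gather(
        append_tool_result("t1", "42"),
        append_tool_result("t2", "ok"),
    )

asyncio.run(main())
print(len(conversation_history))  # 2
```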
Background Task Supervision
Original Implementation
New Implementation
Functional Equivalence: Both approaches monitor background task health and trigger cleanup on failures. The new event-driven supervision provides deterministic failure detection instead of polling-based monitoring.
Tool Execution Integration
Original Implementation
New Implementation
Functional Equivalence: Both implementations use identical Strands tool execution pipelines with validation and event processing. The new approach adds thread-safe conversation history updates.
Integration Preserved: Maintains complete compatibility with existing Strands tool system while adding safety improvements.
Resource Cleanup
Original Implementation
New Implementation
Functional Equivalence: Both implementations cancel all active tasks and wait for completion. The new approach eliminates the separate main cycle task while maintaining the same cleanup.
Resource Management: Simplified cleanup process with fewer task types to manage, reducing complexity while maintaining complete resource cleanup.