[FEATURE] Bidirectional Streaming #217

@pgrayy

Overview

Bidirectional streaming enables real-time, continuous communication between clients and AI models, with data flowing in both directions at once. Unlike the traditional request-response pattern, both client and model can send and receive data incrementally. Content is processed as it arrives rather than after a complete message, so the interaction feels more natural and the conversation can adapt to ongoing inputs and outputs.
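To make the contrast with request-response concrete, here is a minimal sketch using plain `asyncio` queues. It is not Strands code: `toy_model`, `client_sender`, `client_receiver`, and `run_session` are hypothetical names, and the "model" simply upper-cases each chunk. The point is only that sending and receiving run as concurrent tasks, so output can be consumed before all input has been sent.

```python
import asyncio
from typing import List, Optional


async def client_sender(to_model: asyncio.Queue, chunks: List[str]) -> None:
    # Push input chunks one at a time, yielding control between chunks.
    for chunk in chunks:
        await to_model.put(chunk)
        await asyncio.sleep(0)
    await to_model.put(None)  # None marks the end of the input stream


async def toy_model(to_model: asyncio.Queue, to_client: asyncio.Queue) -> None:
    # Hypothetical stand-in for a model: respond to each chunk as it arrives.
    while True:
        chunk: Optional[str] = await to_model.get()
        if chunk is None:
            break
        await to_client.put(chunk.upper())
    await to_client.put(None)  # None marks the end of the output stream


async def client_receiver(to_client: asyncio.Queue, received: List[str]) -> None:
    # Collect output chunks incrementally, without waiting for a full message.
    while True:
        chunk: Optional[str] = await to_client.get()
        if chunk is None:
            break
        received.append(chunk)


async def run_session(chunks: List[str]) -> List[str]:
    to_model: asyncio.Queue = asyncio.Queue()
    to_client: asyncio.Queue = asyncio.Queue()
    received: List[str] = []
    # All three coroutines run concurrently: this is the "both directions" part.
    await asyncio.gather(
        client_sender(to_model, chunks),
        toy_model(to_model, to_client),
        client_receiver(to_client, received),
    )
    return received
```

Calling `asyncio.run(run_session(["hel", "lo"]))` drives one toy session. A real provider would replace the queues with a network stream, but the task structure would likely look similar.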

Model Support

Several providers are building models with bidirectional streaming capabilities. Some examples include:

  • Amazon: Amazon's Nova Sonic model offers real-time speech processing, interruption handling, context-aware responses, and low-latency interactions, making it particularly effective for voice assistants (docs).
  • OpenAI: OpenAI also provides models with bidirectional streaming capabilities, allowing for dynamic conversation flows where the model can receive new information while generating a response (announcement).

Request

Support a bidirectional streaming interface in Strands.

Prototype

To facilitate discussion, we have implemented a prototype for bidirectional streaming under https://github.com/pgrayy/strands-sdk-python-async (see README for instructions on testing). The prototype implements a flexible architecture for real-time, two-way communication between clients and AI models. It focuses on audio-based interactions with Nova Sonic while establishing patterns that could extend to other models and modalities. The key components are:

  • Bidirectional Agent: The Agent class in the bidirectional module provides an async context manager for sending data (send), an async generator for receiving data (receive), and a method to initialize bidirectional streaming (bistream). For example usage, please see https://github.com/pgrayy/sdk-python-async/blob/main/scripts/agents/bidirectional.py.
  • Model Sender/Receiver: The abstract Sender and Receiver interfaces define the contract for model providers. The Sender handles outgoing events to the model with context managers for different content types (text, audio, tools), while the Receiver processes incoming events from the model and constructs message history.
  • Event System: Events are structured as typed objects representing different kinds of streaming content, including session events (start/end), prompt events (start/end), and content events (text, audio, system, tool).
  • Nova Implementation: The Nova implementation demonstrates how to adapt a specific model to the bidirectional interface, providing a concrete example of the architecture in action.
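The components above can be pictured with a small, self-contained sketch. This is an illustration of the described shape, not the prototype's actual code: the `Event` fields, the `send`/`receive` signatures, the `text()` context manager, and the `EchoModel` class are all assumptions made for this example, and the "model" just echoes events back instead of talking to Nova Sonic.

```python
import asyncio
from abc import ABC, abstractmethod
from contextlib import asynccontextmanager
from dataclasses import dataclass
from typing import AsyncIterator, List, Optional, Union


@dataclass(frozen=True)
class Event:
    # Hypothetical typed event; kinds mirror the session/prompt/content split.
    kind: str  # e.g. "session_start", "prompt_start", "text", "audio", "tool"
    payload: Optional[Union[str, bytes]] = None


class Sender(ABC):
    @abstractmethod
    async def send(self, event: Event) -> None:
        """Forward one outgoing event to the model."""

    @asynccontextmanager
    async def text(self) -> AsyncIterator["Sender"]:
        # Assumed pattern: a context manager brackets content with prompt events.
        await self.send(Event("prompt_start"))
        try:
            yield self
        finally:
            await self.send(Event("prompt_end"))


class Receiver(ABC):
    @abstractmethod
    def receive(self) -> AsyncIterator[Event]:
        """Yield incoming events from the model as they arrive."""


class EchoModel(Sender, Receiver):
    # Toy provider: every event sent is echoed back to the receiver.
    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()

    async def send(self, event: Event) -> None:
        await self._queue.put(event)

    async def receive(self) -> AsyncIterator[Event]:
        while True:
            event = await self._queue.get()
            yield event
            if event.kind == "session_end":
                return


async def demo() -> List[str]:
    model = EchoModel()
    kinds: List[str] = []

    async def consume() -> None:
        async for event in model.receive():
            kinds.append(event.kind)

    consumer = asyncio.create_task(consume())  # receive while still sending
    await model.send(Event("session_start"))
    async with model.text() as sender:
        await sender.send(Event("text", "hello"))
    await model.send(Event("session_end"))
    await consumer
    return kinds
```

Running `asyncio.run(demo())` collects the event kinds in arrival order. Keeping `Sender` and `Receiver` as separate abstract interfaces, as the issue describes, would let a provider implement each direction independently. The echo class combines them only for brevity.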

Metadata

Labels

  • area-async: Related to asynchronous flows or multi-threading
  • area-bidirectional-streaming: Related to bidirectional streaming
  • enhancement: New feature or request
  • to-refine: Issue needs to be discussed with the team and the team has come to an effort estimate consensus

Projects

Status

We're Working On It
