(feat)bidirectional_streaming: add openai realtime model provider #3
src/strands/experimental/bidirectional_streaming/models/novasonic.py
    "type": "server_vad",
    "threshold": 0.5,
    "prefix_padding_ms": 300,
    "silence_duration_ms": 500,
src/strands/experimental/bidirectional_streaming/models/openai.py
    system_prompt: str | None = None,
    tools: list[ToolSpec] | None = None,
    messages: Messages | None = None,
    **kwargs,
Is this **kwargs now unused?
IMO, **kwargs in create_bidirectional_connection should stay, since it's part of the abstract interface and may be used by other implementations or future extensions.
+1. All abstract methods should accept **kwargs, which will let us extend them with more inputs later on.
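To illustrate the point, here is a minimal sketch of an abstract method that keeps **kwargs so one provider can ignore extra options while another reads them. Only create_bidirectional_connection and its first three parameters come from the diff; the class names, the return shape, and the `voice` option are hypothetical.

```python
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod
from typing import Any


class BidirectionalModel(ABC):
    """Hypothetical stand-in for the abstract model interface."""

    @abstractmethod
    async def create_bidirectional_connection(
        self,
        system_prompt: str | None = None,
        tools: list[dict[str, Any]] | None = None,
        messages: list[dict[str, Any]] | None = None,
        **kwargs: Any,
    ) -> dict[str, Any]:
        """Open a session; **kwargs leaves room for future inputs."""


class EchoModel(BidirectionalModel):
    """Toy provider that accepts extra options but does not use them."""

    async def create_bidirectional_connection(
        self, system_prompt=None, tools=None, messages=None, **kwargs
    ):
        return {"system_prompt": system_prompt, "ignored": sorted(kwargs)}


class VoiceModel(BidirectionalModel):
    """Toy provider that reads a hypothetical 'voice' option from kwargs."""

    async def create_bidirectional_connection(
        self, system_prompt=None, tools=None, messages=None, **kwargs
    ):
        return {"system_prompt": system_prompt, "voice": kwargs.get("voice", "default")}
```

Both providers can then be called with the same arguments, e.g. `asyncio.run(VoiceModel().create_bidirectional_connection("hi", voice="alloy"))`, without changing the abstract signature.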
src/strands/experimental/bidirectional_streaming/types/bidirectional_streaming.py
    await agent.send(audio_event)

    except asyncio.TimeoutError:
We could discuss tomorrow when we want to fail hard and when we want to handle it silently.
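As input for that discussion, one possible shape is to make the policy explicit at the call site. This is only a sketch: receive_with_timeout and its fail_hard flag are hypothetical names, not part of the PR.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def receive_with_timeout(queue: asyncio.Queue, timeout: float, fail_hard: bool):
    """Wait for the next queued item; on timeout either re-raise or return None."""
    try:
        return await asyncio.wait_for(queue.get(), timeout=timeout)
    except asyncio.TimeoutError:
        if fail_hard:
            # Caller asked for strict behaviour: surface the timeout.
            raise
        # Lenient behaviour: log at debug level and let the caller continue.
        logger.debug("No event within %.2fs; continuing", timeout)
        return None
```

With a flag like this, the test loop could stay lenient while production paths opt into failing hard.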
    logger.debug("OpenAI Realtime session initialized: %s", self.session_id)

    def _require_active(self) -> bool:
nit: I'd expect _require_active to raise an exception when the session is not active, rather than return a bool.
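Roughly what I have in mind, as a sketch only. SessionNotActiveError, the Session class, and send are hypothetical; only the _require_active name comes from the diff.

```python
class SessionNotActiveError(RuntimeError):
    """Hypothetical error type for calls made on an inactive session."""


class Session:
    """Toy session; only the active flag matters for this sketch."""

    def __init__(self) -> None:
        self._active = False

    def _require_active(self) -> None:
        # Raise instead of returning False so callers cannot skip the check.
        if not self._active:
            raise SessionNotActiveError("session is not active")

    def send(self, payload: str) -> str:
        self._require_active()
        return f"sent:{payload}"
```

If the bool-returning behaviour is still needed, a name such as _is_active would describe it better.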
    if "project" in self.config:
        headers.append(("OpenAI-Project", self.config["project"]))

    websocket = await websockets.connect(url, additional_headers=headers)
Should the websocket connection be opened during session initialization? Though considering we'll merge the two, it probably doesn't matter much.
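For reference, a rough sketch of the alternative, where the session opens its own connection. RealtimeSession, build_headers, and the injected connect callable are hypothetical; the Authorization header is an assumption about what the provider sends, and only the OpenAI-Project header appears in the diff.

```python
from __future__ import annotations

import asyncio
from typing import Any, Awaitable, Callable

Headers = list[tuple[str, str]]


def build_headers(api_key: str, config: dict[str, str]) -> Headers:
    """Assemble handshake headers (Authorization is assumed, not from the diff)."""
    headers: Headers = [("Authorization", f"Bearer {api_key}")]
    if "project" in config:
        headers.append(("OpenAI-Project", config["project"]))
    return headers


class RealtimeSession:
    """Hypothetical session that opens its own connection in initialize()."""

    def __init__(
        self,
        url: str,
        headers: Headers,
        connect: Callable[[str, Headers], Awaitable[Any]],
    ) -> None:
        self._url = url
        self._headers = headers
        self._connect = connect  # injected so tests can pass a fake
        self.websocket: Any = None

    async def initialize(self) -> None:
        # Connection setup and session setup live in one place.
        self.websocket = await self._connect(self._url, self._headers)
```

In real use, connect could wrap the call from the diff, e.g. `lambda url, headers: websockets.connect(url, additional_headers=headers)`.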
Description
This PR adds OpenAI Realtime API support to the Strands bidirectional streaming system, enabling real-time audio conversations. Users can now choose between Amazon Nova Sonic and OpenAI Realtime API providers.
Relevant documentation:
Files Added
src/strands/experimental/bidirectional_streaming/models/openai.py - OpenAI model provider implementation
Key additions:
- OpenAIRealtimeBidirectionalModel class for creating connections
- OpenAIRealtimeSession class for WebSocket session management
src/strands/experimental/bidirectional_streaming/tests/test_bidi_openai.py - Integration test for OpenAI voice chat
Key features:
Files Modified
src/strands/experimental/bidirectional_streaming/types/bidirectional_streaming.py - Extended type definitions
Additions:
- VoiceActivityEvent type for speech detection events (for OpenAI)
- UsageMetricsEvent type for token usage tracking
Running the Test Script
Prerequisites:
- OPENAI_API_KEY environment variable
- pyaudio for audio I/O
Commands:
Expected behavior:
Related Issues
strands-agents#217
Documentation PR
Type of Change
New feature
Testing
How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli
hatch run prepare
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.