[FEATURE] Predict Stream #4187
Conversation
Will push more commits to add UTs.
if (t != null) {
    log.error("Error: " + t.getMessage(), t);
    if (t instanceof StreamResetException && t.getMessage().contains("NO_ERROR")) {
        // TODO: reconnect
Are we planning to keep it like this?
Yes, we will keep it like this for the current implementation. We can implement this as an enhancement in the future.
@Override
public void onFailure(Exception e) {
    try {
        channel.sendResponse(e);
Don't we need to completeStream here? What is the actual expectation during failure?
The connection should be closed on failure, and the error message should be returned in a regular HTTP response, not the stream. Working on enhancing this part.
The model validation check occurs before streaming starts, so its result can be sent as a regular HTTP response. However, a remote service error happens after the streaming connection is established, so it cannot be sent as a regular HTTP response. The current implementation sends the remote service error message as one data chunk and closes the connection. I put more details about the error handling in the PR description.
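The two-phase strategy described above can be sketched as follows. This is an illustrative stand-in, not the actual ml-commons classes: `StreamChannel`, `RecordingChannel`, and `onRemoteError` are all hypothetical names used only to show the "error as one final chunk, then close" behavior once the stream is open.

```java
import java.util.ArrayList;
import java.util.List;

public class StreamErrorSketch {

    // Hypothetical stand-in for a streaming response channel.
    interface StreamChannel {
        void sendChunk(String data);
        void close();
    }

    // Test double that records what was sent over the channel.
    static class RecordingChannel implements StreamChannel {
        final List<String> chunks = new ArrayList<>();
        boolean closed = false;
        public void sendChunk(String data) { chunks.add(data); }
        public void close() { closed = true; }
    }

    // Before the stream starts, fail like a regular HTTP error;
    // after it starts, emit the error as the last chunk and close.
    static void onRemoteError(StreamChannel channel, boolean streamStarted, Exception e) {
        if (streamStarted) {
            channel.sendChunk("{\"error\": \"" + e.getMessage() + "\"}");
            channel.close();
        } else {
            throw new RuntimeException(e); // surfaces as a regular HTTP error response
        }
    }

    public static void main(String[] args) {
        RecordingChannel channel = new RecordingChannel();
        onRemoteError(channel, true, new Exception("remote service unavailable"));
        System.out.println(channel.chunks.get(0));
        System.out.println(channel.closed);
    }
}
```

The key design point, per the comment above, is that only the pre-stream phase can still change the HTTP status; after the first chunk is flushed, the error has to travel in-band.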
@Getter
private StreamTransportService streamTransportService;

private AtomicBoolean isStreamClosed = new AtomicBoolean(false);
This will be shared across multiple streaming requests (I assume), which can cause race conditions between concurrent streams?
Right, I moved the shared instance variable isStreamClosed to a local variable inside the invokeRemoteServiceStream method. This way the flag is local to each request. I tested it with concurrent requests and confirmed this change eliminates the race condition.
Lines 154 to 163 in dbd3f0a:

public void invokeRemoteServiceStream(
    String action,
    MLInput mlInput,
    Map<String, String> parameters,
    String payload,
    ExecutionContext executionContext,
    StreamPredictActionListener<MLTaskResponse, ?> actionListener
) {
    try {
        AtomicBoolean isStreamClosed = new AtomicBoolean(false);
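The effect of the fix quoted above can be shown in isolation: when the flag is created inside the method, each invocation owns its own AtomicBoolean, so closing one stream cannot be observed by a concurrent one. This is a simplified sketch with hypothetical names, not the executor code itself.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class PerRequestFlagSketch {

    // Per-request state: a fresh flag per invocation, nothing shared
    // across requests (unlike a field on the executor instance).
    static AtomicBoolean startStream() {
        return new AtomicBoolean(false);
    }

    public static void main(String[] args) {
        AtomicBoolean requestA = startStream();
        AtomicBoolean requestB = startStream();
        requestA.set(true);               // close stream A
        System.out.println(requestB.get()); // stream B is unaffected
    }
}
```

With a shared instance field, the `requestA.set(true)` above would have flipped the flag for every in-flight stream, which is exactly the race the reviewer pointed out.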
Signed-off-by: Nathalie Jonathan <[email protected]>
I took a skim through the PR and have a high-level question: the implementations for the OpenAI and Bedrock models are totally different. Bedrock uses the AWS SDK client (BedrockRuntimeAsyncClient) with native streaming. This creates two completely different code paths that are hard to maintain and extend. Have you considered making a generic StreamingHandler interface and using a factory pattern for the handler, so that we have a unified streaming client interface that can be implemented by different model providers?
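The suggestion above could look roughly like the following. All names here are hypothetical illustrations of the interface-plus-factory shape, not the classes eventually added in the agent stream PR; a real handler would carry the payload, parameters, and listener through its methods.

```java
public class StreamingHandlerFactorySketch {

    // One interface per the reviewer's suggestion; each provider
    // supplies its own implementation behind this contract.
    interface StreamingHandler {
        String provider();
        // startStream(payload, listener) etc. would live here in a real handler
    }

    static class BedrockStreamingHandler implements StreamingHandler {
        public String provider() { return "bedrock"; }
    }

    static class OpenAiStreamingHandler implements StreamingHandler {
        public String provider() { return "openai"; }
    }

    // Factory: callers never touch a concrete provider class, so adding
    // a new provider only means adding a handler and a case here.
    static StreamingHandler create(String provider) {
        switch (provider) {
            case "bedrock": return new BedrockStreamingHandler();
            case "openai":  return new OpenAiStreamingHandler();
            default: throw new IllegalArgumentException("Unsupported provider: " + provider);
        }
    }

    public static void main(String[] args) {
        System.out.println(create("bedrock").provider());
        System.out.println(create("openai").provider());
    }
}
```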
Codecov Report
❌ Patch coverage is
Additional details and impacted files:
@@ Coverage Diff @@
## main #4187 +/- ##
============================================
- Coverage 82.04% 81.73% -0.32%
- Complexity 9306 9372 +66
============================================
Files 792 796 +4
Lines 40001 40535 +534
Branches 4456 4508 +52
============================================
+ Hits 32818 33130 +312
- Misses 5263 5464 +201
- Partials 1920 1941 +21
private SdkAsyncHttpClient httpClient;

private BedrockRuntimeAsyncClient bedrockRuntimeAsyncClient;
this creates a hard dependency on Bedrock SDK for ALL AWS connectors, even if they're not using Bedrock.
Moved all Bedrock SDK related imports and implementation to a separate BedrockStreamingHandler class and used reflection to access the class.
Lines 39 to 45 in cfe0cb4:

private static StreamingHandler createBedrockHandler(SdkAsyncHttpClient httpClient, Connector connector) {
    try {
        // Use reflection to avoid hard dependency
        Class<?> handlerClass = Class.forName("org.opensearch.ml.engine.algorithms.remote.streaming.BedrockStreamingHandler");
        Constructor<?> constructor = handlerClass
            .getConstructor(SdkAsyncHttpClient.class, Class.forName("org.opensearch.ml.common.connector.AwsConnector"));
        return (StreamingHandler) constructor.newInstance(httpClient, connector);
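The reflection pattern in the excerpt above can be demonstrated in miniature: the concrete class is resolved by name at runtime, so the calling code compiles without a hard dependency on it, and an absent class degrades into a handled exception instead of a classpath failure. The class names below are local stand-ins, not the real BedrockStreamingHandler.

```java
import java.lang.reflect.Constructor;

public class ReflectiveLoadSketch {

    public interface Handler { String name(); }

    // Stand-in for an "optional dependency" class that may or may not
    // be on the classpath in a real deployment.
    public static class OptionalHandler implements Handler {
        private final String label;
        public OptionalHandler(String label) { this.label = label; }
        public String name() { return label; }
    }

    // Look the class up by name; no compile-time reference to it is needed.
    static Handler createByName(String className, String label) {
        try {
            Class<?> clazz = Class.forName(className);
            Constructor<?> ctor = clazz.getConstructor(String.class);
            return (Handler) ctor.newInstance(label);
        } catch (ReflectiveOperationException e) {
            // Dependency absent: fail gracefully rather than at class-load time
            throw new IllegalStateException("Handler not available: " + className, e);
        }
    }

    public static void main(String[] args) {
        Handler h = createByName(
            ReflectiveLoadSketch.class.getName() + "$OptionalHandler", "bedrock");
        System.out.println(h.name());
    }
}
```

The trade-off, as with the quoted code, is that a missing or renamed class is only caught at runtime, so the factory needs a clear failure path.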
@mingshl Right, I agree with this. We can make a generic stream handler so it will be easier to extend for other model providers. Created a streaming handler factory in the agent stream PR #4212.
if (payload != null) {
    requestBody = okhttp3.RequestBody.create(payload, MediaType.parse("application/json; charset=utf-8"));
} else {
    throw new IllegalArgumentException("Content length is 0. Aborting request to remote model");
What happens after that? What does the customer finally see?
The connection will be closed and the user will see this error message:

{
    "error": "Content length is 0. Aborting request to remote model"
}
This is not a helpful message, and it is a 4xx error. At this point, as a customer, I won't have any idea what to do to fix this issue.
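As a hedged sketch of the reviewer's point, an actionable message would name the likely cause and the next step. The helper below is purely hypothetical and not part of this PR; the wording is one possible improvement, not the agreed fix.

```java
public class ErrorMessageSketch {

    // Hypothetical: an empty-payload error that tells the caller what to check.
    static String missingPayloadError(String modelId) {
        return "Request body generated for model [" + modelId + "] is empty. "
            + "Check that the connector's request body template and the input "
            + "parameters are set, then retry.";
    }

    public static void main(String[] args) {
        System.out.println(missingPayloadError("my-model"));
    }
}
```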
Approved to unblock the agent stream PR.
Please add more unit tests to increase coverage in the new PR.
In the follow-up PR, let's address the comments and also improve the code coverage. Thanks
Thanks, will address all comments and improve coverage in the agent stream PR |
Description
Add a new predict stream API as an experimental feature to support streaming predictions. The implementation currently uses transport-reactor-netty4 for REST streaming. This is a temporary solution, as we need to do more research to determine a long-term approach because transport-reactor-netty4 from OpenSearch core is still experimental.

This predict stream API currently supports 2 models (OpenAI and Bedrock, the two providers discussed in the review above):
API Endpoint:
Sample workflow:
Error handling:
- Validation errors (raised before streaming starts): returned as a regular HTTP error response.
- Remote service errors (raised after the streaming connection is established): the error message will be sent as one chunk and the connection will be closed.
Related Issues
Resolves #3630
Check List
Commits are signed per the DCO using --signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.