
Conversation

@nathaliellenaa commented Sep 26, 2025

Description

Add a new agent execute stream API as an experimental feature to support agent streaming.

Supported agent and model combinations for this agent execute stream API:

  • Conversational agent - OpenAI Chat Completion model
  • Conversational agent - Bedrock Converse Stream model

Note: This PR depends on the predict stream implementation #4187

API Endpoint:

POST /_plugins/_ml/agents/{agent_id}/_execute/stream

Sample workflow:

// Enable feature flag
PUT /_cluster/settings
{
    "persistent": {
        "plugins.ml_commons.stream_enabled": true
    }
}

// Register OpenAI chat completion model
POST /_plugins/_ml/models/_register
{
    "name": "openai gpt 3.5 turbo",
    "function_name": "remote",
    "description": "openai model",
    "connector": {
        "name": "OpenAI Chat Connector",
        "description": "The connector to public OpenAI model service for GPT 3.5",
        "version": 1,
        "protocol": "http",
        "parameters": {
            "endpoint": "api.openai.com",
            "model": "gpt-3.5-turbo"
        },
        "credential": {
            "openAI_key": "<your_api_key>"
        },
        "actions": [{
            "action_type": "predict",
            "method": "POST",
            "url": "https://${parameters.endpoint}/v1/chat/completions",
            "headers": {
                "Authorization": "Bearer ${credential.openAI_key}"
            },
            "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": [{\"role\":\"developer\",\"content\":\"${parameters.system_prompt}\"},${parameters._chat_history:-}{\"role\":\"user\",\"content\":\"${parameters.prompt}\"}${parameters._interactions:-}]${parameters.tool_configs:-} }"
        }]
    }
}

// Register conversational agent using OpenAI model
POST /_plugins/_ml/agents/_register
{
    "name": "Chat Agent with RAG",
    "type": "conversational",
    "description": "this is a test agent",
    "llm": {
        "model_id": "<model id created in previous step>",
        "parameters": {
            "max_iteration": 5,
            "system_prompt": "You are a helpful assistant. You are able to assist with a wide range of tasks, from answering simple questions to providing in-depth explanations and discussions on a wide range of topics.\nIf the question is complex, you will split it into several smaller questions, and solve them one by one. For example, the original question is:\nhow many orders in last three month? Which month has highest?\nYou will spit into several smaller questions:\n1.Calculate total orders of last three month.\n2.Calculate monthly total order of last three month and calculate which months order is highest. You MUST use the available tools everytime to answer the question",
            "prompt": "${parameters.question}"
        }
    },
    "memory": {
        "type": "conversation_index"
    },
    "parameters": {
        "_llm_interface": "openai/v1/chat/completions"
    },
    "tools": [
        {
            "type": "IndexMappingTool",
            "name": "DemoIndexMappingTool",
            "parameters": {
                "index": "${parameters.index}",
                "input": "${parameters.question}"
            }
        },
        {
            "type": "ListIndexTool",
            "name": "RetrieveIndexMetaTool",
            "description": "Use this tool to get OpenSearch index information: (health, status, index, uuid, primary count, replica count, docs.count, docs.deleted, store.size, primary.store.size)."
        }
    ],
    "app_type": "chat_with_rag"
}

// Run agent execute stream API
POST /_plugins/_ml/agents/evCIh5kB66VN-aC_0aNf/_execute/stream
{
    "parameters": {
        "question": "How many indices are in my cluster?"
    }
}

// Sample response
data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":"[{\"index\":0.0,\"id\":\"call_HjpbrbdQFHK0omPYa6m2DCot\",\"type\":\"function\",\"function\":{\"name\":\"RetrieveIndexMetaTool\",\"arguments\":\"\"}}]","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":"[{\"index\":0.0,\"function\":{\"arguments\":\"{}\"}}]","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":"{\"choices\":[{\"message\":{\"tool_calls\":[{\"type\":\"function\",\"function\":{\"name\":\"RetrieveIndexMetaTool\",\"arguments\":\"{}\"},\"id\":\"call_HjpbrbdQFHK0omPYa6m2DCot\"}]},\"finish_reason\":\"tool_calls\"}]}","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":"","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":"row,health,status,index,uuid,pri(number of primary shards),rep(number of replica shards),docs.count(number of available documents),docs.deleted(number of deleted documents),store.size(store size of primary and replica shards),pri.store.size(store size of primary shards)\n1,green,open,.plugins-ml-model-group,Msb1Y4W5QeiLs5yUQi-VRg,1,1,2,0,17.1kb,5.9kb\n2,green,open,.plugins-ml-memory-message,1IWd1HPeSWmM29qE6rcj_A,1,1,658,0,636.4kb,313.5kb\n3,green,open,.plugins-ml-memory-meta,OETb21fqQJa3Y2hGQbknCQ,1,1,267,7,188kb,93.9kb\n4,green,open,.plugins-ml-config,0mnOWX5gSX2s-yP27zPFNw,1,1,1,0,8.1kb,4kb\n5,green,open,.plugins-ml-model,evYOOKN4QPqtmUjxsDwJYA,1,1,5,5,421.5kb,210.7kb\n6,green,open,.plugins-ml-agent,I0SpBovjT3C6NABCBzGiiQ,1,1,6,0,205.5kb,111.3kb\n7,green,open,.plugins-ml-task,_Urzn9gdSuCRqUaYAFaD_Q,1,1,100,4,136.1kb,45.3kb\n8,green,open,top_queries-2025.09.26-00444,jb7Q1FiLSl-wTxjdSUKs_w,1,1,1736,126,1.8mb,988kb\n9,green,open,.plugins-ml-connector,YaJORo4jT0Ksp24L5cW1uA,1,1,2,0,97.8kb,48.9kb\n","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":"There","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":" are","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":" ","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":"9","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":" indices","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":" in","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":" your","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":" cluster","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":".","is_last":false}}]}]}

data: {"inference_results":[{"output":[{"name":"memory_id","result":"LvU1iJkBCzHrriq5hXbN"},{"name":"parent_interaction_id","result":"L_U1iJkBCzHrriq5hXbs"},{"name":"response","dataAsMap":{"content":"","is_last":true}}]}]}

Error handling:
[In progress]

Related Issues

Resolves #3630 ([RFC] Remote Model Inference Streaming)

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@dhrubo-os

Resolve the conflict. Rebase from main?

@nathaliellenaa

Will push another commit to address comments from the predict stream PR. I just cleaned up this PR to only include the commits related to agent streaming.

@pyek-bot commented Oct 1, 2025

@nathaliellenaa Cool, it's a separate API anyway. In the future we should default to streaming for both predict and agents; in that case we can expose two flags.

Signed-off-by: Nathalie Jonathan <[email protected]>

@nathaliellenaa

The failing CI checks seem unrelated; I ran both ITs locally and they pass:

RestCohereInferenceIT > test_cohereInference_withDifferent_postProcessFunction FAILED
    java.lang.AssertionError: failed to run test with test name: connector.post_process.cohere_v2.embedding.float_test
        at __randomizedtesting.SeedInfo.seed([EA9F7EF19FCC2013:1911E683B8CE553B]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.assertTrue(Assert.java:42)
        at org.opensearch.ml.rest.RestCohereInferenceIT.test_cohereInference_withDifferent_postProcessFunction(RestCohereInferenceIT.java:77)


RestMLRAGSearchProcessorIT > testBM25WithBedrockConverse FAILED
    org.opensearch.client.ResponseException: method [POST], host [http://[::1]:33109], URI [/test/_search?size=5&search_pipeline=pipeline_test], status line [HTTP/1.1 429 Too Many Requests]
    {"error":{"root_cause":[{"type":"remote_connector_throttling_exception","reason":"Error from remote service: The request was denied due to remote server throttling. To change the retry policy and behavior, please update the connector client_config."}],"type":"remote_connector_throttling_exception","reason":"Error from remote service: The request was denied due to remote server throttling. To change the retry policy and behavior, please update the connector client_config."},"status":429}

@dhrubo-os

Approving to merge the PR by 2:00 PM.

@ylwu-amzn merged commit b9b5687 into opensearch-project:main on Oct 1, 2025
9 of 17 checks passed
