Conversation

zhichao-aws
Member

Description

See #3865 for more details

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@yuye-aws
Member

yuye-aws commented Jul 8, 2025

@zhichao-aws The code looks OK to me. Can you provide some example calls and expected results so that other code reviewers can better understand the context?

@zhichao-aws
Member Author

zhichao-aws commented Jul 9, 2025

@zhichao-aws The code looks OK to me. Can you provide some example calls and expected results so that other code reviewers can better understand the context?

Here are some example calls and responses (using the sparse tokenizer):

POST /_ml/models/{{model_id}}/_predict
{
    "text_docs": ["hello world"],
    "parameters":{
        "sparse_embedding_format": "lexical"
    }
}

{
    "inference_results": [
        {
            "output": [
                {
                    "dataAsMap": {
                        "response": [
                            {
                                "world": 3.4208686,
                                "hello": 6.9377565
                            }
                        ]
                    }
                }
            ]
        }
    ]
}

POST /_ml/models/{{model_id}}/_predict
{
    "text_docs": ["hello world"],
    "parameters":{
        "sparse_embedding_format": "token_id"
    }
}


{
    "inference_results": [
        {
            "output": [
                {
                    "dataAsMap": {
                        "response": [
                            {
                                "2088": 3.4208686,
                                "7592": 6.9377565
                            }
                        ]
                    }
                }
            ]
        }
    ]
}

If the parameter is not set, the behavior is the same as before.
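
For reviewers who want to compare the two formats side by side, here is a minimal Python sketch. In the example above the two responses carry the same weights and differ only in their keys. The helper names `build_predict_body` and `token_ids_to_lexical` and the two-entry `ID_TO_TOKEN` table are hypothetical; the table is read off the example responses and is not the tokenizer's real vocabulary file.

```python
from typing import Dict, List, Optional

# Hypothetical two-entry vocabulary, read off the example responses above.
ID_TO_TOKEN: Dict[str, str] = {"7592": "hello", "2088": "world"}


def build_predict_body(text_docs: List[str],
                       sparse_embedding_format: Optional[str] = None) -> dict:
    """Build a _predict request body.

    When no format is given, `parameters` is omitted entirely, which
    corresponds to the unchanged default behavior.
    """
    body: dict = {"text_docs": text_docs}
    if sparse_embedding_format is not None:
        body["parameters"] = {"sparse_embedding_format": sparse_embedding_format}
    return body


def token_ids_to_lexical(weights: Dict[str, float],
                         id_to_token: Dict[str, str]) -> Dict[str, float]:
    """Re-key a token_id-format sparse vector by token string."""
    return {id_to_token[token_id]: weight for token_id, weight in weights.items()}


token_id_response = {"2088": 3.4208686, "7592": 6.9377565}
lexical_response = token_ids_to_lexical(token_id_response, ID_TO_TOKEN)
```

With the example values, `lexical_response` has the same entries as the `lexical` response shown above.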

@xinyual
Collaborator

xinyual commented Jul 9, 2025

LGTM.

xinyual
xinyual previously approved these changes Jul 9, 2025
zane-neo
zane-neo previously approved these changes Jul 9, 2025
@mingshl
Collaborator

mingshl commented Jul 15, 2025

reran flaky test

REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:integTest' --tests 'org.opensearch.ml.tools.VisualizationsToolIT.testVisualizationNotFound' -Dtests.seed=A97A45F4EFD5E78D -Dtests.security.manager=false -Dtests.locale=ln-CD -Dtests.timezone=Canada/Saskatchewan -Druntime.java=21
    [2025-07-14T11:52:01,321][INFO ][o.o.m.t.VisualizationsToolIT] [testVisualizationNotFound] after test

VisualizationsToolIT > testVisualizationNotFound STANDARD_ERROR
    REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:integTest' --tests 'org.opensearch.ml.tools.VisualizationsToolIT.testVisualizationNotFound' -Dtests.seed=A97A45F4EFD5E78D -Dtests.security.manager=false -Dtests.locale=ln-CD -Dtests.timezone=Canada/Saskatchewan -Druntime.java=21

VisualizationsToolIT > testVisualizationNotFound FAILED
    org.opensearch.client.ResponseException: method [POST], host [http://[::1]:35175], URI [/_plugins/_ml/agents/nGoQCpgB7hs5uhBrpFNw/_execute], status line [HTTP/1.1 500 Internal Server Error]
    {"status":500,"error":{"type":"NotSerializableExceptionWrapper","reason":"System Error","details":"a_e_a_d_bad_tag_exception: Tag mismatch"}}
        at __randomizedtesting.SeedInfo.seed([A97A45F4EFD5E78D:6619E169E8E6DD8D]:0)
        at app//org.opensearch.client.RestClient.convertResponse(RestClient.java:501)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:384)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:359)
        at app//org.opensearch.ml.utils.TestHelper.makeRequest(TestHelper.java:198)
        at app//org.opensearch.ml.utils.TestHelper.makeRequest(TestHelper.java:171)
        at app//org.opensearch.ml.utils.TestHelper.makeRequest(TestHelper.java:160)
        at app//org.opensearch.ml.tools.VisualizationsToolIT.testVisualizationNotFound(VisualizationsToolIT.java:62)

related to this issue #2560 @Hailong-am

@Hailong-am
Contributor

reran flaky test


related to this issue #2560 @Hailong-am

@mingshl This is a new issue. I checked the logs from https://github.com/opensearch-project/ml-commons/actions/runs/16163806560/job/45694480886; the actual error is:

Caused by: org.opensearch.core.common.io.stream.NotSerializableExceptionWrapper: a_e_a_d_bad_tag_exception: Tag mismatch
»  	at com.sun.crypto.provider.GaloisCounterMode$GCMDecrypt.doFinal(GaloisCounterMode.java:1545) ~[?:?]
»  	at com.sun.crypto.provider.GaloisCounterMode.engineDoFinal(GaloisCounterMode.java:417) ~[?:?]
»  	at javax.crypto.Cipher.doFinal(Cipher.java:2244) ~[?:?]
»  	at com.amazonaws.encryptionsdk.internal.JceKeyCipher.decryptKey(JceKeyCipher.java:129) ~[?:?]
»  	at com.amazonaws.encryptionsdk.jce.JceMasterKey.decryptDataKey(JceMasterKey.java:165) ~[?:?]

Looking at the code, this happens when the agent is executed: it decrypts the credentials for the connector. My first assumption is that the master key has changed or is not synced between nodes; that will need further investigation.
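
To illustrate why a changed or unsynced key would surface as a tag mismatch rather than as garbled output: authenticated encryption verifies a tag derived from the key, so a different key fails verification outright. The sketch below is only an analogy. It uses HMAC-SHA256 from the Python standard library as a stand-in for the GCM authentication tag, it is not the AWS Encryption SDK code path in the stack trace, and the key and variable names are hypothetical.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an authentication tag over the message under the given key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# Hypothetical keys: node A sealed the credential, node B holds a different key.
key_on_node_a = b"a" * 32
key_on_node_b = b"b" * 32

sealed_credential = b"encrypted connector credential"
tag = make_tag(key_on_node_a, sealed_credential)

same_key_ok = verify_tag(key_on_node_a, sealed_credential, tag)   # True
other_key_ok = verify_tag(key_on_node_b, sealed_credential, tag)  # False
```

If the assumption above holds, the `other_key_ok` case is the analogue of the `Tag mismatch` error in the log.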

codecov bot commented Jul 17, 2025

Codecov Report

Attention: Patch coverage is 90.52632% with 9 lines in your changes missing coverage. Please review.

Project coverage is 80.66%. Comparing base (93bc9a3) to head (1157edf).
Report is 2 commits behind head on main.

Files with missing lines                                Patch %   Lines
...xtembedding/AsymmetricTextEmbeddingParameters.java   75.67%    4 Missing and 5 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3963      +/-   ##
============================================
+ Coverage     80.64%   80.66%   +0.01%     
- Complexity     7976     7994      +18     
============================================
  Files           694      695       +1     
  Lines         34938    35002      +64     
  Branches       3899     3919      +20     
============================================
+ Hits          28177    28233      +56     
- Misses         5030     5035       +5     
- Partials       1731     1734       +3     
Flag         Coverage Δ
ml-commons   80.66% <90.52%> (+0.01%) ⬆️


@Hailong-am
Contributor

Looking at the code, this happens when the agent is executed: it decrypts the credentials for the connector. My first assumption is that the master key has changed or is not synced between nodes; that will need further investigation.

@mingshl raised a PR to fix the test failure #3989

@xinyual xinyual merged commit 85bbcb1 into opensearch-project:main Jul 21, 2025
20 of 22 checks passed