Conversation

tkykenmt
Contributor

@tkykenmt tkykenmt commented Oct 7, 2025

Description

Add blueprint examples for semantic search using Cohere Embed v4.

Related Issues

Resolves #4271

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@tkykenmt tkykenmt requested a deployment to ml-commons-cicd-env-require-approval October 7, 2025 14:07 — with GitHub Actions Waiting
@mingshl
Collaborator

mingshl commented Oct 7, 2025

@tkykenmt Are you trying to write a post-processing function that converts the model output into the same format as the local model output?

@tkykenmt
Contributor Author

tkykenmt commented Oct 8, 2025

@mingshl

The response format of Cohere Embed v4 differs from v3, so the existing post-processing function does not work with v4. I therefore wrote a custom post-processing function based on an existing, similar blueprint:

https://docs.opensearch.org/latest/tutorials/vector-search/vector-operations/semantic-search-byte-vectors/

That tutorial shows how to use quantized vectors from Cohere Embed v3, and its approach can be adapted to v4 with some changes.
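
To illustrate the kind of reshaping a post-processing step has to do, here is a minimal Python sketch. It is not the script from this PR. It assumes, for illustration only, that a v4-style response nests vectors under an embedding-type key (for example `"float"` or `"int8"`) while a v3-style response returns a plain list of vectors, and that the target is a list of `sentence_embedding` tensor entries like those built in the byte-vector tutorial linked above. The function name `to_tensor_entries` and the type mapping are hypothetical.

```python
from typing import Any, Dict, List

# Assumed mapping from embedding type to tensor data type (illustrative only).
DATA_TYPES = {"float": "FLOAT32", "int8": "INT8"}


def to_tensor_entries(response: Dict[str, Any],
                      embedding_type: str = "float") -> List[Dict[str, Any]]:
    """Reshape an embed response into sentence_embedding tensor entries."""
    if embedding_type not in DATA_TYPES:
        raise ValueError(f"unsupported embedding type: {embedding_type}")

    embeddings = response["embeddings"]
    if isinstance(embeddings, dict):
        # Assumed v4-style shape: {"embeddings": {"float": [[...], ...]}}
        vectors = embeddings[embedding_type]
    else:
        # Assumed v3-style shape: {"embeddings": [[...], ...]}
        vectors = embeddings

    # One tensor entry per input text.
    return [
        {
            "name": "sentence_embedding",
            "data_type": DATA_TYPES[embedding_type],
            "shape": [len(vector)],
            "data": list(vector),
        }
        for vector in vectors
    ]
```

Accepting both shapes in one function is only a convenience for the sketch; a real connector would target whichever shape the model actually returns.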

@mingshl
Collaborator

mingshl commented Oct 8, 2025

@tkykenmt Would you consider adding a new built-in post-processing function? That would be convenient and consistent with the previous blueprints.

Earlier blueprints used:

            "pre_process_function": "connector.pre_process.cohere.embedding",
            "post_process_function": "connector.post_process.cohere.embedding"

but these do not work for the v4 model.

How about creating a new set?

            "pre_process_function": "connector.pre_process.cohere.embedding_v4",
            "post_process_function": "connector.post_process.cohere.embedding_v4"

Or please suggest other names.

Development

Successfully merging this pull request may close these issues.

[DOC] Add blueprints for Cohere Embed v4 on Amazon Bedrock

2 participants