Pass Standard Tests #35

fileames · 2025-09-19T13:56:41Z

This PR makes the necessary changes to make sure our integrations pass the standard tests offered in langchain-tests.

Changes include:

- Previously, inserting documents with duplicate IDs could raise a unique constraint error and fail the entire batch. We now use batcherrors=True (https://python-oracledb.readthedocs.io/en/latest/user_guide/batch_statement.html#handling-data-errors) so per-row errors don’t invalidate other inserts. Only successfully inserted IDs are returned.
- Optional upsert behavior: Standard tests expect rows with duplicate IDs to be updated rather than erroring. To preserve backward compatibility, we introduced a constructor parameter mutate_on_duplicate:
  - False (default): preserve previous behavior (no updates on duplicate IDs).
  - True: update existing rows (texts, metadata, etc.) when duplicate IDs are provided.
- New methods: Added get_by_ids and aget_by_ids.
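
As a rough illustration of the batcherrors=True change, the sketch below inserts a batch and returns only the IDs of rows that did not fail. It is not the code from this PR: the function name insert_rows and the table and column names in INSERT_SQL are hypothetical, and the cursor is assumed to be a python-oracledb cursor. Only executemany(..., batcherrors=True) and getbatcherrors() come from python-oracledb.

```python
from typing import Any, Sequence

# Hypothetical table and column names, for illustration only.
INSERT_SQL = "INSERT INTO docs (id, text) VALUES (:1, :2)"


def insert_rows(cursor: Any, rows: Sequence[tuple[str, str]]) -> list[str]:
    """Insert rows and return the IDs of the rows that were actually inserted.

    `cursor` is assumed to be a python-oracledb cursor.
    """
    rows = list(rows)

    # With batcherrors=True, a failing row (e.g. a unique constraint
    # violation) is recorded instead of aborting the whole batch.
    cursor.executemany(INSERT_SQL, rows, batcherrors=True)

    # Each batch error carries the zero-based offset of the failing row.
    failed_offsets = {error.offset for error in cursor.getbatcherrors()}

    # Keep only the IDs whose row offset did not produce an error.
    return [row[0] for i, row in enumerate(rows) if i not in failed_offsets]
```

Because the failing offsets are collected after the call, the caller can still report which rows were skipped if that is useful.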
ID handling and hashing
- In our current implementation, when IDs aren’t provided to add_texts, we generate them via uuid.uuid4() and store a hashed version in a RAW column. Users need these generated IDs to call delete or get_by_ids, so add_texts is expected to return them.
- However, we returned the hashed versions, which do not work when passed to delete or get_by_ids, because those methods hash the given IDs again before searching the documents:

from langchain_core.documents import Document

# `store` is an already-initialized vector store instance.
original_documents = [
    Document(page_content="foo1", metadata={"id": "1"}),
    Document(page_content="bar2", metadata={"id": "2"}),
]
ids = store.add_documents(original_documents)
store.delete(ids)

assert len(store.similarity_search("foo", k=10)) == 0  # FAILS

This behaviour is fixed: the returned IDs are now the unhashed versions.
Documents returned by the similarity_search functions did not have the id field, because the original unhashed IDs were not saved to the DB. To keep the table structure the same for users with existing tables, the original IDs are now added to the metadata under the key "__orcl_internal_doc_id", which is then used to populate the id field of the returned Documents.
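
The ID round trip described above can be sketched with a small in-memory stand-in. This is a toy model, not the integration's code: ToyStore and hash_id are hypothetical names, a dict replaces the database table, and SHA-256 is an assumed stand-in for whichever hash the store actually applies. The only name taken from this PR is the metadata key "__orcl_internal_doc_id".

```python
import hashlib
import uuid

INTERNAL_ID_KEY = "__orcl_internal_doc_id"  # metadata key named in this PR


def hash_id(doc_id: str) -> bytes:
    # Assumption: SHA-256 stands in for whatever hash the store really uses.
    return hashlib.sha256(doc_id.encode("utf-8")).digest()


class ToyStore:
    """In-memory stand-in for the table; dict keys play the role of the RAW column."""

    def __init__(self) -> None:
        self.rows: dict[bytes, dict] = {}

    def add_texts(self, texts, ids=None):
        texts = list(texts)
        # Generate IDs when the caller does not supply them.
        ids = list(ids) if ids is not None else [str(uuid.uuid4()) for _ in texts]
        for text, doc_id in zip(texts, ids):
            # Store under the hashed ID, but keep the original ID in metadata.
            self.rows[hash_id(doc_id)] = {
                "text": text,
                "metadata": {INTERNAL_ID_KEY: doc_id},
            }
        # Return the unhashed IDs so they can be passed back to delete/get_by_ids.
        return ids

    def get_by_ids(self, ids):
        found = []
        for doc_id in ids:
            row = self.rows.get(hash_id(doc_id))  # lookup hashes the ID once
            if row is not None:
                found.append(
                    {"id": row["metadata"][INTERNAL_ID_KEY], "text": row["text"]}
                )
        return found

    def delete(self, ids):
        for doc_id in ids:
            self.rows.pop(hash_id(doc_id), None)
```

In this model, passing the IDs returned by add_texts to delete empties the store, whereas passing already-hashed IDs would be hashed a second time and match nothing, which is the failure mode described above.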

fileames · 2025-09-29T10:47:21Z

Hi @cjbj, if you have any comments, I'd be happy to address them.

fileames added 3 commits September 19, 2025 13:54

Make changes to pass standard tests

725d478

Add standard tests

9ccf755

Version change

5a48743

oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Sep 19, 2025

Use values clause in MERGE and add more comments

daff6b2
