
[BUG] Wrong database type checking for embedding_column in PGVectorStore.create #241

@NanoClem


Hello!

I am trying to use the latest features of langchain-postgres, as they perfectly fit my need to reuse a pre-existing table.

I think I have stumbled upon a bug when instantiating PGVectorStore (AsyncPGVectorStore) through its create class method, implemented in langchain_postgres/v2/async_vectorstore.py.

Specs

I am running my database in Docker using the pgvector/pgvector:pg17 image.

Python specs:

  • python 3.10.12
  • langchain>=0.3.27
  • langchain-postgres>=0.0.15
  • psycopg[binary]>=3.2.9
  • pgvector<0.4
  • asyncpg>=0.30.0

Issue

If I change "USER-DEFINED" to "vector" in the embedding column type check shown below, it works without any issues.
This looks like a reasonable fix, but since I am not part of the dev team, there may be something I am missing.

# langchain_postgres/v2/async_vectorstore.py

class AsyncPGVectorStore(VectorStore):
    # ...

    @classmethod
    async def create(cls, ...) -> AsyncPGVectorStore:
        # ...
        
        # sql query columns data from information_schema
        stmt = "SELECT column_name, data_type FROM information_schema.columns WHERE table_name = :table_name AND table_schema = :schema_name"
        async with engine._pool.connect() as conn:
            result = await conn.execute(
                text(stmt),
                {"table_name": table_name, "schema_name": schema_name},
            )
            result_map = result.mappings()
            results = result_map.fetchall()
        columns = {}
        for field in results:
            columns[field["column_name"]] = field["data_type"]  # data_type is "vector" for embedding_column

        # ...

        if columns[embedding_column] != "USER-DEFINED":   # does not match the value read from information_schema above
            raise ValueError(
                f"Embedding column, {embedding_column}, is not type Vector."
            )

        # ...
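
To help narrow this down, here is a minimal sketch of a more tolerant check. It relies on the fact that PostgreSQL's information_schema.columns also exposes a udt_name column, which holds the underlying type name when data_type is "USER-DEFINED". The names DIAGNOSTIC_STMT and is_vector_column are hypothetical and not part of langchain-postgres. The sketch accepts both the value the library expects and the value reported above, so it is only an illustration of one possible approach, not the maintainers' fix.

```python
from typing import Mapping, Optional

# Hypothetical diagnostic query (assumption, not the library's statement):
# same as the query in create(), but it also selects udt_name so the
# underlying type name is visible next to data_type.
DIAGNOSTIC_STMT = (
    "SELECT column_name, data_type, udt_name "
    "FROM information_schema.columns "
    "WHERE table_name = :table_name AND table_schema = :schema_name"
)


def is_vector_column(row: Mapping[str, Optional[str]]) -> bool:
    """Return True if an information_schema.columns row looks like a vector column.

    Hypothetical helper for illustration only. It accepts either:
      * data_type == "vector" (the value observed in this report), or
      * data_type == "USER-DEFINED" with udt_name == "vector".
    """
    data_type = (row.get("data_type") or "").lower()
    udt_name = (row.get("udt_name") or "").lower()
    if data_type == "vector":
        return True
    return data_type == "user-defined" and udt_name == "vector"
```

Running DIAGNOSTIC_STMT against the affected table and pasting the row for the embedding column should show which of the two cases applies in a given setup.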

Thanks for the work put into the library!
