This project provides a FastAPI-based API to manage vector database operations, including creating indexes, inserting (upserting) data, and performing similarity or ID-based searches. Currently, it supports Pinecone as the vector database provider.
- Create vector indexes in Pinecone.
- Upsert (insert or update) data into the vector database.
- Search for similar vectors or specific IDs with metadata filtering.
- Python 3.12
- FastAPI == 0.109.1
- Pinecone Client == 3.0.0
- Pydantic == 2.5.2
- Uvicorn == 0.24.0
- python-dotenv
- Docker (optional)
- Build the image:

```shell
docker build -t vector-db-api .
```

- Run the container:

```shell
docker run -d -p 9000:9000 --name vector-db-container vector-db-api
```

- Verify it is running: the API will be available at http://localhost:9000/docs
- Useful Docker commands:

```shell
# View container logs
docker logs vector-db-container

# Stop the container
docker stop vector-db-container

# Remove the container
docker rm vector-db-container
```
Install the dependencies using:
```shell
pip install -r requirements.txt
```
The following environment variables are required:
- `PINECONE_API_KEY`: Your Pinecone API key.
- `OPENAI_API_KEY`: Your OpenAI API key.
- `CHUNK_SIZE`: Size of text chunks for splitting (default: 1000)
- `CHUNK_OVERLAP`: Overlap between chunks (default: 200)
- `CHUNK_THRESHOLD`: Minimum text length to trigger splitting (default: 1000)
- `OPENAI_EMBEDDING_MODEL`: OpenAI embedding model to use (default: text-embedding-3-small)
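A minimal `.env` sketch with these variables (the key values are placeholders; the chunking settings mirror the defaults listed above):

```shell
# Required API keys (placeholder values)
PINECONE_API_KEY=your-pinecone-api-key
OPENAI_API_KEY=your-openai-api-key

# Optional chunking settings (defaults shown)
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
CHUNK_THRESHOLD=1000
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
```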
The API automatically splits long texts using a recursive character splitter to:
- Maintain semantic coherence
- Optimize embedding quality
- Handle large documents efficiently
- Preserve metadata across chunks
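The size/overlap/threshold mechanics can be sketched in plain Python. The real service uses a recursive character splitter; this simplified stand-in only illustrates how the defaults above interact:

```python
# Simplified sketch of the chunking behavior (not the project's actual splitter).
def split_text(text: str, chunk_size: int = 1000,
               chunk_overlap: int = 200, threshold: int = 1000) -> list[str]:
    """Return overlapping chunks; texts at or below the threshold pass through."""
    if len(text) <= threshold:
        return [text]
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
        start += chunk_size - chunk_overlap  # step forward, keeping the overlap
    return chunks
```

For example, a 2,500-character document yields three chunks of 1000, 1000, and 900 characters, each sharing 200 characters with its neighbor.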
Uses text-embedding-3-small for optimal performance:
- Cost-effective: ~10x cheaper than alternatives
- High quality: Superior semantic search results
- 1536 dimensions: Perfect balance of performance and accuracy
- 8,191 token limit: Handles large texts efficiently
Texts longer than 1000 characters are automatically split into manageable chunks with proper metadata tracking.
The API implements a "replace" strategy instead of simple append:
- Clean existing data: Before inserting new chunks, all existing chunks for the same document ID are automatically deleted
- Prevent orphaned chunks: No leftover chunks from previous versions of the same document
- Unique chunk IDs: Each chunk gets a unique ID with timestamp to prevent collisions
- Metadata consistency: All vectors (chunked or not) include `original_id`, `chunk_index`, `total_chunks`, and `created_at` metadata
Example workflow:
- Document "doc1" first upload → `doc1_chunk_0_1234567890`, `doc1_chunk_1_1234567890`
- Document "doc1" update → Automatically deletes previous chunks, creates new ones
- No manual cleanup needed
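The replace strategy can be sketched against an in-memory store. The store and helper names here are illustrative stand-ins, not the project's code:

```python
import time

# In-memory sketch of the "replace" upsert strategy described above.
store: dict[str, dict] = {}  # vector_id -> metadata

def upsert_document(doc_id: str, chunks: list[str]) -> None:
    # 1. Delete existing chunks for this document so no orphans remain
    stale = [vid for vid, meta in store.items() if meta["original_id"] == doc_id]
    for vid in stale:
        del store[vid]
    # 2. Insert new chunks under unique, timestamped IDs
    ts = int(time.time())
    for i, chunk in enumerate(chunks):
        store[f"{doc_id}_chunk_{i}_{ts}"] = {
            "original_id": doc_id,
            "chunk_index": i,
            "total_chunks": len(chunks),
            "created_at": ts,
            "text": chunk,
        }

upsert_document("doc1", ["chunk A", "chunk B"])  # first upload: two vectors
upsert_document("doc1", ["chunk C"])             # update: old chunks removed first
```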
Dedicated namespace service for optimal performance:
- `POST /api/namespace/ensure_namespace` - Create/verify namespace before operations
- `GET /api/namespace/namespace_stats/{index}/{namespace}` - Get namespace statistics
- `POST /api/namespace/validate_namespace` - Validate namespace format
- Separated responsibilities: Namespace operations are independent from upsert
- Optional validation: Validate only when needed, not on every operation
- Faster upserts: Removed heavy validations from critical path
- Lazy creation: Namespaces created automatically during first upsert
Usage pattern:

```
# Optional: pre-validate the namespace (recommended for production)
POST /api/namespace/ensure_namespace
{
    "index_name": "my-index",
    "namespace": "production"
}

# Then proceed with fast upserts
POST /api/ms/vector-db/upsert_data/pinecone/my-index
```
Validation rules:
- ✅ Valid: `"production"`, `"test-env"`, `"user_123"`
- ❌ Invalid: `"test space"`, `"special@chars"`
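The exact validation rule is not documented; the pattern below is an assumption inferred from the valid/invalid examples above (letters, digits, hyphens, and underscores only):

```python
import re

# Assumed namespace format rule, inferred from the examples in this README:
# one or more letters, digits, hyphens, or underscores; no spaces or symbols.
NAMESPACE_RE = re.compile(r"^[A-Za-z0-9_-]+$")

def is_valid_namespace(namespace: str) -> bool:
    return bool(NAMESPACE_RE.match(namespace))
```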
Run the FastAPI server using Uvicorn:
```shell
uvicorn main:app --reload
```
To test the endpoints more easily, you can use our Postman collection: Vector DB API Collection
Creates a vector index in Pinecone.
```shell
curl --location 'http://localhost:8000/vector-db/create_index/pinecone' \
--header 'Content-Type: application/json' \
--data '{
    "index_name": "startup",
    "dimension": 1536,
    "metric": "cosine",
    "cloud": "aws",
    "region": "us-east-1"
}'
```
Note: Use `1536` dimensions for compatibility with OpenAI's `text-embedding-3-small` model.
Upserts (inserts or updates) data into the specified namespace of a Pinecone index.
```shell
curl --location 'http://localhost:8000/vector-db/upsert_data/pinecone/startup' \
--header 'Content-Type: application/json' \
--data '{
    "namespace": "products",
    "records": [
        {
            "id": "pc1",
            "data": {"name": "computadora mac", "price": 213131.03131},
            "metadata": {"tags": ["mac"], "category": "tech", "price": 2000, "id": "pc1"}
        },
        {
            "id": "pc2",
            "data": {"name": "computadora linux", "price": 200.03131},
            "metadata": {"tags": ["linux"], "category": "tech", "price": 1000, "id": "pc2"}
        }
    ]
}'
```
Performs a similarity search with optional metadata filters.
```shell
curl --location 'http://localhost:8000/vector-db/search/pinecone/startup' \
--header 'Content-Type: application/json' \
--data '{
    "query": "quiero comprar un pc",
    "top_k": 2,
    "ids": ["pc1", "pc2"],
    "namespace": "products",
    "metadata_filter": {
        "category": {
            "$eq": "tech"
        },
        "price": {
            "$gt": 200
        },
        "id": {
            "$eq": "pc1"
        }
    }
}'
```
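The `$eq`/`$gt` filter semantics can be illustrated with a tiny pure-Python matcher. This mimics the operator behavior for the example records above; it is not Pinecone's actual filtering engine:

```python
# Illustrative matcher for Pinecone-style $eq/$gt metadata filters.
OPS = {
    "$eq": lambda actual, expected: actual == expected,
    "$gt": lambda actual, expected: actual > expected,
}

def matches(metadata: dict, metadata_filter: dict) -> bool:
    """Return True only if every filter condition holds for the metadata."""
    for field, conditions in metadata_filter.items():
        for op, expected in conditions.items():
            if field not in metadata or not OPS[op](metadata[field], expected):
                return False
    return True

pc1 = {"tags": ["mac"], "category": "tech", "price": 2000, "id": "pc1"}
pc2 = {"tags": ["linux"], "category": "tech", "price": 1000, "id": "pc2"}
search_filter = {
    "category": {"$eq": "tech"},
    "price": {"$gt": 200},
    "id": {"$eq": "pc1"},
}
```

With this filter, `pc1` satisfies all three conditions while `pc2` fails the `id` check, so only `pc1` can appear in the results.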
This project is licensed under the MIT License.