VectorDB provides unified wrappers for five production-grade vector databases, each optimized for different deployment scenarios and feature requirements.

Supported backends

Pinecone

Managed cloud with namespace-based multi-tenancy and serverless scale

Weaviate

Cloud or self-hosted with hybrid search and native BM25

Chroma

Local or HTTP for development and lightweight prototyping

Milvus

Self-hosted or managed with partition-key multi-tenancy

Qdrant

Self-hosted or cloud with named vectors and quantization

Backend comparison

| Backend | Client Type | Deployment | Best For |
|---|---|---|---|
| Pinecone | Managed cloud (gRPC) | Serverless / pod-based | Namespace multi-tenancy, auto-scaling, minimal ops |
| Weaviate | Cloud or self-hosted | Managed / Docker / Kubernetes | Hybrid + generative search, native BM25, flexible schema |
| Chroma | Local or HTTP | In-process / Docker | Development, prototyping, local testing |
| Milvus / Zilliz | Self-hosted or managed | Docker / Kubernetes / Cloud | Partition-key multi-tenancy, large-scale infrastructure |
| Qdrant | Self-hosted or cloud | Docker / Kubernetes / Cloud | Named vectors, quantization, advanced payload filtering |

Pinecone

Architecture

Pinecone uses lazy client initialization with gRPC transport for production-grade performance. Key features:
  • Serverless and pod-based deployment models
  • Namespace-based multi-tenancy (lightweight, per-query scoping)
  • Automatic metadata flattening for nested dictionaries
  • Batch processing with progress tracking
  • Support for dense and sparse (hybrid) vectors
File: src/vectordb/databases/pinecone.py
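The batch-processing behavior can be illustrated with a simple chunking helper (a hypothetical sketch, not the wrapper's internal code):

```python
from typing import Iterator, List, TypeVar

T = TypeVar("T")

def batched(items: List[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield fixed-size chunks so a large upsert can be sent batch by batch."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# 250 documents go out as batches of 100, 100, and 50,
# which is also where progress tracking hooks in.
documents = [{"id": str(i)} for i in range(250)]
sizes = [len(batch) for batch in batched(documents)]
```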

Connection

```python
from vectordb.databases.pinecone import PineconeVectorDB

db = PineconeVectorDB(
    api_key="pc-xxx",
    index_name="docs"
)
```

Index creation

```python
db.create_index(
    dimension=768,
    metric="cosine",
    spec={"serverless": {"cloud": "aws", "region": "us-east-1"}}
)
```

Multi-tenancy

Pinecone uses namespaces for logical data isolation:

```python
# Upsert to tenant namespace
db.upsert(documents, namespace="tenant_1")

# Query within namespace
results = db.query(vector=embedding, namespace="tenant_1")
```

Metadata constraints

Pinecone requires scalar metadata values (str, int, float, bool) or lists of strings. Nested dictionaries are automatically flattened:

```python
# Input metadata
{"user": {"id": 123, "name": "John"}}

# Flattened for Pinecone
{"user_id": 123, "user_name": "John"}
```
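A flattening step like the one shown can be sketched as follows (illustrative helper; `flatten_metadata` is not part of the wrapper's public API):

```python
def flatten_metadata(metadata: dict, separator: str = "_") -> dict:
    """Recursively flatten nested dicts into the scalar keys Pinecone accepts."""
    flat = {}
    for key, value in metadata.items():
        if isinstance(value, dict):
            for sub_key, sub_value in flatten_metadata(value, separator).items():
                flat[f"{key}{separator}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat

result = flatten_metadata({"user": {"id": 123, "name": "John"}})
# {'user_id': 123, 'user_name': 'John'}
```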

Weaviate

Architecture

Weaviate uses eager connection initialization with collection-centric design. Key features:
  • Native hybrid search (vector + BM25)
  • Generative search (RAG with OpenAI, Cohere, etc.)
  • Tenant-based multi-tenancy (full isolation)
  • Flexible schema with property types
  • Query-time reranking
File: src/vectordb/databases/weaviate.py

Connection

```python
from vectordb.databases.weaviate import WeaviateVectorDB

db = WeaviateVectorDB(
    cluster_url="https://my-cluster.weaviate.cloud",
    api_key="weaviate-api-key"
)
```

Collection creation

```python
db.create_collection(
    collection_name="Articles",
    enable_multi_tenancy=True
)
```

Multi-tenancy

Weaviate uses tenants for strong data isolation:

```python
# Create tenants
db.create_tenants(["tenant_a", "tenant_b"])

# Switch context to tenant
db.with_tenant("tenant_a").upsert(documents)

# Query within tenant context
results = db.with_tenant("tenant_a").query(vector=embedding)
```
Hybrid search

Weaviate natively supports BM25 keyword ranking alongside vector search:

```python
results = db.hybrid_search(
    query="artificial intelligence",
    vector=embedding,
    top_k=10,
    alpha=0.5  # 1.0 = vector only, 0.0 = BM25 only
)
```
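The effect of `alpha` can be illustrated as a convex combination of normalized scores (a simplified model of the blending; Weaviate's actual fusion algorithms normalize and merge result sets internally):

```python
def fuse_scores(vector_score: float, bm25_score: float, alpha: float) -> float:
    """Blend two normalized relevance scores: alpha=1.0 keeps only the
    vector score, alpha=0.0 keeps only the BM25 score."""
    return alpha * vector_score + (1 - alpha) * bm25_score

# A document that matches keywords strongly but embeddings weakly
# gains rank as alpha decreases.
balanced = fuse_scores(vector_score=0.2, bm25_score=0.9, alpha=0.5)
```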

Metadata filtering

Weaviate supports MongoDB-style filters:

```python
results = db.query(
    vector=embedding,
    filters={
        "category": "tech",
        "published": {"$gte": "2024-01-01"}
    }
)
```
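The filter semantics can be modeled with a small matcher (a hypothetical sketch of the operator logic, not the wrapper's implementation): plain values mean equality, and nested dicts hold comparison operators.

```python
def matches(metadata: dict, filters: dict) -> bool:
    """Evaluate MongoDB-style filters against a metadata dict."""
    ops = {
        "$gt": lambda a, b: a > b, "$gte": lambda a, b: a >= b,
        "$lt": lambda a, b: a < b, "$lte": lambda a, b: a <= b,
    }
    for field, condition in filters.items():
        value = metadata.get(field)
        if isinstance(condition, dict):
            # Every operator in the condition must hold.
            if not all(ops[op](value, bound) for op, bound in condition.items()):
                return False
        elif value != condition:
            return False
    return True

doc = {"category": "tech", "published": "2024-03-01"}
ok = matches(doc, {"category": "tech", "published": {"$gte": "2024-01-01"}})  # True
```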

Chroma

Architecture

Chroma supports both in-process (ephemeral/persistent) and HTTP client modes. Key features:
  • Zero-setup local development
  • In-memory or persistent storage
  • HTTP server for remote access
  • Native metadata filtering
  • Embedding function integration
File: src/vectordb/databases/chroma.py

Connection

```python
from vectordb.databases.chroma import ChromaVectorDB

# In-process persistent
db = ChromaVectorDB(persist_directory="./chroma_db")

# HTTP client
db = ChromaVectorDB(
    host="localhost",
    port=8000,
    client_type="http"
)
```

Collection management

```python
db.create_collection(
    collection_name="documents",
    dimension=768,
    distance_metric="cosine"
)
```

Metadata filtering

```python
results = db.query(
    vector=embedding,
    filters={"source": "documentation", "year": {"$gte": 2023}}
)
```

Milvus

Architecture

Milvus is designed for large-scale deployments with partition-key based multi-tenancy. Key features:
  • Partition-key multi-tenancy (schema-level partitioning)
  • Scalable infrastructure (horizontal scaling)
  • Multiple index types (IVF, HNSW, DiskANN)
  • Dynamic schema fields
  • GPU acceleration support
File: src/vectordb/databases/milvus.py

Connection

```python
from vectordb.databases.milvus import MilvusVectorDB

db = MilvusVectorDB(
    host="localhost",
    port=19530
)
```

Collection with partition key

```python
db.create_collection(
    collection_name="documents",
    dimension=768,
    partition_key_field="tenant_id"  # Enable multi-tenancy
)
```

Multi-tenancy

Milvus uses partition keys for tenant isolation:

```python
# Documents with partition key metadata
documents = [
    {"id": "1", "vector": [...], "tenant_id": "tenant_a"},
    {"id": "2", "vector": [...], "tenant_id": "tenant_b"}
]

db.upsert(documents)

# Query filters by partition key automatically
results = db.query(
    vector=embedding,
    filters={"tenant_id": "tenant_a"}
)
```
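Partition keys work by hashing the key value onto a fixed number of physical partitions, so all of a tenant's rows are co-located. A sketch of the idea (Milvus uses its own internal hash function; `partition_for` and the partition count here are illustrative):

```python
import hashlib

def partition_for(tenant_id: str, num_partitions: int = 16) -> int:
    """Deterministically map a partition-key value to a physical partition."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# Every row for a tenant lands in the same partition, so a query
# filtered on tenant_id only has to scan that one partition.
p_a = partition_for("tenant_a")
p_b = partition_for("tenant_b")
```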

Qdrant

Architecture

Qdrant supports named vectors and advanced payload filtering. Key features:
  • Named vectors (multiple embeddings per document)
  • Quantization for memory optimization
  • Rich payload filtering (nested JSON support)
  • Collection aliases
  • Snapshot support
File: src/vectordb/databases/qdrant.py

Connection

```python
from vectordb.databases.qdrant import QdrantVectorDB

db = QdrantVectorDB(
    host="localhost",
    port=6333
)
```

Collection with named vectors

```python
db.create_collection(
    collection_name="documents",
    vectors_config={
        "dense": {"size": 768, "distance": "Cosine"},
        "sparse": {"size": 30000, "distance": "Cosine"}
    }
)
```
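Quantization (listed in the feature set above) trades a small amount of precision for a large memory saving. A minimal scalar-quantization sketch of the principle, not Qdrant's implementation:

```python
def quantize(vector: list) -> tuple:
    """Map float components onto int8 codes [-128, 127] with a scale/offset."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # avoid zero scale for constant vectors
    codes = [round((x - lo) / scale) - 128 for x in vector]
    return codes, scale, lo

def dequantize(codes: list, scale: float, lo: float) -> list:
    """Recover an approximation of the original float vector."""
    return [(c + 128) * scale + lo for c in codes]

vec = [0.12, -0.53, 0.98, 0.0]
codes, scale, lo = quantize(vec)
approx = dequantize(codes, scale, lo)
# int8 codes need 4x less memory than float32, at a bounded precision cost
```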

Payload filtering

Qdrant supports nested JSON payloads:

```python
results = db.query(
    vector=embedding,
    filters={
        "must": [
            {"key": "category", "match": {"value": "tech"}},
            {"key": "metadata.author", "match": {"value": "John"}}
        ]
    }
)
```

Feature matrix

| Feature | Pinecone | Weaviate | Chroma | Milvus | Qdrant |
|---|---|---|---|---|---|
| Dense vectors | ✓ | ✓ | ✓ | ✓ | ✓ |
| Sparse vectors | ✓ | - | - | ✓ | ✓ |
| Hybrid search | ✓ | ✓ (BM25) | - | ✓ | ✓ |
| Metadata filtering | ✓ | ✓ | ✓ | ✓ | ✓ |
| Multi-tenancy | Namespaces | Tenants | Metadata | Partition keys | Collections |
| Generative search | - | ✓ | - | - | - |
| Reranking | - | ✓ | - | - | - |
| Named vectors | - | - | - | - | ✓ |
| Quantization | - | - | - | ✓ | ✓ |
| Serverless | ✓ | ✓ | - | ✓ | - |

Choosing a backend

Choose Pinecone when:
  • You want a fully managed serverless solution
  • You need namespace-based multi-tenancy
  • You prefer minimal operational overhead
  • You need auto-scaling for variable workloads

Choose Weaviate when:
  • You need native hybrid search (vector + BM25)
  • You want built-in generative AI capabilities
  • You need flexible schema with strong typing
  • You require tenant-based data isolation

Choose Chroma when:
  • You’re in the development/prototyping phase
  • You need local testing without external dependencies
  • You want simple setup with minimal configuration
  • You have lightweight indexing requirements

Choose Milvus when:
  • You need partition-key based multi-tenancy
  • You have large-scale infrastructure requirements
  • You want fine-grained index type control (IVF, HNSW)
  • You need GPU acceleration for index building and search

Choose Qdrant when:
  • You need named vectors (multiple embeddings per document)
  • You want quantization for memory optimization
  • You require rich nested payload filtering
  • You need snapshot/backup capabilities

Common patterns

All wrappers implement a consistent interface for common operations:

Index/collection creation

```python
db.create_index(dimension=768, metric="cosine")  # Pinecone
db.create_collection(collection_name="docs")     # Weaviate, Chroma, Milvus, Qdrant
```

Document upsert

```python
db.upsert(documents, namespace="tenant_1")  # All backends
```

Query

```python
results = db.query(
    vector=embedding,
    top_k=10,
    filters={"category": "tech"}
)
```

Hybrid search

```python
results = db.hybrid_search(
    query_embedding=dense_vector,
    query_sparse_embedding=sparse_vector,
    top_k=10
)
```
This consistency allows you to swap backends with minimal code changes, making it easy to benchmark different databases against your workload.
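One way to exploit that consistency is a toy in-memory backend for unit tests, exposing the same interface before a real backend is swapped in (a hypothetical sketch; `InMemoryVectorDB` is not part of the library):

```python
import math

class InMemoryVectorDB:
    """Toy backend with the shared create/upsert/query interface."""

    def __init__(self):
        self._docs = {}

    def create_collection(self, collection_name: str, dimension: int = 768) -> None:
        self._name, self._dimension = collection_name, dimension

    def upsert(self, documents: list) -> None:
        for doc in documents:
            self._docs[doc["id"]] = doc

    def query(self, vector: list, top_k: int = 10, filters: dict = None) -> list:
        def cosine(a, b):
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return sum(x * y for x, y in zip(a, b)) / norm

        # Apply equality filters, then rank by cosine similarity.
        candidates = [
            d for d in self._docs.values()
            if not filters or all(d.get(k) == v for k, v in filters.items())
        ]
        return sorted(candidates, key=lambda d: cosine(vector, d["vector"]),
                      reverse=True)[:top_k]

db = InMemoryVectorDB()
db.create_collection("docs", dimension=2)
db.upsert([
    {"id": "1", "vector": [1.0, 0.0], "category": "tech"},
    {"id": "2", "vector": [0.0, 1.0], "category": "news"},
])
results = db.query(vector=[0.9, 0.1], top_k=1, filters={"category": "tech"})
# results[0]["id"] == "1"
```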
