VectorDB provides unified wrappers for five production-grade vector databases, each optimized for different deployment scenarios and feature requirements.

Supported backends

Pinecone

Managed cloud with namespace-based multi-tenancy and serverless scale

Weaviate

Cloud or self-hosted with hybrid search and native BM25

Chroma

Local or HTTP for development and lightweight prototyping

Milvus

Self-hosted or managed with partition-key multi-tenancy

Qdrant

Self-hosted or cloud with named vectors and quantization

Backend comparison

| Backend | Client Type | Deployment | Best For |
|---|---|---|---|
| Pinecone | Managed cloud (gRPC) | Serverless / pod-based | Namespace multi-tenancy, auto-scaling, minimal ops |
| Weaviate | Cloud or self-hosted | Managed / Docker / Kubernetes | Hybrid + generative search, native BM25, flexible schema |
| Chroma | Local or HTTP | In-process / Docker | Development, prototyping, local testing |
| Milvus / Zilliz | Self-hosted or managed | Docker / Kubernetes / Cloud | Partition-key multi-tenancy, large-scale infrastructure |
| Qdrant | Self-hosted or cloud | Docker / Kubernetes / Cloud | Named vectors, quantization, advanced payload filtering |

Pinecone

Architecture

Pinecone uses lazy client initialization with gRPC transport for production-grade performance. Key features:
  • Serverless and pod-based deployment models
  • Namespace-based multi-tenancy (lightweight, per-query scoping)
  • Automatic metadata flattening for nested dictionaries
  • Batch processing with progress tracking
  • Support for dense and sparse (hybrid) vectors
File: src/vectordb/databases/pinecone.py
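The batch-processing behavior can be illustrated with a simple chunking helper (a hypothetical sketch, not the wrapper's internal code):

```python
from typing import Iterator, List, TypeVar

T = TypeVar("T")

def batched(items: List[T], batch_size: int = 100) -> Iterator[List[T]]:
    """Yield fixed-size chunks so a large upsert can be sent batch by batch."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# 250 documents go out as batches of 100, 100, and 50,
# which is also where progress tracking hooks in.
documents = [{"id": str(i)} for i in range(250)]
sizes = [len(batch) for batch in batched(documents)]
```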

Connection

```python
from vectordb.databases.pinecone import PineconeVectorDB

db = PineconeVectorDB(
    api_key="pc-xxx",
    index_name="docs"
)
```

Index creation

```python
db.create_index(
    dimension=768,
    metric="cosine",
    spec={"serverless": {"cloud": "aws", "region": "us-east-1"}}
)
```

Multi-tenancy

Pinecone uses namespaces for logical data isolation:

```python
# Upsert to tenant namespace
db.upsert(documents, namespace="tenant_1")

# Query within namespace
results = db.query(vector=embedding, namespace="tenant_1")
```

Metadata constraints

Pinecone requires scalar metadata values (str, int, float, bool) or lists of strings. Nested dictionaries are automatically flattened:

```python
# Input metadata
{"user": {"id": 123, "name": "John"}}

# Flattened for Pinecone
{"user_id": 123, "user_name": "John"}
```
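A flattening step like the one shown can be sketched as follows (illustrative helper; `flatten_metadata` is not part of the wrapper's public API):

```python
def flatten_metadata(metadata: dict, separator: str = "_") -> dict:
    """Recursively flatten nested dicts into the scalar keys Pinecone accepts."""
    flat = {}
    for key, value in metadata.items():
        if isinstance(value, dict):
            for sub_key, sub_value in flatten_metadata(value, separator).items():
                flat[f"{key}{separator}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat

result = flatten_metadata({"user": {"id": 123, "name": "John"}})
# {'user_id': 123, 'user_name': 'John'}
```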

Weaviate

Architecture

Weaviate uses eager connection initialization with collection-centric design. Key features:
  • Native hybrid search (vector + BM25)
  • Generative search (RAG with OpenAI, Cohere, etc.)
  • Tenant-based multi-tenancy (full isolation)
  • Flexible schema with property types
  • Query-time reranking
File: src/vectordb/databases/weaviate.py

Connection

```python
from vectordb.databases.weaviate import WeaviateVectorDB

db = WeaviateVectorDB(
    cluster_url="https://my-cluster.weaviate.cloud",
    api_key="weaviate-api-key"
)
```

Collection creation

```python
db.create_collection(
    collection_name="Articles",
    enable_multi_tenancy=True
)
```

Multi-tenancy

Weaviate uses tenants for strong data isolation:

```python
# Create tenants
db.create_tenants(["tenant_a", "tenant_b"])

# Switch context to tenant
db.with_tenant("tenant_a").upsert(documents)

# Query within tenant context
results = db.with_tenant("tenant_a").query(vector=embedding)
```
Hybrid search

Weaviate natively supports BM25 keyword ranking alongside vector search:

```python
results = db.hybrid_search(
    query="artificial intelligence",
    vector=embedding,
    top_k=10,
    alpha=0.5  # 1.0 = vector only, 0.0 = BM25 only
)
```
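The effect of `alpha` can be illustrated as a convex combination of normalized scores (a simplified model of the blending; Weaviate's actual fusion algorithms normalize and merge result sets internally):

```python
def fuse_scores(vector_score: float, bm25_score: float, alpha: float) -> float:
    """Blend two normalized relevance scores: alpha=1.0 keeps only the
    vector score, alpha=0.0 keeps only the BM25 score."""
    return alpha * vector_score + (1 - alpha) * bm25_score

# A document that matches keywords strongly but embeddings weakly
# gains rank as alpha decreases.
balanced = fuse_scores(vector_score=0.2, bm25_score=0.9, alpha=0.5)
```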

Metadata filtering

Weaviate supports MongoDB-style filters:

```python
results = db.query(
    vector=embedding,
    filters={
        "category": "tech",
        "published": {"$gte": "2024-01-01"}
    }
)
```
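The filter semantics can be modeled with a small matcher (a hypothetical sketch of the operator logic, not the wrapper's implementation): plain values mean equality, and nested dicts hold comparison operators.

```python
def matches(metadata: dict, filters: dict) -> bool:
    """Evaluate MongoDB-style filters against a metadata dict."""
    ops = {
        "$gt": lambda a, b: a > b, "$gte": lambda a, b: a >= b,
        "$lt": lambda a, b: a < b, "$lte": lambda a, b: a <= b,
    }
    for field, condition in filters.items():
        value = metadata.get(field)
        if isinstance(condition, dict):
            # Every operator in the condition must hold.
            if not all(ops[op](value, bound) for op, bound in condition.items()):
                return False
        elif value != condition:
            return False
    return True

doc = {"category": "tech", "published": "2024-03-01"}
ok = matches(doc, {"category": "tech", "published": {"$gte": "2024-01-01"}})  # True
```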

Chroma

Architecture

Chroma supports both in-process (ephemeral/persistent) and HTTP client modes. Key features:
  • Zero-setup local development
  • In-memory or persistent storage
  • HTTP server for remote access
  • Native metadata filtering
  • Embedding function integration
File: src/vectordb/databases/chroma.py

Connection

```python
from vectordb.databases.chroma import ChromaVectorDB

# In-process persistent
db = ChromaVectorDB(persist_directory="./chroma_db")

# HTTP client
db = ChromaVectorDB(
    host="localhost",
    port=8000,
    client_type="http"
)
```

Collection management

```python
db.create_collection(
    collection_name="documents",
    dimension=768,
    distance_metric="cosine"
)
```

Metadata filtering

```python
results = db.query(
    vector=embedding,
    filters={"source": "documentation", "year": {"$gte": 2023}}
)
```

Milvus

Architecture

Milvus is designed for large-scale deployments with partition-key based multi-tenancy. Key features:
  • Partition-key multi-tenancy (schema-level partitioning)
  • Scalable infrastructure (horizontal scaling)
  • Multiple index types (IVF, HNSW, DiskANN)
  • Dynamic schema fields
  • GPU acceleration support
File: src/vectordb/databases/milvus.py

Connection

```python
from vectordb.databases.milvus import MilvusVectorDB

db = MilvusVectorDB(
    host="localhost",
    port=19530
)
```

Collection with partition key

```python
db.create_collection(
    collection_name="documents",
    dimension=768,
    partition_key_field="tenant_id"  # Enable multi-tenancy
)
```

Multi-tenancy

Milvus uses partition keys for tenant isolation:

```python
# Documents with partition key metadata
documents = [
    {"id": "1", "vector": [...], "tenant_id": "tenant_a"},
    {"id": "2", "vector": [...], "tenant_id": "tenant_b"}
]

db.upsert(documents)

# Query filters by partition key automatically
results = db.query(
    vector=embedding,
    filters={"tenant_id": "tenant_a"}
)
```
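Partition keys work by hashing the key value onto a fixed number of physical partitions, so all of a tenant's rows are co-located. A sketch of the idea (Milvus uses its own internal hash function; `partition_for` and the partition count here are illustrative):

```python
import hashlib

def partition_for(tenant_id: str, num_partitions: int = 16) -> int:
    """Deterministically map a partition-key value to a physical partition."""
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

# Every row for a tenant lands in the same partition, so a query
# filtered on tenant_id only has to scan that one partition.
p_a = partition_for("tenant_a")
p_b = partition_for("tenant_b")
```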

Qdrant

Architecture

Qdrant supports named vectors and advanced payload filtering. Key features:
  • Named vectors (multiple embeddings per document)
  • Quantization for memory optimization
  • Rich payload filtering (nested JSON support)
  • Collection aliases
  • Snapshot support
File: src/vectordb/databases/qdrant.py

Connection

```python
from vectordb.databases.qdrant import QdrantVectorDB

db = QdrantVectorDB(
    host="localhost",
    port=6333
)
```

Collection with named vectors

```python
db.create_collection(
    collection_name="documents",
    vectors_config={
        "dense": {"size": 768, "distance": "Cosine"},
        "sparse": {"size": 30000, "distance": "Cosine"}
    }
)
```
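Quantization (listed in the feature set above) trades a small amount of precision for a large memory saving. A minimal scalar-quantization sketch of the principle, not Qdrant's implementation:

```python
def quantize(vector: list) -> tuple:
    """Map float components onto int8 codes [-128, 127] with a scale/offset."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # avoid zero scale for constant vectors
    codes = [round((x - lo) / scale) - 128 for x in vector]
    return codes, scale, lo

def dequantize(codes: list, scale: float, lo: float) -> list:
    """Recover an approximation of the original float vector."""
    return [(c + 128) * scale + lo for c in codes]

vec = [0.12, -0.53, 0.98, 0.0]
codes, scale, lo = quantize(vec)
approx = dequantize(codes, scale, lo)
# int8 codes need 4x less memory than float32, at a bounded precision cost
```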

Payload filtering

Qdrant supports nested JSON payloads:

```python
results = db.query(
    vector=embedding,
    filters={
        "must": [
            {"key": "category", "match": {"value": "tech"}},
            {"key": "metadata.author", "match": {"value": "John"}}
        ]
    }
)
```

Feature matrix

| Feature | Pinecone | Weaviate | Chroma | Milvus | Qdrant |
|---|---|---|---|---|---|
| Dense vectors | ✓ | ✓ | ✓ | ✓ | ✓ |
| Sparse vectors | ✓ | - | - | ✓ | ✓ |
| Hybrid search | ✓ | ✓ (BM25) | - | ✓ | ✓ |
| Metadata filtering | ✓ | ✓ | ✓ | ✓ | ✓ |
| Multi-tenancy | Namespaces | Tenants | Metadata | Partition keys | Collections |
| Generative search | - | ✓ | - | - | - |
| Reranking | - | ✓ | - | - | - |
| Named vectors | - | - | - | - | ✓ |
| Quantization | - | - | - | ✓ | ✓ |
| Serverless | ✓ | ✓ | - | ✓ | - |

Choosing a backend

Choose Pinecone when:
  • You want a fully managed serverless solution
  • You need namespace-based multi-tenancy
  • You prefer minimal operational overhead
  • You need auto-scaling for variable workloads

Choose Weaviate when:
  • You need native hybrid search (vector + BM25)
  • You want built-in generative AI capabilities
  • You need flexible schema with strong typing
  • You require tenant-based data isolation

Choose Chroma when:
  • You’re in the development/prototyping phase
  • You need local testing without external dependencies
  • You want simple setup with minimal configuration
  • You have lightweight indexing requirements

Choose Milvus when:
  • You need partition-key based multi-tenancy
  • You have large-scale infrastructure requirements
  • You want fine-grained index type control (IVF, HNSW)
  • You need GPU acceleration for index building and search

Choose Qdrant when:
  • You need named vectors (multiple embeddings per document)
  • You want quantization for memory optimization
  • You require rich nested payload filtering
  • You need snapshot/backup capabilities

Common patterns

All wrappers implement a consistent interface for common operations:

Index/collection creation

```python
db.create_index(dimension=768, metric="cosine")  # Pinecone
db.create_collection(collection_name="docs")     # Weaviate, Chroma, Milvus, Qdrant
```

Document upsert

```python
db.upsert(documents, namespace="tenant_1")  # All backends
```

Query

```python
results = db.query(
    vector=embedding,
    top_k=10,
    filters={"category": "tech"}
)
```

Hybrid search

```python
results = db.hybrid_search(
    query_embedding=dense_vector,
    query_sparse_embedding=sparse_vector,
    top_k=10
)
```
This consistency allows you to swap backends with minimal code changes, making it easy to benchmark different databases against your workload.
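One way to exploit that consistency is a toy in-memory backend for unit tests, exposing the same interface before a real backend is swapped in (a hypothetical sketch; `InMemoryVectorDB` is not part of the library):

```python
import math

class InMemoryVectorDB:
    """Toy backend with the shared create/upsert/query interface."""

    def __init__(self):
        self._docs = {}

    def create_collection(self, collection_name: str, dimension: int = 768) -> None:
        self._name, self._dimension = collection_name, dimension

    def upsert(self, documents: list) -> None:
        for doc in documents:
            self._docs[doc["id"]] = doc

    def query(self, vector: list, top_k: int = 10, filters: dict = None) -> list:
        def cosine(a, b):
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return sum(x * y for x, y in zip(a, b)) / norm

        # Apply equality filters, then rank by cosine similarity.
        candidates = [
            d for d in self._docs.values()
            if not filters or all(d.get(k) == v for k, v in filters.items())
        ]
        return sorted(candidates, key=lambda d: cosine(vector, d["vector"]),
                      reverse=True)[:top_k]

db = InMemoryVectorDB()
db.create_collection("docs", dimension=2)
db.upsert([
    {"id": "1", "vector": [1.0, 0.0], "category": "tech"},
    {"id": "2", "vector": [0.0, 1.0], "category": "news"},
])
results = db.query(vector=[0.9, 0.1], top_k=1, filters={"category": "tech"})
# results[0]["id"] == "1"
```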
