Vector Databases

Vector databases are specialized for similarity search, RAG (Retrieval-Augmented Generation) pipelines, and AI-powered applications.

Available Services

Qdrant

Port: 6333 (REST), 6334 (gRPC) | Memory: 512 MB | Maturity: Stable

High-performance vector similarity search engine for building RAG pipelines, semantic search, and AI-powered recommendation systems.

Features:
  • Fast vector search
  • Filtering and payload
  • HNSW algorithm
  • Quantization support
  • Distributed mode
  • Rust-based performance
OpenClaw Integration:
  • Skill: qdrant-memory
  • Environment: QDRANT_HOST, QDRANT_PORT
Recommends: Redis

Documentation

ChromaDB

Port: 8100 | Memory: 512 MB | Maturity: Stable

Open-source AI-native vector database with simple APIs for storing, searching, and filtering vectors.

Features:
  • Easy-to-use API
  • Multiple embedding models
  • Metadata filtering
  • Auto-embedding
  • Python and JavaScript clients
  • Lightweight
OpenClaw Integration:
  • Environment: CHROMADB_HOST, CHROMADB_PORT
Documentation

Milvus

Port: 19530 (API), 9091 (Metrics) | Memory: 2048 MB | Maturity: Stable

Open-source vector database built for scalable similarity search and AI applications.

Features:
  • Billion-scale vectors
  • Hybrid search
  • Multiple index types
  • GPU acceleration
  • Kubernetes-ready
  • Cloud-native architecture
OpenClaw Integration:
  • Environment: MILVUS_URI
Documentation

Weaviate

Port: 8082 (REST), 50051 (gRPC) | Memory: 1024 MB | Maturity: Stable

Cloud-native vector database with built-in vectorization modules, hybrid search, and GraphQL API.

Features:
  • GraphQL API
  • Built-in vectorizers
  • Hybrid search (vector + keyword)
  • Multi-tenancy
  • Replication
  • Schema-based
OpenClaw Integration:
  • Environment: WEAVIATE_HOST, WEAVIATE_PORT
Documentation

Usage Examples

RAG Pipeline Stack

npx create-better-openclaw \
  --services qdrant,ollama,open-webui \
  --yes

Research Agent Preset

npx create-better-openclaw --preset researcher --yes
This includes: Qdrant, SearXNG, Browserless, Redis

Knowledge Base Stack

npx create-better-openclaw \
  --services qdrant,postgresql,meilisearch \
  --yes

Vector Database Comparison

| Database | Performance | Scalability | API Style | Hybrid Search | Memory |
| --- | --- | --- | --- | --- | --- |
| Qdrant | Excellent | Good | REST/gRPC | — | 512 MB |
| ChromaDB | Good | Moderate | REST | — | 512 MB |
| Milvus | Excellent | Excellent | REST/gRPC | Yes | 2048 MB |
| Weaviate | Excellent | Excellent | GraphQL/REST | Yes | 1024 MB |

RAG Architecture Patterns

Basic RAG

1. Document → Embedding Model → Vector DB
2. Query → Embedding Model → Vector Search
3. Retrieved Context + Query → LLM → Response
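The three steps above can be sketched end to end in a few lines. This is a toy illustration only: the bag-of-words "embedding model" and the in-memory list are stand-ins for a real embedding model and a real vector database, and all names here are hypothetical.

```python
import math

# Toy corpus and query for the sketch
DOCS = ["Qdrant is a vector database", "Redis is a key-value store"]
QUERY = "vector database"

# Fixed vocabulary shared by documents and queries
VOCAB = sorted({w for text in DOCS + [QUERY] for w in text.lower().split()})

def embed(text: str) -> list[float]:
    # Stand-in embedding model: normalized word-count vector over VOCAB
    words = text.lower().split()
    vec = [float(words.count(w)) for w in VOCAB]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already normalized, so the dot product is the cosine
    return sum(x * y for x, y in zip(a, b))

# Step 1: documents -> embedding model -> "vector DB" (a plain list here)
index = [(embed(d), d) for d in DOCS]

# Step 2: query -> embedding model -> vector search
context = max(index, key=lambda pair: cosine(pair[0], embed(QUERY)))[1]

# Step 3: retrieved context + query -> LLM prompt
prompt = f"Context: {context}\n\nQuestion: {QUERY}"
print(context)
```

In a real pipeline, the list comprehension in step 1 becomes an `upsert` into one of the databases above, and step 3 sends `prompt` to an LLM.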

Advanced RAG with Reranking

1. Document → Chunking → Embedding → Vector DB
2. Query → Vector Search (top 50)
3. Reranking (top 5)
4. Context + Query → LLM → Response
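The two-stage flow above (over-fetch with cheap vector search, then rerank a small subset) can be sketched as follows. Both scoring functions are illustrative stand-ins: `vector_search` mimics a recall-oriented first stage, and `rerank_score` stands in for a real cross-encoder reranker.

```python
def vector_search(query: str, corpus: list[str], top_k: int) -> list[str]:
    # First stage: crude recall-oriented scoring by shared-word count
    qwords = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda d: len(qwords & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def rerank_score(query: str, doc: str) -> float:
    # Second stage stand-in: query-word coverage, slightly favoring shorter docs
    qwords = set(query.lower().split())
    coverage = len(qwords & set(doc.lower().split())) / len(qwords)
    return coverage / (1 + len(doc.split()) / 100)

query = "vector database for RAG"
corpus = [f"doc {i}" for i in range(100)] + [
    "a vector database stores embeddings for RAG",
    "RAG pipelines retrieve context from a vector database",
]

candidates = vector_search(query, corpus, top_k=50)                          # step 2: top 50
top = sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)[:5]  # step 3: top 5
print(top[0])
```

The point of the pattern: the first stage is fast but coarse, so it fetches far more candidates than needed; the expensive reranker then only scores 50 documents instead of the whole corpus.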

Multi-Modal RAG

1. Text + Images → Embeddings → Vector DB
2. Query → Multi-modal Search
3. Retrieved Content → Multi-modal LLM → Response


Embedding Models

| Model | Dimensions | Use Case | Provider |
| --- | --- | --- | --- |
| text-embedding-3-small | 1536 | General purpose | OpenAI |
| text-embedding-3-large | 3072 | High accuracy | OpenAI |
| all-MiniLM-L6-v2 | 384 | Fast, local | Sentence Transformers |
| BAAI/bge-large-en | 1024 | English text | Open source |
| intfloat/e5-large | 1024 | Multilingual | Open source |
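Two practical consequences of the table above: query and document vectors must have the dimensionality the collection was created with, and all of the databases here default to (or support) cosine similarity. A minimal sketch of both:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Dimensions must match what the collection was created with
    # (e.g. 384 for all-MiniLM-L6-v2, 1536 for text-embedding-3-small).
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```

Mixing models with different dimensions (say, re-embedding a collection built for 384-dim vectors with a 1536-dim model) fails at insert time, so pick the model before creating the collection.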

Local Embedding with Ollama

# Pull embedding model
docker exec ollama ollama pull mxbai-embed-large

# Use in your application
curl http://localhost:11434/api/embeddings \
  -d '{"model": "mxbai-embed-large", "prompt": "Your text here"}'

Collection Management

Qdrant Collections

from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

# Create collection (vector size must match your embedding model, e.g. 384)
client.create_collection(
    collection_name="documents",
    vectors_config=models.VectorParams(size=384, distance=models.Distance.COSINE),
)

# Insert vectors (embedding is a 384-dimensional list of floats)
client.upsert(
    collection_name="documents",
    points=[
        models.PointStruct(
            id=1,
            vector=embedding,
            payload={"text": "Document content"},
        )
    ],
)

# Search for the 5 nearest neighbors of a query embedding
results = client.search(
    collection_name="documents",
    query_vector=query_embedding,
    limit=5,
)

ChromaDB Collections

import chromadb

client = chromadb.HttpClient(host="localhost", port=8100)

# Create collection
collection = client.create_collection(name="documents")

# Add documents (auto-embedding)
collection.add(
    documents=["Document 1", "Document 2"],
    ids=["id1", "id2"],
    metadatas=[{"source": "web"}, {"source": "pdf"}]
)

# Query
results = collection.query(
    query_texts=["search query"],
    n_results=5
)

Optimization Tips

Qdrant Optimization

  1. Index Type: Use HNSW for speed, quantization for memory
  2. Payload: Store minimal metadata for better performance
  3. Filtering: Use indexed payload fields for fast filtering
  4. Batch Operations: Insert vectors in batches
  5. Memory: Allocate sufficient RAM for index
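Tip 4 (batch operations) applies to every database on this page: one upsert of 128 points is far cheaper than 128 upserts of one point. A small generic helper for splitting any iterable into batches, with hypothetical stand-in data where real `PointStruct` objects would go:

```python
from itertools import islice

def batched(items, size):
    # Yield successive lists of at most `size` items from any iterable
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk

# e.g. upsert 1000 points 128 at a time instead of one call per point
points = list(range(1000))              # stand-ins for real point objects
batches = list(batched(points, 128))
print(len(batches), len(batches[-1]))   # 8 batches; the last holds 104 points
```

Each batch would then be passed as the `points` argument of a single `client.upsert` call.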

ChromaDB Optimization

  1. Embedding Function: Choose appropriate embedding model
  2. Distance Metric: Use cosine similarity for most cases
  3. Persistence: Enable persistence for production
  4. Batch Size: Process documents in batches
  5. Metadata: Keep metadata small and indexed

Milvus Optimization

  1. Index Selection: Choose IVF_FLAT, IVF_SQ8, or HNSW
  2. Segmentation: Configure segment size appropriately
  3. Resource Groups: Allocate resources per workload
  4. GPU Acceleration: Use GPU for large-scale search
  5. Sharding: Distribute data across shards

Use Cases

Semantic Search

# Search documents by meaning, not keywords
# Example: "python web framework" finds Flask, Django, FastAPI

Question Answering

# Retrieve relevant context from knowledge base
# Pass context to LLM for accurate answers

Recommendation Systems

# Find similar products, articles, or content
# Based on embedding similarity
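A recommendation is just a nearest-neighbor query: embed the item a user is viewing, then return the catalog items whose embeddings are closest. A minimal sketch with made-up 3-dimensional embeddings (real ones have hundreds of dimensions):

```python
import math

def top_k_similar(target: list[float], catalog: dict, k: int = 3) -> list[str]:
    # Rank catalog items by cosine similarity to the target embedding
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    ranked = sorted(catalog.items(), key=lambda kv: cos(kv[1], target), reverse=True)
    return [name for name, _ in ranked[:k]]

catalog = {
    "wireless mouse":      [0.9, 0.1, 0.0],
    "mechanical keyboard": [0.8, 0.3, 0.1],
    "garden hose":         [0.0, 0.1, 0.9],
}

# Recommend items similar to what the user is viewing
print(top_k_similar([1.0, 0.2, 0.0], catalog, k=2))
```

In production, the brute-force `sorted` over the whole catalog is exactly what the HNSW and IVF indexes above replace with an approximate but sub-linear search.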

Document Chat

# Upload documents → Chunk → Embed → Store
# Chat interface retrieves relevant chunks for LLM

Image Search

# Store image embeddings (CLIP, etc.)
# Search by text or image similarity
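The document-chat flow above chunks documents before embedding them. A minimal character-window chunker with overlap (a hypothetical sketch; real pipelines usually split on sentence or token boundaries instead):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    # Split text into overlapping windows of `size` characters.
    # The overlap preserves context that a hard cut at a chunk
    # boundary would otherwise lose.
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks

doc = "x" * 500
chunks = chunk_text(doc, size=200, overlap=50)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk is then embedded and stored as its own point, so retrieval returns passages small enough to fit several into the LLM's context window.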

Integration Examples

Qdrant + Ollama + Open WebUI

npx create-better-openclaw \
  --services qdrant,ollama,open-webui,redis \
  --yes

ChromaDB + Dify

npx create-better-openclaw \
  --services chromadb,dify,postgresql,redis \
  --yes

Milvus + LiteLLM + Flowise

npx create-better-openclaw \
  --services milvus,litellm,flowise \
  --yes

Monitoring and Maintenance

Health Checks

# Qdrant
curl http://localhost:6333/healthz

# ChromaDB
curl http://localhost:8100/api/v1/heartbeat

# Milvus
curl http://localhost:9091/healthz

# Weaviate
curl http://localhost:8082/v1/.well-known/ready

Metrics

# Qdrant metrics
curl http://localhost:6333/metrics

# Milvus metrics (Prometheus format)
curl http://localhost:9091/metrics

Backups

# Qdrant snapshot
curl -X POST http://localhost:6333/collections/documents/snapshots

# Copy data volumes
docker cp qdrant:/qdrant/storage ./qdrant-backup

Performance Benchmarks

Query Latency (approximate)

| Database | 1K vectors | 100K vectors | 1M vectors |
| --- | --- | --- | --- |
| Qdrant | <1 ms | 1-5 ms | 5-20 ms |
| ChromaDB | <1 ms | 5-10 ms | 20-50 ms |
| Milvus | <1 ms | 1-5 ms | 5-15 ms |
| Weaviate | <1 ms | 5-10 ms | 10-30 ms |

Throughput (queries/sec)

| Database | Single Node | Distributed |
| --- | --- | --- |
| Qdrant | 1000+ | 10000+ |
| ChromaDB | 500+ | N/A |
| Milvus | 2000+ | 20000+ |
| Weaviate | 1000+ | 10000+ |

Note: Performance varies based on vector dimensions, index type, and hardware.
