
Overview

Benchmarking helps you understand CockroachDB's performance characteristics, plan capacity, and catch regressions. This guide covers the built-in workload tool and benchmarking best practices.

Workload Tool

CockroachDB includes a workload CLI tool with several standard benchmark scenarios. The tool is located at pkg/workload in the source tree.

Installation

# Build the workload binary
make bin/workload

# Or download a pre-built binary (URL pattern may vary by release)
wget https://binaries.cockroachdb.com/cockroach-workload-latest.linux-amd64
chmod +x cockroach-workload-latest.linux-amd64
mv cockroach-workload-latest.linux-amd64 /usr/local/bin/workload

Built-in Workloads

Bank Workload

Simulates a simple banking application with account transfers.
# Create schema and load data
workload init bank \
  --rows=10000 \
  --payload-bytes=100 \
  --ranges=10 \
  'postgresql://root@localhost:26257?sslmode=disable'
Key parameters:
  • --rows: Number of accounts
  • --payload-bytes: Size of payload column
  • --ranges: Number of initial range splits
  • --concurrency: Number of concurrent workers
  • --max-rate: Maximum operations per second (0 = unlimited)
Schema:
CREATE TABLE bank (
  id INT PRIMARY KEY,
  balance INT,
  payload STRING,
  FAMILY (id, balance, payload)
);
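After initialization, start the workload with workload run. The flags below mirror the parameters listed above; the values are illustrative, not recommendations:

```shell
# Run the bank workload for 10 minutes with 64 concurrent workers
workload run bank \
  --duration=10m \
  --concurrency=64 \
  --max-rate=0 \
  'postgresql://root@localhost:26257?sslmode=disable'
```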

KV Workload

Generic key-value operations for read/write performance testing.
workload init kv \
  --splits=10 \
  --cycle-length=10000 \
  'postgresql://root@localhost:26257?sslmode=disable'

workload run kv \
  --duration=10m \
  --concurrency=256 \
  --read-percent=95 \
  --batch=100 \
  'postgresql://root@localhost:26257?sslmode=disable'
Key parameters:
  • --read-percent: Percentage of read operations (0-100)
  • --batch: Number of operations per SQL statement
  • --zipfian: Use zipfian distribution (some keys accessed more frequently)
  • --sequential: Sequential key access pattern
  • --splits: Number of range splits
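To observe behavior under a skewed access pattern, a common cause of range hotspots, combine --zipfian with a read-heavy mix. The concrete values here are illustrative:

```shell
# Hotspot simulation: zipfian distribution concentrates traffic on a few keys
workload run kv \
  --duration=10m \
  --concurrency=128 \
  --read-percent=90 \
  --zipfian \
  'postgresql://root@localhost:26257?sslmode=disable'
```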

TPCC Workload

Industry-standard TPC-C benchmark for OLTP performance.
TPCC initialization can take significant time. For 1000 warehouses, expect 30-60 minutes.
# Initialize with 100 warehouses
workload init tpcc \
  --warehouses=100 \
  --data-loader=import \
  'postgresql://root@localhost:26257/tpcc?sslmode=disable'
Key parameters:
  • --warehouses: Number of warehouses (scale factor)
  • --active-warehouses: Warehouses actively used during run
  • --workers: Number of concurrent workers
  • --wait: Include think time between operations
  • --ramp: Ramp-up period before full load
Transaction mix:
  • New Order: 45%
  • Payment: 43%
  • Order Status: 4%
  • Delivery: 4%
  • Stock Level: 4%
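A typical run pairs the warehouse count used at init time with a ramp period and think time; flag values below are illustrative:

```shell
# Run TPC-C against the 100 initialized warehouses with a 5-minute ramp
workload run tpcc \
  --warehouses=100 \
  --ramp=5m \
  --duration=30m \
  --wait=true \
  'postgresql://root@localhost:26257/tpcc?sslmode=disable'
```

With --wait=true, results are reported as tpmC (new-order transactions per minute) against a fixed per-warehouse rate, which is how TPC-C results are normally compared.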

YCSB Workload

Yahoo! Cloud Serving Benchmark for various access patterns.
# 50% reads, 50% updates
workload init ycsb \
  --initial-rows=1000000 \
  'postgresql://root@localhost:26257?sslmode=disable'

workload run ycsb \
  --workload=A \
  --concurrency=256 \
  --duration=10m \
  'postgresql://root@localhost:26257?sslmode=disable'
Workload types:
  • A: Update heavy (50% read, 50% update)
  • B: Read mostly (95% read, 5% update)
  • C: Read only (100% read)
  • D: Read latest (95% read, 5% insert, zipfian)
  • E: Scan short ranges (95% scan, 5% insert)
  • F: Read-modify-write (50% read, 50% read-modify-write)
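Since all six workload types share one initialized dataset, you can sweep them in a single loop and keep a log per run (log filenames here are an assumption):

```shell
# Sweep YCSB workloads A-F against the same initialized dataset
for wl in A B C D E F; do
  workload run ycsb \
    --workload="$wl" \
    --concurrency=256 \
    --duration=10m \
    'postgresql://root@localhost:26257?sslmode=disable' \
    > "ycsb-${wl}.log" 2>&1
done
```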

Benchmarking Best Practices

Test Environment Setup

1. Use production-like hardware: match CPU, memory, disk, and network specifications.
2. Configure cluster settings: set replication factor, locality, and resource limits to match production.
3. Warm up the cluster: run the workload for 5-10 minutes before measuring to populate caches.
4. Run multiple iterations: execute the benchmark 3-5 times and report median/average results.
5. Monitor throughout the test: track CPU, memory, disk I/O, and network metrics.
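The iteration and warm-up steps can be combined in a small driver script; the workload flags and log paths are illustrative:

```shell
# Run 5 iterations of the same benchmark, keeping each log for comparison
for i in 1 2 3 4 5; do
  echo "Iteration $i"
  workload run kv \
    --ramp=2m \
    --duration=10m \
    --concurrency=256 \
    'postgresql://root@localhost:26257?sslmode=disable' \
    > "kv-run-${i}.log" 2>&1
done
# When analyzing, report the median ops/s across the five logs
```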

Ramp-Up Period

Always include a ramp-up period to reach steady state:
workload run kv \
  --ramp=2m \
  --duration=10m \
  --concurrency=256 \
  'postgresql://root@localhost:26257?sslmode=disable'
During ramp-up:
  • Connections are established gradually
  • Caches warm up
  • Range splits stabilize
  • Metrics are not included in final results

Concurrency and Rate Limiting

Adjust concurrency and rate limits based on your use case:
  • Latency testing: Low concurrency (1-10), measure minimum latency
  • Throughput testing: High concurrency (100-1000), measure maximum QPS
  • Realistic simulation: Match production concurrency and rate
# Latency benchmark: measure best-case latency
workload run kv \
  --concurrency=1 \
  --read-percent=100 \
  --duration=5m

# Throughput benchmark: find maximum QPS
workload run kv \
  --concurrency=512 \
  --max-rate=0 \
  --duration=10m

# Rate-limited: simulate production load
workload run kv \
  --concurrency=256 \
  --max-rate=10000 \
  --duration=30m

Metrics and Analysis

Output Format

Workload tool outputs periodic statistics:
  1.0s        0           64.0 ops/s     15.6 ms/op     14.3 ms p50     16.8 ms p95     25.2 ms p99
  2.0s        0          128.3 ops/s     15.4 ms/op     14.1 ms p50     17.2 ms p95     23.1 ms p99
  3.0s        0          127.1 ops/s     15.8 ms/op     14.5 ms p50     17.9 ms p95     24.6 ms p99
Columns:
  • Time: Elapsed time
  • Errors: Error count
  • ops/s: Operations per second (throughput)
  • ms/op: Average latency per operation
  • p50/p95/p99: Latency percentiles

Histogram Export

Export detailed latency histograms for analysis:
workload run kv \
  --duration=10m \
  --histograms=/tmp/kv-histogram.json \
  --histogram-export-format=json \
  'postgresql://root@localhost:26257?sslmode=disable'
Formats available:
  • json: HdrHistogram JSON format
  • openmetrics: Prometheus-compatible metrics

Prometheus Integration

Expose metrics for Prometheus scraping:
workload run kv \
  --prometheus-port=2112 \
  --duration=30m \
  'postgresql://root@localhost:26257?sslmode=disable'
Access metrics at http://localhost:2112/metrics.
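A quick sanity check while a run is in progress is to scrape the endpoint and list the exposed metric names (exact names depend on the workload and version):

```shell
# Verify the endpoint is up and list a sample of exposed metric names
curl -s http://localhost:2112/metrics | grep -v '^#' | cut -d' ' -f1 | sort -u | head
```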

Custom Workloads

Create custom workloads for application-specific testing.

Using pgbench

pgbench, the benchmark tool bundled with PostgreSQL, is largely compatible with CockroachDB:
# Initialize pgbench schema
pgbench -i -s 100 'postgresql://root@localhost:26257/bench?sslmode=disable'

# Run TPC-B-like benchmark
pgbench -c 50 -j 10 -T 300 'postgresql://root@localhost:26257/bench?sslmode=disable'

# Custom SQL script
pgbench -c 50 -j 10 -T 300 -f custom_queries.sql \
  'postgresql://root@localhost:26257/bench?sslmode=disable'
Parameters:
  • -i: Initialize (create schema and data)
  • -s: Scale factor (number of rows)
  • -c: Number of clients
  • -j: Number of threads
  • -T: Duration in seconds
  • -f: Custom SQL script file

Custom SQL Scripts

Example custom_queries.sql:
-- Random read
\set id random(1, 1000000)
SELECT * FROM accounts WHERE id = :id;

-- Random update
\set id random(1, 1000000)
\set amount random(-1000, 1000)
UPDATE accounts SET balance = balance + :amount WHERE id = :id;

-- Range scan
\set start_id random(1, 999000)
SELECT * FROM accounts WHERE id BETWEEN :start_id AND :start_id + 1000;

Capacity Planning

Sizing Methodology

1. Define requirements: target QPS, latency percentiles, data volume, growth rate.
2. Run a baseline benchmark: test with a representative workload on a small cluster.
3. Scale incrementally: add nodes and observe throughput scaling.
4. Find bottlenecks: identify the limiting resource (CPU, disk I/O, network, or memory).
5. Calculate capacity: project the required cluster size for the target load, with headroom.
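The final step is simple arithmetic; the numbers below are placeholders you would replace with your own requirements and measured results:

```shell
# Project cluster size: target QPS / measured per-node QPS, plus headroom
target_qps=50000        # required peak throughput (assumption)
per_node_qps=6000       # measured per-node throughput from the baseline (assumption)
headroom=1.3            # 30% headroom for node failures and growth

# Round up to a whole node count
nodes=$(awk -v t="$target_qps" -v p="$per_node_qps" -v h="$headroom" \
  'BEGIN { n = t / p * h; if (n > int(n)) n = int(n) + 1; print n }')
echo "Recommended node count: $nodes"
```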

Scaling Tests

Test horizontal scalability:
# Benchmark with 3 nodes
workload run kv --duration=10m --concurrency=256 \
  'postgresql://root@node1:26257,node2:26257,node3:26257?sslmode=disable'

# Add nodes and re-test with 6 nodes
workload run kv --duration=10m --concurrency=256 \
  'postgresql://root@node1:26257,...,node6:26257?sslmode=disable'

# Compare throughput scaling

Resource Utilization

Monitor resource usage during benchmark:
-- CPU usage per node (metric name may vary by version)
SELECT name, value
FROM crdb_internal.node_metrics
WHERE name = 'sys.cpu.combined.percent-normalized';

-- Store capacity
SELECT node_id, store_id, capacity, available, used
FROM crdb_internal.kv_store_status;

-- Disk I/O: monitor with OS tools such as iostat or iotop

Regression Testing

Continuous Benchmarking

Automated benchmarks to detect performance regressions:
#!/bin/bash
# benchmark.sh - Run nightly performance tests

set -e

# Configuration
DURATION="30m"
CONCURRENCY="256"
OUTPUT_DIR="/var/benchmarks/$(date +%Y%m%d)"

mkdir -p "$OUTPUT_DIR"

# Run benchmarks
for workload in kv ycsb bank; do
  echo "Running $workload benchmark..."
  
  workload run "$workload" \
    --duration="$DURATION" \
    --concurrency="$CONCURRENCY" \
    --histograms="$OUTPUT_DIR/${workload}-histogram.json" \
    'postgresql://root@localhost:26257?sslmode=disable' \
    > "$OUTPUT_DIR/${workload}-output.log" 2>&1
done

# Compare with baseline
python compare_benchmarks.py "$OUTPUT_DIR" /var/benchmarks/baseline

Comparing Results

Performance regression indicators to track:
  • Throughput decrease: >10% drop in ops/sec
  • Latency increase: >20% increase in P99
  • Error rate increase: Any non-zero error rate
  • Resource usage spike: >30% increase in CPU/memory
  • Replication lag: >1s behind on followers
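The throughput rule above (a >10% drop flags a regression) can be sketched in a few lines of shell; the ops/s values are placeholders you would extract from the benchmark logs:

```shell
# Flag a regression if current ops/s drops more than 10% below baseline
baseline_ops=12500      # ops/s from the baseline run (assumption)
current_ops=10900       # ops/s from the current run (assumption)

regressed=$(awk -v b="$baseline_ops" -v c="$current_ops" \
  'BEGIN { if (c < b * 0.9) print 1; else print 0 }')
if [ "$regressed" -eq 1 ]; then
  echo "REGRESSION: throughput dropped more than 10%"
fi
```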

Best Practices Summary

Benchmarking checklist:
  1. ✓ Use production-equivalent hardware
  2. ✓ Include ramp-up period (1-5 minutes)
  3. ✓ Run for sufficient duration (10+ minutes)
  4. ✓ Test multiple concurrency levels
  5. ✓ Monitor all nodes during test
  6. ✓ Record cluster configuration
  7. ✓ Export histograms for detailed analysis
  8. ✓ Run multiple iterations
  9. ✓ Test read and write workloads separately
  10. ✓ Document results and environment
