Overview
Redis Cluster provides automatic sharding across multiple Redis nodes, enabling horizontal scaling and high availability without external coordination services.
┌────────────────────────────────────────────────────────────┐
│                Redis Cluster (16384 slots)                 │
│                                                            │
│  Node 1 (Master)     Node 2 (Master)     Node 3 (Master)   │
│  Slots: 0-5460       Slots: 5461-10922   Slots: 10923-16383│
│        │                   │                   │           │
│        ▼                   ▼                   ▼           │
│  Node 4 (Replica)    Node 5 (Replica)    Node 6 (Replica)  │
│  (backup for N1)     (backup for N2)     (backup for N3)   │
└────────────────────────────────────────────────────────────┘
Hash Slots
Slot Distribution
From cluster.c:29-61, Redis Cluster divides the key space into 16,384 slots:
#define CLUSTER_SLOTS 16384

int keyHashSlot(char *key, int keylen) {
    int s, e; /* start-end indexes of { and } */

    for (s = 0; s < keylen; s++)
        if (key[s] == '{') break;

    /* No '{' ? Hash the whole key. */
    if (s == keylen) return crc16(key,keylen) & 0x3FFF;

    /* '{' found. Look for '}' after it. */
    for (e = s+1; e < keylen; e++)
        if (key[e] == '}') break;

    /* Empty '{}' ? Hash the whole key. */
    if (e == keylen || e == s+1) return crc16(key,keylen) & 0x3FFF;

    /* Hash the part between { } */
    return crc16(key+s+1,e-s-1) & 0x3FFF;
}
Hash Slot Calculation:
slot = CRC16(key) mod 16384
Since 16384 = 2^14, this is the same as CRC16(key) & 0x3FFF, which is what the code above does.
16,384 slots were chosen as a balance between granularity and cluster metadata size: each node's heartbeat packet carries a bitmap of the slots it serves, and 16,384 slots fit in 2 KB. This keeps cluster bus messages small while still allowing fine-grained key distribution.
From cluster.c:36-60, hash tags ensure related keys map to the same slot:
Examples:
{user:123}:profile → hash "user:123"
{user:123}:orders → hash "user:123"
user:456:profile → hash "user:456:profile"
{user}:789:profile → hash "user"
{}user:999:profile → hash entire key (empty tag)
Use hash tags for multi-key operations like MGET, MSET, or transactions that need atomic guarantees across multiple keys.
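The slot mapping and hash-tag rules above can be sketched in Python. This is an illustrative reimplementation of keyHashSlot (names like `key_hash_slot` are my own), using the CRC16-CCITT/XMODEM variant that Redis Cluster uses:

```python
def crc16(data: bytes) -> int:
    """Bitwise CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_hash_slot(key: str) -> int:
    """Map a key to one of 16384 slots, honoring {hash tags}."""
    s = key.find("{")
    if s != -1:
        e = key.find("}", s + 1)
        if e != -1 and e != s + 1:       # non-empty tag found
            key = key[s + 1:e]           # hash only the tag
    return crc16(key.encode()) & 0x3FFF  # same as mod 16384

# Hash tags force related keys into the same slot:
assert key_hash_slot("{user:123}:profile") == key_hash_slot("{user:123}:orders")
assert key_hash_slot("{user:123}:profile") == key_hash_slot("user:123")
# An empty '{}' tag falls back to hashing the whole key:
assert key_hash_slot("{}user:999") == crc16(b"{}user:999") & 0x3FFF
```

A real cluster client would use the same function to pick the right node for each key without a server round trip.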
Cluster Architecture
Node Types
Master Nodes:
- Handle read and write operations
- Own a subset of hash slots (0-16383)
- Replicate data to replica nodes
Replica Nodes:
- Maintain copies of master’s data
- Serve read queries (if enabled)
- Promote to master on failure
Cluster Bus
Nodes communicate via a binary protocol on the cluster bus:
- Port: client port + 10000 (e.g., 6379 → 16379)
- Protocol: Binary gossip protocol
- Purpose: Node discovery, failure detection, configuration propagation
From redis.conf:276-277:
# Enable TLS on cluster bus
tls-cluster yes
Slot Assignment
Initial Setup
# Create cluster with 3 masters and 3 replicas
redis-cli --cluster create \
127.0.0.1:7000 127.0.0.1:7001 127.0.0.1:7002 \
127.0.0.1:7003 127.0.0.1:7004 127.0.0.1:7005 \
--cluster-replicas 1
Manual Assignment
# Assign slots to a node
CLUSTER ADDSLOTS 0 1 2 3 4 5 ...
# Assign a range (one command per slot)
for slot in {0..5460}; do
  redis-cli -p 7000 CLUSTER ADDSLOTS $slot
done
# Redis 7.0+: assign a contiguous range in one command
redis-cli -p 7000 CLUSTER ADDSLOTSRANGE 0 5460
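The even split produced by commands like the above can be computed with a small helper. `partition_slots` is a hypothetical utility for illustration; the exact boundaries redis-cli chooses may differ slightly:

```python
CLUSTER_SLOTS = 16384

def partition_slots(n_masters: int) -> list[tuple[int, int]]:
    """Return (first_slot, last_slot) ranges, one per master,
    distributing the remainder across the first few masters."""
    base, extra = divmod(CLUSTER_SLOTS, n_masters)
    ranges, start = [], 0
    for i in range(n_masters):
        size = base + (1 if i < extra else 0)  # spread the remainder
        ranges.append((start, start + size - 1))
        start += size
    return ranges

print(partition_slots(3))
# → [(0, 5461), (5462, 10922), (10923, 16383)]
```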
Viewing Assignments
From cluster.c:963-972:
# View cluster topology
CLUSTER SHARDS
# View nodes and slots
CLUSTER NODES
# View slots for specific range
CLUSTER SLOTS
Sharding and Resharding
How Resharding Works
- Mark slot as migrating on source node
- Mark slot as importing on target node
- Move keys one by one from source to target
- Update slot assignment in cluster configuration
- Propagate new configuration to all nodes
Resharding Commands
# Prepare source node (moving FROM)
CLUSTER SETSLOT <slot> MIGRATING <target-node-id>
# Prepare target node (moving TO)
CLUSTER SETSLOT <slot> IMPORTING <source-node-id>
# Get keys in slot
CLUSTER GETKEYSINSLOT <slot> <count>
# Move key (the destination db is always 0 in cluster mode)
MIGRATE <host> <port> <key> 0 <timeout> REPLACE
# Complete migration (issue to the target first, then the source,
# then the remaining masters)
CLUSTER SETSLOT <slot> NODE <target-node-id>
Automated Resharding
# Automatic rebalancing
redis-cli --cluster rebalance 127.0.0.1:7000 \
--cluster-threshold 2 \
--cluster-use-empty-masters
# Reshard specific slots
redis-cli --cluster reshard 127.0.0.1:7000 \
--cluster-from <source-node-id> \
--cluster-to <target-node-id> \
--cluster-slots <count>
Resharding happens online but can impact performance. While a slot is migrating, requests for keys in that slot may incur an extra round trip as the cluster redirects the client (via ASK) to the key's current location.
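The rebalancing that `--cluster rebalance` performs can be approximated with a greedy sketch: repeatedly move slots from the most-loaded to the least-loaded node until everyone is within the threshold of the even share. This is an illustrative algorithm, not redis-cli's actual implementation:

```python
def rebalance_moves(counts: dict[str, int], threshold_pct: float = 2.0):
    """Return (src, dst, n_slots) moves bringing every node within
    threshold_pct of the even share. Greedy sketch for illustration."""
    total = sum(counts.values())
    share = total / len(counts)
    tol = share * threshold_pct / 100
    counts = dict(counts)          # work on a copy
    moves = []
    while True:
        hi = max(counts, key=counts.get)
        lo = min(counts, key=counts.get)
        if counts[hi] - share <= tol and share - counts[lo] <= tol:
            break                  # balanced within threshold
        give = counts[hi] - round(share)
        take = round(share) - counts[lo]
        n = min(give, take)
        if n <= 0:
            break                  # remaining imbalance is just rounding
        counts[hi] -= n
        counts[lo] += n
        moves.append((hi, lo, n))
    return moves

# One master holding everything, two empty masters:
print(rebalance_moves({"a": 16384, "b": 0, "c": 0}))
# → [('a', 'b', 5461), ('a', 'c', 5461)]
```

Each returned move would then be carried out with the SETSLOT/GETKEYSINSLOT/MIGRATE sequence shown above.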
Client Redirection
Redirection Types
From cluster.c:1161-1184:
#define CLUSTER_REDIR_NONE 0
#define CLUSTER_REDIR_CROSS_SLOT 1 // Keys in different slots
#define CLUSTER_REDIR_UNSTABLE 2 // Slot is migrating
#define CLUSTER_REDIR_DOWN_UNBOUND 3 // Slot not assigned
#define CLUSTER_REDIR_ASK 4 // Check target during migration
#define CLUSTER_REDIR_MOVED 5 // Slot permanently moved
MOVED Redirection
When a slot is reassigned:
Client: GET mykey
Node 1: -MOVED 3999 127.0.0.1:7001
Client: [connects to 127.0.0.1:7001]
Client: GET mykey
Node 2: "value"
ASK Redirection
During slot migration:
Client: GET mykey
Node 1: -ASK 3999 127.0.0.1:7001
Client: [connects to 127.0.0.1:7001]
Client: ASKING
Client: GET mykey
Node 2: "value"
ASK is temporary during migration. MOVED is permanent. Smart clients cache MOVED responses to avoid future redirections.
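A smart client's handling of these two replies might look like the sketch below. `SlotCache` and `handle_redirect` are illustrative names, not part of any real client library:

```python
class SlotCache:
    """Minimal slot→node cache, updated from redirection errors."""

    def __init__(self):
        self.slot_to_node: dict[int, str] = {}

    def handle_redirect(self, error: str):
        """Parse a '-MOVED <slot> <addr>' or '-ASK <slot> <addr>' reply.
        Returns (addr, asking): asking=True means send ASKING first
        and do NOT cache, since the move is only in progress."""
        kind, slot, addr = error.lstrip("-").split()
        if kind == "MOVED":
            self.slot_to_node[int(slot)] = addr  # permanent: update cache
            return addr, False
        elif kind == "ASK":
            return addr, True                    # temporary: no cache update
        raise ValueError(f"not a redirection: {error}")

cache = SlotCache()
addr, asking = cache.handle_redirect("-MOVED 3999 127.0.0.1:7001")
# Future requests for slot 3999 now go straight to 127.0.0.1:7001.
```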
Multi-Key Operations
Same-Slot Requirement
From cluster.c:1133-1149:
int extractSlotFromKeysResult(robj **argv, getKeysResult *keys_result) {
    if (keys_result->numkeys == 0 || !server.cluster_enabled)
        return INVALID_CLUSTER_SLOT;

    int first_slot = INVALID_CLUSTER_SLOT;
    for (int j = 0; j < keys_result->numkeys; j++) {
        int this_slot = keyHashSlot(argv[keys_result->keys[j].pos]->ptr,
                                    sdslen(argv[keys_result->keys[j].pos]->ptr));
        if (first_slot == INVALID_CLUSTER_SLOT)
            first_slot = this_slot;
        else if (first_slot != this_slot)
            return CLUSTER_CROSSSLOT; // Error!
    }
    return first_slot;
}
Commands requiring same slot:
- MGET, MSET, DEL (with multiple keys)
- All commands in MULTI/EXEC transaction
- SUNION, SINTER, SDIFF
- Lua scripts accessing multiple keys
Solution: Use hash tags
# These keys are guaranteed same slot
MSET {user:123}:name "Alice" {user:123}:age 30 {user:123}:city "NYC"
# Transaction on same-slot keys
MULTI
INCR {counter}:page:views
INCR {counter}:page:unique
EXEC
Failure Detection and Failover
Failure Detection
Nodes detect failures through gossip protocol:
- Heartbeat: Nodes ping each other regularly
- PFAIL: Node marks unresponsive peer as “possibly failing”
- FAIL: When a majority of masters report a node as PFAIL, it is promoted to FAIL
- Propagation: FAIL state propagated via gossip
Automatic Failover
When master fails:
- Replica election: Replicas of failed master start election
- Vote collection: Other masters vote for best replica
- Promotion: Winning replica promotes itself to master
- Takeover: New master claims failed master’s slots
- Announcement: New configuration propagated
Selection criteria:
- Most recent replication offset (least data loss)
- Lexicographically smaller node ID (tie-breaker)
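The selection criteria can be expressed as a one-line ranking. The dictionary fields here (`repl_offset`, `id`) are illustrative, not Redis internals:

```python
def best_replica(replicas: list[dict]) -> dict:
    """Pick the winning replica: highest replication offset first
    (least data loss), smallest node ID as tie-breaker."""
    return min(replicas, key=lambda r: (-r["repl_offset"], r["id"]))

candidates = [
    {"id": "bbb", "repl_offset": 100},
    {"id": "aaa", "repl_offset": 100},  # same offset, smaller ID → wins
    {"id": "ccc", "repl_offset": 90},
]
print(best_replica(candidates)["id"])  # → aaa
```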
Manual Failover
# Graceful failover (no data loss)
CLUSTER FAILOVER
# Force failover (may lose data)
CLUSTER FAILOVER FORCE
# Takeover (no voting, for automation)
CLUSTER FAILOVER TAKEOVER
Use CLUSTER FAILOVER for planned maintenance. The master pauses writes and waits for the replica to catch up before the switch, so no writes are lost.
Cluster Configuration
cluster-enabled
# Enable cluster mode
cluster-enabled yes
# Cluster configuration file (auto-generated)
cluster-config-file nodes-6379.conf
# Node timeout (milliseconds)
cluster-node-timeout 15000
cluster-require-full-coverage
# Reject queries if any slots uncovered
cluster-require-full-coverage yes
From cluster.c:1296-1299:
- When yes: the whole cluster enters the FAIL state if any slot is unassigned or unreachable
- When no: the cluster keeps serving requests for the slots that are still covered
Setting cluster-require-full-coverage no allows partial availability but risks serving stale data if split-brain occurs.
cluster-replica-validity-factor
# Replica won't failover if too far behind
cluster-replica-validity-factor 10
Replica won’t participate in election if:
replication_lag > (node_timeout * replica_validity_factor) + repl_ping_time
Monitoring and Administration
Cluster Info
CLUSTER INFO
Output:
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_sent:12345
cluster_stats_messages_received:12345
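Since the output is simple key:value lines, a monitoring script can parse it with a few lines of Python. `parse_cluster_info` is an illustrative helper, not part of any client library:

```python
def parse_cluster_info(raw: str) -> dict:
    """Parse CLUSTER INFO's colon-separated key:value lines,
    converting purely numeric values to int."""
    info = {}
    for line in raw.strip().splitlines():
        key, _, value = line.partition(":")
        value = value.strip()
        info[key] = int(value) if value.isdigit() else value
    return info

sample = "cluster_state:ok\ncluster_slots_assigned:16384\ncluster_size:3"
info = parse_cluster_info(sample)
print(info["cluster_state"], info["cluster_size"])  # → ok 3
```

A health check would typically alert when `cluster_state` is not "ok" or `cluster_slots_assigned` drops below 16384.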
Cluster Nodes
From cluster.c:1014-1019:
Output format:
<id> <ip:port@bus-port> <flags> <master> <ping> <pong> <epoch> <link> <slot>...
Example:
07c37dfeb235213a872192d90877d0cd55635b91 127.0.0.1:7000@17000 myself,master - 0 0 1 connected 0-5460
67ed2db8d677e59ec4a4cefb06858cf2a1a89fa1 127.0.0.1:7001@17001 master - 0 1426238316232 2 connected 5461-10922
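A line in this format can be split into its fields programmatically. The sketch below (`parse_node_line` is an illustrative name) extracts the node ID, address, flags, and owned slot ranges:

```python
def parse_node_line(line: str) -> dict:
    """Parse one CLUSTER NODES line:
    <id> <ip:port@bus-port> <flags> <master> <ping> <pong> <epoch> <link> <slot>..."""
    fields = line.split()
    node_id, addr, flags = fields[0], fields[1], fields[2].split(",")
    slots = []
    for token in fields[8:]:          # slot tokens follow the link state
        if token.startswith("["):
            continue                  # skip importing/migrating markers
        lo, _, hi = token.partition("-")
        slots.append((int(lo), int(hi or lo)))
    return {"id": node_id, "addr": addr, "flags": flags, "slots": slots}

line = ("07c37dfeb235213a872192d90877d0cd55635b91 127.0.0.1:7000@17000 "
        "myself,master - 0 0 1 connected 0-5460")
node = parse_node_line(line)
print(node["slots"])  # → [(0, 5460)]
```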
Slot Statistics
From cluster.c:997:
The CLUSTER SLOT-STATS command returns per-slot statistics:
- CPU usage per slot
- Network bytes per slot
- Memory usage per slot
Best Practices
Minimum Node Count
Recommended minimum: 6 nodes (3 masters + 3 replicas) for production. This ensures:
- High availability (can tolerate 1 master failure per shard)
- Proper quorum for split-brain protection
- Reasonable slot distribution
Key Design
Do:
- Use hash tags for related keys: {user:123}:*
- Keep key names reasonably short
- Plan for even slot distribution
Don’t:
- Use random prefixes that prevent hash tag benefits
- Create hotspot keys (overwhelm single node)
- Use very long key names (increase memory usage)
Scaling Strategy
Scale out (add nodes):
# Add node to cluster
redis-cli --cluster add-node 127.0.0.1:7006 127.0.0.1:7000
# Rebalance slots
redis-cli --cluster rebalance 127.0.0.1:7000
Scale in (remove nodes):
# Reshard away from node
redis-cli --cluster reshard 127.0.0.1:7000 \
--cluster-from <node-id> \
--cluster-to <other-node-id> \
--cluster-slots <all-slots>
# Remove node
redis-cli --cluster del-node 127.0.0.1:7000 <node-id>
Backup Strategy
Cluster-aware backups:
# Backup each master
for node in master1 master2 master3; do
  last=$(redis-cli -h $node LASTSAVE)
  redis-cli -h $node BGSAVE
  # Wait until LASTSAVE advances (background save finished)
  while [ "$(redis-cli -h $node LASTSAVE)" = "$last" ]; do sleep 1; done
  # Copy the RDB file from the node's data directory
done
Backup replicas instead of masters to avoid impacting production traffic. Replicas have identical data.
Troubleshooting
Cluster won't form? Check:
- Cluster bus port accessible (client port + 10000)
- All nodes have cluster-enabled yes
- No conflicting cluster-config-file (each node needs its own file)
- Nodes can resolve each other's IPs
CLUSTERDOWN Error
Causes:
- Some slots unassigned
- cluster-require-full-coverage yes and a node is down
- Split-brain (network partition)
Fix:
# Check slot coverage
redis-cli --cluster check 127.0.0.1:7000
# Fix slot coverage
redis-cli --cluster fix 127.0.0.1:7000
Hot spots:
- Monitor CLUSTER SLOT-STATS for uneven load
- Redesign key schema to distribute better
- Consider splitting hot keys across slots
Network latency:
- Tune cluster-node-timeout carefully: lower values detect failures faster but risk false positives on slow links
- Ensure cluster bus has adequate bandwidth
- Monitor gossip message rate