
Overview

Caching exists at every layer of the stack: browser, CDN, reverse proxy, application (Redis), and database buffer pool. Each layer has different hit rate, invalidation complexity, and latency characteristics.
A cache hit at any layer prevents all lower layers from serving the request. Design caching strategies from the user outward: browser → CDN → reverse proxy → application → database.

Cache Hierarchy

// Request cache waterfall
1. Browser cache:     hit if within max-age
2. CDN PoP:          hit if within s-maxage
3. Reverse proxy:    Varnish full-page cache
4. Redis:            application key-value lookup
5. DB read replica:  indexed query
6. DB primary:       full query + write path

// Each hit prevents all lower layers from serving

Latency by Layer

Layer                 Typical Latency   Capacity       Invalidation
Browser               0ms (instant)     ~50-100MB      Cache-Control headers
CDN                   10-50ms           TBs per PoP    API call or TTL
Reverse Proxy         1-5ms             GBs            Manual purge or TTL
Application (Redis)   1-3ms             100s of GBs    Explicit DEL command
Database Buffer       0.1ms             RAM-limited    Automatic (LRU)
Measure cache hit rate per layer in production. A namespace with a hit rate below 50% is either poorly keyed, using TTLs that are too short, or caching data that changes too frequently to benefit.
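Per-layer measurement can start in the application itself. Below is a minimal sketch of a per-namespace hit/miss counter; `CacheStats` and its method names are illustrative, and in production the counters would live in a metrics system (e.g. Prometheus) rather than in memory:

```javascript
// Track hit rate per cache namespace (e.g. "user", "order").
// In-memory counters for illustration only.
class CacheStats {
  constructor() {
    this.counts = new Map();  // namespace -> { hits, misses }
  }

  record(namespace, hit) {
    const c = this.counts.get(namespace) ?? { hits: 0, misses: 0 };
    if (hit) c.hits++; else c.misses++;
    this.counts.set(namespace, c);
  }

  hitRate(namespace) {
    const c = this.counts.get(namespace);
    if (!c || c.hits + c.misses === 0) return 0;
    return c.hits / (c.hits + c.misses);
  }

  // Namespaces below the threshold are candidates for re-keying,
  // longer TTLs, or removal from the cache entirely
  underperforming(threshold = 0.5) {
    return [...this.counts.keys()].filter(ns => this.hitRate(ns) < threshold);
  }
}
```

Wrap your cache reads so every lookup calls `stats.record(namespace, value !== null)`, then alert on `underperforming()`.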

Caching Strategies

Cache-Aside pattern: the application controls reads and loads from the DB on a miss
// Cache-Aside read path
async function getUser(id) {
  // 1. Try cache
  const hit = await redis.get(`user:${id}`);
  if (hit) return JSON.parse(hit);
  
  // 2. Cache miss: load from DB
  const user = await db.get(id);
  if (!user) return null;  // avoid caching "undefined" for missing rows
  
  // 3. Populate cache with TTL + jitter to avoid synchronized expiry
  const ttl = 300 + Math.floor(Math.random() * 30);
  await redis.setex(`user:${id}`, ttl, JSON.stringify(user));
  
  return user;
}

// Write path: invalidate cache
async function updateUser(id, data) {
  await db.update(id, data);
  await redis.del(`user:${id}`);  // invalidate
}
Pros:
  • Simple and most common
  • Cache failure doesn’t break the system
  • Only requested data is cached
Cons:
  • Cache miss adds latency (DB round trip)
  • Thundering herd on cold cache
  • Stale data until invalidation
Cache-Aside is the default choice for most applications. Only use Write-Through or Write-Behind when you have specific requirements and understand the trade-offs.
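For contrast, a Write-Through sketch: the cache sits in the write path and updates the database synchronously, so reads never see stale data, at the cost of write latency and caching data that may never be read. In-memory Maps stand in for Redis and the DB here; the class and its names are illustrative:

```javascript
// Write-Through: cache and DB are updated together on every write,
// so reads are always fresh and no explicit invalidation is needed.
class WriteThroughCache {
  constructor(db) {
    this.db = db;           // stand-in for the real database (a Map here)
    this.cache = new Map(); // stand-in for Redis
  }

  async set(key, value) {
    await this.db.set(key, value);  // 1. write DB first (source of truth)
    this.cache.set(key, value);     // 2. then update cache, never invalidate
  }

  async get(key) {
    if (this.cache.has(key)) return this.cache.get(key);  // always fresh
    const value = await this.db.get(key);                 // cold-start miss
    if (value !== undefined) this.cache.set(key, value);
    return value;
  }
}
```

Compared with Cache-Aside, the trade is write amplification for read freshness: every write pays the cache update, even for keys nobody reads.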

Cache Invalidation Patterns

“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton

TTL-Based Expiration

Simplest strategy: set expiration time and accept bounded staleness:
// TTL with jitter to prevent thundering herd
const baseTTL = 300;  // 5 minutes
const jitter = Math.floor(Math.random() * 30);
await redis.setex(key, baseTTL + jitter, value);
TTL jitter prevents synchronized mass expiry. Without jitter, all cached items set at the same time expire simultaneously, causing a thundering herd to the database.

Explicit Invalidation

Delete cache entries when data changes:
// Single key invalidation
await redis.del(`user:${userId}`);

// Pattern-based invalidation
await redis.eval(`
  local keys = redis.call('keys', ARGV[1])
  for i=1,#keys,5000 do
    redis.call('del', unpack(keys, i, math.min(i+4999, #keys)))
  end
  return #keys
`, 0, 'user:*');

// Tagged invalidation (using Redis Sets)
await redis.sadd(`tag:orders:user:${userId}`, `order:${orderId}`);

// Invalidate all orders for a user
const keys = await redis.smembers(`tag:orders:user:${userId}`);
if (keys.length) await redis.del(...keys);  // DEL with no args is an error
await redis.del(`tag:orders:user:${userId}`);
KEYS pattern matching in Redis blocks the server. Use SCAN for production, or maintain explicit tag sets for bulk invalidation.

Cache Stampede Prevention

When cache expires, multiple requests may simultaneously query the database:
// Distributed lock prevents stampede
async function getWithStampedeProtection(key) {
  // 1. Try cache
  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached);
  
  // 2. Acquire lock (only one request proceeds)
  const lockKey = `lock:${key}`;
  const lockAcquired = await redis.set(
    lockKey, 
    'locked', 
    'EX', 10,  // 10 sec expiry
    'NX'       // only if not exists
  );
  
  if (lockAcquired) {
    try {
      // 3. Load from DB (only this request)
      const value = await db.get(key);
      await redis.setex(key, 300, JSON.stringify(value));
      return value;
    } finally {
      await redis.del(lockKey);  // production: store a unique token and release via check-and-delete to avoid removing another request's lock
    }
  } else {
    // 4. Lock held by another request: wait and retry
    await sleep(100);
    return getWithStampedeProtection(key);
  }
}

Probabilistic Early Expiration

Refresh before expiry with a probability that rises steeply as expiry approaches (the XFetch algorithm):
// XFetch: delta = how long the value took to recompute (ms),
// expiry = absolute expiration timestamp (ms)
function shouldRefresh(expiry, delta, beta = 1.0) {
  // Math.log(Math.random()) is negative, so the subtraction pushes
  // "now" forward; refreshes get ever more likely near expiry
  return Date.now() - delta * beta * Math.log(Math.random()) >= expiry;
}

async function getUser(id) {
  const cached = await redis.get(`user:${id}`);
  
  if (cached) {
    const { value, delta, expiry } = JSON.parse(cached);
    
    // Probabilistically refresh before expiry
    if (shouldRefresh(expiry, delta)) {
      // Async refresh in background; reader still gets the cached value
      refreshUser(id).catch(err => logger.error(err));
    }
    
    return value;
  }
  
  // Cache miss: blocking load
  return loadUser(id);
}
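A quick simulation of the XFetch decision rule (pure arithmetic, no cache involved) shows why it avoids stampedes: since -log(random) is exponentially distributed, refreshes stay rare far from expiry and become near-certain close to it. The closed form for the refresh probability is exp(-remaining / (delta * beta)):

```javascript
// Estimate the probability that XFetch triggers an early refresh,
// given the time remaining until expiry, the recompute cost delta,
// and the beta tuning knob (higher beta = earlier refreshes).
function refreshProbability(remainingMs, deltaMs, beta = 1.0, trials = 20000) {
  let refreshes = 0;
  for (let i = 0; i < trials; i++) {
    // same inequality as the decision rule, rearranged around "remaining"
    if (-deltaMs * beta * Math.log(Math.random()) >= remainingMs) refreshes++;
  }
  return refreshes / trials;
}

// With a 50ms recompute cost: almost never refresh 1s before expiry
// (exp(-20) ~ 0), frequently refresh 10ms before (exp(-0.2) ~ 0.82)
```

Only a small fraction of requests refresh at any instant, so the database sees a trickle of recomputations instead of a herd.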

HTTP Caching Headers

Cache-Control Directives

GET /api/user/123

HTTP/1.1 200 OK
Cache-Control: public, s-maxage=86400, max-age=3600, stale-while-revalidate=60
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"
Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT
// Layered caching: CDN vs browser
Cache-Control: public, s-maxage=86400, max-age=3600
// CDN caches 24h; browser caches 1h

// Private user data: no CDN caching
Cache-Control: private, max-age=300

// Never cache (authentication, checkout)
Cache-Control: private, no-store, no-cache, must-revalidate

// Immutable assets (versioned URLs)
Cache-Control: public, max-age=31536000, immutable
Directives:
  • public: cacheable by CDN and browser
  • private: cacheable only by browser
  • no-store: do not cache at all
  • no-cache: must revalidate (check ETag)
  • max-age: browser TTL (seconds)
  • s-maxage: CDN TTL (seconds)
  • immutable: never revalidate (versioned assets)
  • stale-while-revalidate: serve stale while fetching fresh
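These directive combinations tend to fall into a handful of profiles, which some codebases centralize in a small helper so endpoints can't mis-combine them. A sketch; `cacheControlFor` and its profile names are invented for illustration, and the header strings mirror the examples above:

```javascript
// Map response profiles to Cache-Control header values.
// The profile names are illustrative, not a standard API.
function cacheControlFor(profile) {
  switch (profile) {
    case 'shared-api':       // CDN caches 24h; browser caches 1h
      return 'public, s-maxage=86400, max-age=3600';
    case 'private-api':      // user-specific data: browser only, 5 min
      return 'private, max-age=300';
    case 'sensitive':        // authentication, checkout: never cache
      return 'private, no-store, no-cache, must-revalidate';
    case 'versioned-asset':  // content-hashed filename: cache for a year
      return 'public, max-age=31536000, immutable';
    default:
      return 'no-store';     // fail safe: unknown responses stay uncacheable
  }
}
```

Defaulting unknown profiles to no-store is the safe failure mode: an uncached response is slow, but a wrongly cached private response is a data leak.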

Versioned Assets for Long TTLs

<!-- BAD: must revalidate every time -->
<script src="/js/app.js"></script>
<link rel="stylesheet" href="/css/main.css">

<!-- GOOD: 1-year TTL + instant "invalidation" via new filename -->
<script src="/js/app.v2.5.3.min.js"></script>
<link rel="stylesheet" href="/css/main.a3f8d9c.min.css">
Cache-Control: public, max-age=31536000, immutable
// 1 year TTL: file never changes
// "Invalidate" by deploying new filename
Use versioned filenames (content hash or semver) with 1-year TTLs for static assets. This achieves instant cache “invalidation” without waiting for CDN TTL expiry.

Redis Patterns

Data Structure Selection

# Simple key-value
SET user:1001 '{"name":"Alice","email":"[email protected]"}'
GET user:1001

# Atomic increment
INCR pageviews:post:42
INCRBY rate_limit:user:1001 1

# Set with expiry
SETEX session:abc123 3600 '{"userId":1001}'
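The INCR counter above is the core of a fixed-window rate limiter: the first request in a window creates the counter and sets its TTL, and requests beyond the limit are rejected until the window expires. A sketch against minimal in-memory stand-ins for INCR/EXPIRE (in production these would be real Redis calls, ideally combined atomically in a Lua script or pipeline):

```javascript
// Minimal in-memory stand-ins for Redis INCR / EXPIRE.
const store = new Map();  // key -> { count, expiresAt }

function incr(key, now = Date.now()) {
  const e = store.get(key);
  if (!e || now >= e.expiresAt) {
    store.set(key, { count: 1, expiresAt: Infinity });
    return 1;
  }
  return ++e.count;
}

function expire(key, seconds, now = Date.now()) {
  const e = store.get(key);
  if (e) e.expiresAt = now + seconds * 1000;
}

// Fixed-window limiter: the window index is baked into the key, so a
// new window naturally starts with a fresh counter.
function allowRequest(userId, limit = 100, windowSec = 60, now = Date.now()) {
  const window = Math.floor(now / (windowSec * 1000));
  const key = `rate_limit:user:${userId}:${window}`;
  const count = incr(key, now);
  if (count === 1) expire(key, windowSec, now);  // first hit sets the TTL
  return count <= limit;
}
```

Fixed windows allow up to 2x the limit across a window boundary; sliding-window or token-bucket variants smooth that out at the cost of more state.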

Redis Eviction Policies

# redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru
Policy           Behavior
noeviction       Return error when memory full (default)
allkeys-lru      Evict least recently used keys
allkeys-lfu      Evict least frequently used keys
volatile-lru     Evict LRU among keys with TTL
volatile-ttl     Evict keys with shortest TTL
allkeys-random   Evict random keys
Use allkeys-lru for cache workloads where all keys are candidates for eviction. Use volatile-lru when you have mixed data (persistent + cache) and only cache keys have TTLs.
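The allkeys-lru behavior is easy to illustrate with a tiny LRU cache. JavaScript Maps iterate in insertion order, so re-inserting an entry on access keeps the least recently used entry at the front; this is a teaching sketch, not how Redis implements it (Redis uses sampled, approximate LRU):

```javascript
// Minimal exact-LRU cache built on Map's insertion-order iteration.
class LRUCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key);       // move to back = mark as recently used
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // front of the Map is the least recently used entry
      const lru = this.map.keys().next().value;
      this.map.delete(lru);
    }
  }
}
```

allkeys-lfu differs by tracking access frequency instead of recency, which protects steadily popular keys from being evicted by a one-off scan of cold keys.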

Monitoring Cache Health

Key Metrics

// Application-level metrics (redis.get is async and returns strings)
const hits = Number(await redis.get('stats:cache:hits'));
const misses = Number(await redis.get('stats:cache:misses'));
const hitRate = hits / (hits + misses);

if (hitRate < 0.5) {
  alert('Cache hit rate below 50% — investigate key TTLs or access patterns');
}
# Redis INFO stats
redis-cli INFO stats

# Key metrics:
keyspace_hits:842765
keyspace_misses:103450
evicted_keys:1532
expired_keys:45123

# Hit rate calculation
Hit Rate = keyspace_hits / (keyspace_hits + keyspace_misses)
         = 842765 / (842765 + 103450)
         ≈ 89.1%  (healthy)
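This calculation can be automated by parsing the key:value lines of the INFO output. A sketch; `hitRateFromInfo` is an illustrative helper operating on the raw text that `redis-cli INFO stats` prints:

```javascript
// Parse "key:value" lines from Redis INFO output and compute hit rate.
function hitRateFromInfo(infoText) {
  const stats = {};
  for (const line of infoText.split('\n')) {
    const [key, value] = line.split(':');
    if (value !== undefined) stats[key.trim()] = Number(value);
  }
  const { keyspace_hits: hits = 0, keyspace_misses: misses = 0 } = stats;
  return hits + misses === 0 ? 0 : hits / (hits + misses);
}
```

Note that keyspace_hits/keyspace_misses are cumulative since server start, so for alerting you want the rate over a window: sample periodically and compute the delta between samples.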

Redis Slow Log

# Configure slow log threshold
CONFIG SET slowlog-log-slower-than 10000  # 10ms
CONFIG SET slowlog-max-len 128

# View slow queries
SLOWLOG GET 10
KEYS * pattern matching blocks Redis. Use SCAN for production environments:
# BAD: blocks server
KEYS user:*

# GOOD: iterative scan
SCAN 0 MATCH user:* COUNT 100
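In application code the SCAN cursor loop looks like the sketch below, written against an injected client exposing scan and del (the reply shape [nextCursor, keys] matches common Node Redis clients such as ioredis, but treat the exact signature as an assumption for your client library):

```javascript
// Delete keys matching a pattern without blocking the server:
// iterate SCAN cursors and DEL each returned batch.
async function deleteByPattern(client, pattern) {
  let cursor = '0';
  let deleted = 0;
  do {
    const [next, keys] = await client.scan(
      cursor, 'MATCH', pattern, 'COUNT', 100
    );
    if (keys.length > 0) deleted += await client.del(...keys);
    cursor = next;
  } while (cursor !== '0');  // cursor '0' signals the iteration is complete
  return deleted;
}
```

Each SCAN call touches only a bounded slice of the keyspace, so other commands interleave freely; the trade-off is that keys created or deleted mid-iteration may or may not be seen.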

Best Practices

Set TTLs deliberately:
  • Never cache without expiration (except explicitly persistent data)
  • Add jitter to TTLs: ttl = base + random(0, jitter)
  • Use shorter TTLs for frequently changing data, longer TTLs for stable data
Monitor hit rates:
  • Alert when hit rate drops below 50-60%
  • A low hit rate indicates poor key design or inappropriate caching
  • Use Redis INFO or application-level instrumentation
Don't cache non-reusable data:
  • Timestamps, request IDs, and nonces have a 0% hit rate
  • Caching them adds latency without benefit
  • Cache only data with a read:write ratio above roughly 3:1
Don't cache large blobs:
  • Network transfer latency negates the cache benefit
  • Large values fragment memory and cause evictions
  • Store a reference (e.g. an S3 key) instead of the full blob

Next Steps

Databases

Optimize database queries to reduce cache dependency

Load Balancing

Use consistent hashing for cache tier load balancing

Scalability

Learn how caching enables horizontal scaling

Availability

Design for cache failure scenarios and graceful degradation
