## What is Backoff?
Backoff is the practice of waiting between retry attempts. Instead of immediately retrying a failed operation, backoff introduces a delay that gives the failing system time to recover.
Without backoff, rapid retries can:
- **Overwhelm struggling services**: making recovery harder or impossible
- **Waste resources**: burning CPU and network bandwidth on futile attempts
- **Trigger rate limits**: aggressive retries can look like abuse
- **Amplify outages**: thundering herd problems arise when many clients retry simultaneously
Never retry without backoff in production systems. It can turn a small issue into a catastrophic outage.
## Backoff Strategies in Resilience
Resilience supports two backoff strategies, defined in `src/global.d.ts:15-17`:

```typescript
type BackoffStrategy =
  | { type: "fixed"; delayMs: number }
  | { type: "exponential"; baseDelayMs: number; maxDelayMs: number; jitter?: boolean };
```
### Fixed Backoff
Fixed backoff waits the same amount of time between each retry attempt.
**When to use:**

- Simple, predictable retry patterns
- When you know the recovery time of the downstream service
- Low retry counts (1-3 retries)
- Testing and debugging (easier to reason about)
**Configuration:**

```typescript
import { withResilience } from '@oldwhisper/resilience';

const resilient = withResilience(task, {
  retries: 3,
  backoff: {
    type: 'fixed',
    delayMs: 1000  // Wait 1 second between each retry
  }
});

// Retry timeline:
// Attempt 1: immediate
// Attempt 2: after 1s
// Attempt 3: after 1s
// Attempt 4: after 1s
```
Implementation (from `src/index.ts:72-74`):

```typescript
function computeBackoffMs(strategy: Resilience.BackoffStrategy | undefined, attempt: number): number {
  if (!strategy) return 0;
  if (strategy.type === "fixed") return strategy.delayMs;
  // ...
}
```
### Exponential Backoff
Exponential backoff doubles the wait time with each retry, up to a maximum delay.
**When to use:**

- Production systems with high retry counts
- Unknown or variable recovery times
- Preventing thundering herd problems
- Services that may need progressively longer recovery time
**Configuration:**

```typescript
const resilient = withResilience(task, {
  retries: 5,
  backoff: {
    type: 'exponential',
    baseDelayMs: 100,   // Start with 100ms
    maxDelayMs: 10000,  // Cap at 10 seconds
    jitter: true        // Add randomization
  }
});

// Retry timeline (without jitter):
// Attempt 1: immediate
// Attempt 2: after 100ms (100 * 2^0)
// Attempt 3: after 200ms (100 * 2^1)
// Attempt 4: after 400ms (100 * 2^2)
// Attempt 5: after 800ms (100 * 2^3)
// Attempt 6: after 1600ms (100 * 2^4)
```
Implementation (from `src/index.ts:76-80`):

```typescript
const raw = strategy.baseDelayMs * Math.pow(2, Math.max(0, attempt - 1));
const capped = Math.min(raw, strategy.maxDelayMs);
if (!strategy.jitter) return capped;
return Math.floor(Math.random() * capped);
```
**Key details:**

- Formula: `baseDelayMs * 2^(attempt - 1)`
- Always capped at `maxDelayMs` to prevent excessive waits
- Attempt counting starts at 1 (the first retry is attempt 1)
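Combining the two snippets above, the full helper can be reconstructed and its arithmetic checked standalone. The local `BackoffStrategy` type stands in for `Resilience.BackoffStrategy`, and the example calls at the end are illustrative:

```typescript
type BackoffStrategy =
  | { type: "fixed"; delayMs: number }
  | { type: "exponential"; baseDelayMs: number; maxDelayMs: number; jitter?: boolean };

function computeBackoffMs(strategy: BackoffStrategy | undefined, attempt: number): number {
  if (!strategy) return 0;                                  // no backoff configured
  if (strategy.type === "fixed") return strategy.delayMs;   // constant delay
  const raw = strategy.baseDelayMs * Math.pow(2, Math.max(0, attempt - 1));
  const capped = Math.min(raw, strategy.maxDelayMs);        // never exceed maxDelayMs
  if (!strategy.jitter) return capped;
  return Math.floor(Math.random() * capped);                // uniform in [0, capped)
}

// Fixed: every retry waits the same 1000ms, regardless of attempt number.
console.log(computeBackoffMs({ type: "fixed", delayMs: 1000 }, 3));  // 1000
// Exponential, no jitter: 100 * 2^(4 - 1) = 800ms.
console.log(computeBackoffMs({ type: "exponential", baseDelayMs: 100, maxDelayMs: 10000 }, 4));  // 800
// Attempt 10 would be 51200ms raw, but the cap holds it at 10000ms.
console.log(computeBackoffMs({ type: "exponential", baseDelayMs: 100, maxDelayMs: 10000 }, 10)); // 10000
```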
## Jitter: Breaking Synchronization
Jitter adds randomness to backoff delays. This is crucial for preventing thundering herd problems where many clients retry simultaneously.
### The Thundering Herd Problem
Imagine 1000 clients all experience a failure at the same time:
**Without jitter:**

```
Time 0s: 1000 requests → all fail
Time 1s: 1000 retries → all fail (server still overloaded)
Time 3s: 1000 retries → all fail (synchronized retry storm)
Time 7s: 1000 retries → all fail (still synchronized)
```

**With jitter:**

```
Time 0s:   1000 requests → all fail
Time 0-1s: ~1000 retries spread over 1 second
Time 0-3s: ~1000 retries spread over 3 seconds
Time 0-7s: ~1000 retries spread over 7 seconds
```
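The spreading effect is easy to see in a standalone simulation (not library code; the 1000-client scenario and the 1000ms cap are taken from the timeline above):

```typescript
// Simulate 1000 clients computing their first-retry delay against a 1000ms cap.
const capped = 1000;

// Without jitter: every client waits exactly the capped delay.
const withoutJitter = Array.from({ length: 1000 }, () => capped);

// With jitter: each client waits a uniformly random delay in [0, capped).
const withJitter = Array.from({ length: 1000 }, () => Math.floor(Math.random() * capped));

// Count the distinct retry instants (1ms buckets) each strategy produces.
console.log(new Set(withoutJitter).size);  // 1 — all retries land at the same instant
console.log(new Set(withJitter).size);     // hundreds of distinct instants
```

One synchronized spike becomes hundreds of small, staggered arrivals, which is exactly what an overloaded server needs.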
### Jitter Implementation

From `src/index.ts:78-80`:

```typescript
if (!strategy.jitter) return capped;
return Math.floor(Math.random() * capped);
```
When jitter is enabled, the delay is drawn uniformly at random from 0 (inclusive) up to the capped delay (exclusive), so clients that failed at the same moment retry at different times.
**Example with jitter:**

```typescript
const resilient = withResilience(task, {
  retries: 4,
  backoff: {
    type: 'exponential',
    baseDelayMs: 1000,
    maxDelayMs: 16000,
    jitter: true
  }
});

// Retry delays (example - actual values are random):
// Attempt 2: 534ms  (random between 0-1000ms)
// Attempt 3: 1847ms (random between 0-2000ms)
// Attempt 4: 2103ms (random between 0-4000ms)
// Attempt 5: 6891ms (random between 0-8000ms)
```
Always enable jitter in production for exponential backoff. It significantly reduces load spikes during outages.
## Backoff in the Retry Loop
Backoff is applied after a retry is decided but before the next attempt (from `src/index.ts:157-162`):

```typescript
const shouldRetry = attempt <= retries && retryOn(err);
if (!shouldRetry) throw err;

const waitMs = computeBackoffMs(config.backoff, attempt);
hooks?.onRetry?.({ name, attempt, delayMs: waitMs, error: err });
if (waitMs > 0) await delay(waitMs);
```
The delay helper is a simple promise-based sleep (from `src/index.ts:86-88`):

```typescript
function delay(ms: number) {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}
```
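Putting the pieces together, the loop can be sketched as a standalone function. This is a simplified reconstruction, not the library source: `retryWithBackoff` and the `flaky` demo task are hypothetical names, and hooks and `retryOn` filtering are omitted:

```typescript
type BackoffStrategy =
  | { type: "fixed"; delayMs: number }
  | { type: "exponential"; baseDelayMs: number; maxDelayMs: number; jitter?: boolean };

function computeBackoffMs(strategy: BackoffStrategy | undefined, attempt: number): number {
  if (!strategy) return 0;
  if (strategy.type === "fixed") return strategy.delayMs;
  const capped = Math.min(strategy.baseDelayMs * Math.pow(2, Math.max(0, attempt - 1)), strategy.maxDelayMs);
  return strategy.jitter ? Math.floor(Math.random() * capped) : capped;
}

function delay(ms: number) {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Simplified retry loop: run the task; on failure, wait computeBackoffMs before the next attempt.
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  opts: { retries: number; backoff?: BackoffStrategy }
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt > opts.retries) throw err;  // retries exhausted
      const waitMs = computeBackoffMs(opts.backoff, attempt);
      if (waitMs > 0) await delay(waitMs);
    }
  }
}

// Demo: a task that fails twice, then succeeds on the third attempt.
let calls = 0;
const flaky = async () => {
  calls++;
  if (calls < 3) throw new Error("transient failure");
  return "ok";
};

retryWithBackoff(flaky, { retries: 3, backoff: { type: "fixed", delayMs: 10 } })
  .then((result) => console.log(`${result} after ${calls} attempts`));  // "ok after 3 attempts"
```

Note the attempt accounting matches the timelines above: with `retries: 3`, up to 4 total attempts run, with a backoff delay before attempts 2 through 4.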
## Complete Examples

### Fixed Backoff
```typescript
import { withResilience } from '@oldwhisper/resilience';

const fetchData = async () => {
  const response = await fetch('https://api.example.com/data');
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return response.json();
};

const resilient = withResilience(fetchData, {
  name: 'fetchData',
  retries: 3,
  backoff: {
    type: 'fixed',
    delayMs: 2000  // Wait 2 seconds between retries
  },
  hooks: {
    onRetry: ({ attempt, delayMs }) => {
      console.log(`Retrying attempt ${attempt + 1} after ${delayMs}ms delay`);
    }
  }
});

// Total possible time: up to 6 seconds of backoff (3 retries × 2s each)
// Plus the time for each attempt itself
await resilient();
```
## Choosing the Right Strategy
### Use Fixed Backoff When:

- You have 1-3 retries only
- The service has predictable recovery time
- You're testing or debugging
- Simplicity is more important than optimization
### Use Exponential Backoff When:

- You have 4+ retries
- Recovery time is unknown or variable
- You're building production systems
- You need to handle thundering herd scenarios
- You want to back off progressively from a struggling service
For most production use cases, exponential backoff with jitter is the recommended approach.
## Monitoring Backoff Behavior
Use hooks to track actual backoff delays:
```typescript
const backoffMetrics = {
  totalDelayMs: 0,
  retryCount: 0
};

const resilient = withResilience(task, {
  retries: 5,
  backoff: {
    type: 'exponential',
    baseDelayMs: 100,
    maxDelayMs: 10000,
    jitter: true
  },
  hooks: {
    onRetry: ({ delayMs }) => {
      backoffMetrics.totalDelayMs += delayMs;
      backoffMetrics.retryCount++;
      console.log(`Cumulative backoff: ${backoffMetrics.totalDelayMs}ms over ${backoffMetrics.retryCount} retries`);
    }
  }
});
```
## Best Practices
- **Always use backoff with retries**: never retry without at least a small delay
- **Enable jitter in production**: it prevents synchronized retry storms
- **Set reasonable maximums**: `maxDelayMs` prevents excessive wait times
- **Start small**: begin with a short `baseDelayMs` (100-500ms)
- **Consider total time**: account for (retries × average delay) + (retries × timeout) in your SLAs
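The total-time budget in the last point can be computed rather than guessed. A small helper (hypothetical, not part of the library) sums the worst-case backoff over all retries; with jitter enabled the actual total can only be shorter, since jittered delays never exceed the cap:

```typescript
type BackoffStrategy =
  | { type: "fixed"; delayMs: number }
  | { type: "exponential"; baseDelayMs: number; maxDelayMs: number; jitter?: boolean };

// Worst-case total backoff across `retries` retries.
function maxTotalBackoffMs(strategy: BackoffStrategy, retries: number): number {
  let total = 0;
  for (let attempt = 1; attempt <= retries; attempt++) {
    total += strategy.type === "fixed"
      ? strategy.delayMs
      : Math.min(strategy.baseDelayMs * 2 ** (attempt - 1), strategy.maxDelayMs);
  }
  return total;
}

// 3 retries of fixed 2000ms backoff -> up to 6 seconds, as in the complete example above.
console.log(maxTotalBackoffMs({ type: "fixed", delayMs: 2000 }, 3));  // 6000
// 5 exponential retries: 100 + 200 + 400 + 800 + 1600 = 3100ms worst case.
console.log(maxTotalBackoffMs({ type: "exponential", baseDelayMs: 100, maxDelayMs: 10000 }, 5));  // 3100
```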
## Related

- **Retries**: the retry mechanism that backoff enhances
- **Timeouts**: each retry attempt can have its own timeout
- **Circuit Breakers**: stop retrying when failures become systemic