Circuit Breaker Pattern
Stopping cascading failures by short-circuiting calls to a failing downstream service.
Overview
The Circuit Breaker pattern prevents cascading failures by stopping calls to a failing service after a threshold of failures is reached. It operates in three states: Closed (normal, calls pass through), Open (failure threshold exceeded, calls immediately return an error without attempting the network call), and Half-Open (a trial call is attempted to see if the service has recovered).
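The three states and the moves between them can be sketched as a small transition table. This is an illustrative sketch only: the event names (`threshold-reached`, `timeout-elapsed`, `trial-succeeded`, `trial-failed`) are hypothetical labels chosen for this example, not part of any library.

```javascript
// Hypothetical event names; each state lists only the events that change it.
const TRANSITIONS = {
  closed: { 'threshold-reached': 'open' },
  open: { 'timeout-elapsed': 'half-open' },
  'half-open': { 'trial-succeeded': 'closed', 'trial-failed': 'open' },
}

function nextState(state, event) {
  // Events with no entry for the current state leave it unchanged.
  return TRANSITIONS[state]?.[event] ?? state
}

console.log(nextState('closed', 'threshold-reached')) // open
console.log(nextState('open', 'timeout-elapsed'))     // half-open
console.log(nextState('half-open', 'trial-failed'))   // open
```

Note that Half-Open has two exits: a successful trial closes the circuit, while a failed trial reopens it and restarts the wait.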
Origin
Michael Nygard described the pattern in "Release It!" (2007). Netflix's Hystrix library, open-sourced in 2012, popularised it in microservices. Netflix put Hystrix into maintenance mode in 2018 and recommended Resilience4j as its successor. Envoy and service meshes implement circuit breaking at the infrastructure level.
Examples
Circuit breaker implementation
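class CircuitBreaker {
  constructor(fn, { threshold = 5, timeout = 60_000, halfOpenCalls = 1 } = {}) {
    this.fn = fn
    this.threshold = threshold        // consecutive failures before the circuit opens
    this.timeout = timeout            // ms to stay open before allowing a trial call
    this.halfOpenMax = halfOpenCalls  // trial calls admitted while half-open
    this.failures = 0
    this.state = 'closed' // closed | open | half-open
    this.lastFailureAt = null
    this.halfOpenCount = 0
  }
  async call(...args) {
    if (this.state === 'open') {
      if (Date.now() - this.lastFailureAt > this.timeout) {
        this.state = 'half-open'
        this.halfOpenCount = 0
      } else {
        throw new Error('Circuit open, service unavailable')
      }
    }
    if (this.state === 'half-open') {
      // Admit at most halfOpenMax trial calls; fail fast on the rest.
      if (this.halfOpenCount >= this.halfOpenMax) {
        throw new Error('Circuit half-open, trial call limit reached')
      }
      this.halfOpenCount++
    }
    try {
      const result = await this.fn(...args)
      // A success closes a half-open circuit and clears the failure count,
      // so the threshold counts consecutive failures rather than lifetime failures.
      if (this.state !== 'open') this.reset()
      return result
    } catch (err) {
      this.failures++
      this.lastFailureAt = Date.now()
      // Open on reaching the threshold, or on any failed trial call.
      if (this.failures >= this.threshold || this.state === 'half-open') {
        this.state = 'open'
      }
      throw err
    }
  }
  reset() { this.failures = 0; this.state = 'closed' }
}
// Run inside an async function or an ES module (top-level await).
const breaker = new CircuitBreaker(paymentService.charge.bind(paymentService))
const result = await breaker.call(order.total)
Use Cases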
- Preventing a slow or failing downstream service from exhausting connection pools and thread pools
- Failing fast: return a cached or degraded response immediately rather than waiting for a timeout
- Giving a recovering service breathing room: the Open state prevents traffic from overwhelming it during recovery
- Observability: circuit state changes are key operational signals for incident response
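The "failing fast" use case can be sketched as a wrapper that serves the last good response whenever the breaker rejects a call. This is a minimal sketch under stated assumptions: `withCachedFallback` is a hypothetical helper, `fakeBreaker` is a stand-in for any object with an async `call` method, and a single cache slot is shared across all arguments for brevity.

```javascript
// Hypothetical helper: wraps any object exposing an async call(...args) method.
function withCachedFallback(breaker, defaultValue) {
  let lastGood = { has: false, value: undefined }
  return async function guarded(...args) {
    try {
      const value = await breaker.call(...args)
      lastGood = { has: true, value } // remember the latest successful response
      return { value, degraded: false }
    } catch (err) {
      // Circuit open or call failed: serve the last good value, else the default.
      return { value: lastGood.has ? lastGood.value : defaultValue, degraded: true }
    }
  }
}

// Stand-in breaker (assumption for the demo): succeeds once, then rejects.
let calls = 0
const fakeBreaker = {
  async call() {
    if (++calls > 1) throw new Error('Circuit open, service unavailable')
    return { rate: 1.08 }
  },
}

const getRate = withCachedFallback(fakeBreaker, { rate: 1.0 })

async function demo() {
  const first = await getRate()  // live response, degraded: false
  const second = await getRate() // cached response, degraded: true
  return [first, second]
}
```

Returning a `degraded` flag alongside the value lets callers decide whether a stale answer is acceptable or whether to surface an explicit unavailability message instead.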
When Not to Use
- Synchronous calls that cannot tolerate the failure state: a circuit breaker requires a fallback or degraded mode to be useful
- Service failures that are expected to be very brief (milliseconds): circuit breaking introduces unnecessary open states
Technical Notes
- The fallback is as important as the breaker: what does the system do when the circuit is open? Cache the last known response, return a default, or surface an explicit unavailability message
- Thresholds must be tuned per service. Too sensitive and transient errors open the circuit unnecessarily; too lenient and cascading failures still occur
- Bulkhead pattern complements circuit breaker: isolate connection pools per downstream service so one slow service cannot exhaust the shared pool
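The bulkhead idea from the last note can be sketched as a per-service cap on in-flight calls. This is an illustrative sketch, not a production implementation: the `Bulkhead` class, the service names, and the limits are all hypothetical, and excess calls are rejected immediately rather than queued.

```javascript
// Hypothetical Bulkhead: caps the number of in-flight calls to one downstream service.
class Bulkhead {
  constructor(maxConcurrent) {
    this.maxConcurrent = maxConcurrent
    this.inFlight = 0
  }
  async run(fn) {
    if (this.inFlight >= this.maxConcurrent) {
      throw new Error('Bulkhead full, call rejected')
    }
    this.inFlight++
    try {
      return await fn()
    } finally {
      this.inFlight-- // always release the slot, on success or failure
    }
  }
}

// One bulkhead per downstream service (names and limits are assumptions),
// so a slow service can only exhaust its own slots.
const bulkheads = {
  payments: new Bulkhead(2),
  inventory: new Bulkhead(10),
}
```

The two patterns compose: a call such as `bulkheads.payments.run(() => breaker.call(order.total))` limits concurrency first and applies the breaker's open/closed logic inside the slot.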