
Circuit Breaker Pattern

Stopping cascading failures by short-circuiting calls to a failing downstream service.

Overview

The Circuit Breaker pattern prevents cascading failures by stopping calls to a failing service after a threshold of failures is reached. It operates in three states: Closed (normal, calls pass through), Open (failure threshold exceeded, calls immediately return an error without attempting the network call), and Half-Open (a trial call is attempted to see if the service has recovered).
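The three states and the moves between them can be written down as a small transition table. This is only a sketch: the event names (thresholdReached, cooldownElapsed, trialSucceeded, trialFailed) are hypothetical labels chosen for illustration, not part of the pattern's definition.

```javascript
// Allowed transitions per state. Event names are illustrative assumptions.
const TRANSITIONS = {
  'closed':    { thresholdReached: 'open' },
  'open':      { cooldownElapsed: 'half-open' },
  'half-open': { trialSucceeded: 'closed', trialFailed: 'open' },
}

// Returns the next state, or the current state if the event does not apply.
function nextState(state, event) {
  return TRANSITIONS[state]?.[event] ?? state
}
```

Note that there is no direct path from Open back to Closed: recovery always goes through a Half-Open trial.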

Origin

Michael Nygard described the pattern in "Release It!" (2007). Netflix's Hystrix library, open-sourced in 2012, popularised it in microservice architectures. Netflix moved Hystrix into maintenance mode in 2018 and pointed users to Resilience4j, which has largely replaced it. Envoy and service meshes implement circuit breaking at the infrastructure level.

Examples

Circuit breaker implementation

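class CircuitBreaker {
  constructor(fn, { threshold = 5, timeout = 60_000, halfOpenCalls = 1 } = {}) {
    this.fn            = fn
    this.threshold     = threshold      // consecutive failures before opening
    this.timeout       = timeout        // ms to stay open before a trial call
    this.halfOpenMax   = halfOpenCalls  // trial calls admitted while half-open
    this.failures      = 0
    this.state         = 'closed'       // closed | open | half-open
    this.lastFailureAt = null
    this.halfOpenCount = 0
  }

  async call(...args) {
    if (this.state === 'open') {
      if (Date.now() - this.lastFailureAt <= this.timeout) {
        throw new Error('Circuit open, service unavailable')
      }
      // Cool-down elapsed: admit a limited number of trial calls
      this.state = 'half-open'
      this.halfOpenCount = 0
    }

    if (this.state === 'half-open') {
      // Reject calls beyond the trial quota until a trial call settles
      if (this.halfOpenCount >= this.halfOpenMax) {
        throw new Error('Circuit half-open, trial call in progress')
      }
      this.halfOpenCount++
    }

    try {
      const result = await this.fn(...args)
      // A success clears the failure count, so the threshold counts
      // consecutive failures; a successful trial call closes the circuit.
      // A late success that arrives while the circuit is open is ignored.
      if (this.state !== 'open') this.reset()
      return result
    } catch (err) {
      this.failures++
      this.lastFailureAt = Date.now()
      // A failed trial call reopens the circuit immediately
      if (this.failures >= this.threshold || this.state === 'half-open') {
        this.state = 'open'
      }
      throw err
    }
  }

  reset() { this.failures = 0; this.halfOpenCount = 0; this.state = 'closed' }
}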

const breaker = new CircuitBreaker(paymentService.charge.bind(paymentService))
const result  = await breaker.call(order.total)

Use Cases

  • Preventing a slow or failing downstream service from exhausting connection pools and thread pools
  • Failing fast: return a cached or degraded response immediately rather than waiting for a timeout
  • Giving a recovering service breathing room: the Open state prevents traffic from overwhelming it during recovery
  • Observability: circuit state changes are key operational signals for incident response
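The "failing fast" use case only pays off if something is returned in place of the failed call. A minimal sketch of such a wrapper follows, assuming a single cached value that is not keyed by arguments; withFallback, defaultValue, and the degraded flag are hypothetical names introduced for illustration.

```javascript
// Wraps an async call so that any failure, including an immediate
// "circuit open" rejection, is answered with the last successful
// result, or with a default if nothing has succeeded yet.
// Assumption: one cached value for all arguments.
function withFallback(call, { defaultValue = null } = {}) {
  let lastGood = { has: false, value: undefined }

  return async (...args) => {
    try {
      const value = await call(...args)
      lastGood = { has: true, value }
      return { value, degraded: false }
    } catch (err) {
      const value = lastGood.has ? lastGood.value : defaultValue
      // The degraded flag lets callers and dashboards tell stale data apart
      return { value, degraded: true, error: err }
    }
  }
}
```

A breaker would typically be passed in as the wrapped call, for example withFallback((...args) => breaker.call(...args)). Returning the degraded flag rather than silently substituting data keeps the fallback visible to callers.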

When Not to Use

  • Synchronous calls that cannot tolerate the failure state: a circuit breaker requires a fallback or degraded mode to be useful
  • If the underlying service failure is expected to be very brief (milliseconds), circuit breaking introduces unnecessary open states

Technical Notes

  • The fallback is as important as the breaker: what does the system do when the circuit is open? Serve the last known cached response, return a default, or surface an explicit unavailability message
  • Thresholds must be tuned per service. Too sensitive and transient errors open the circuit unnecessarily; too lenient and cascading failures still occur
  • The Bulkhead pattern complements the circuit breaker: isolate connection pools per downstream service so one slow service cannot exhaust the shared pool
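The bulkhead idea from the last note can be sketched as a per-service cap on calls in flight. This is an assumption-laden illustration rather than a production design: the Bulkhead class, its limits, and the service names are hypothetical, and calls over the limit are rejected instead of queued.

```javascript
// A minimal bulkhead: at most `limit` calls in flight for one downstream
// service. Calls beyond the limit are rejected immediately, not queued.
class Bulkhead {
  constructor(limit) {
    this.limit = limit
    this.inFlight = 0
  }

  async run(fn, ...args) {
    if (this.inFlight >= this.limit) {
      throw new Error('Bulkhead full, call rejected')
    }
    this.inFlight++
    try {
      return await fn(...args)
    } finally {
      this.inFlight--   // always release the slot, on success or failure
    }
  }
}

// One bulkhead per downstream service (names and limits are illustrative)
const bulkheads = {
  payments:  new Bulkhead(10),
  inventory: new Bulkhead(20),
}
```

Because each service has its own counter, a stalled payments service can hold at most its own 10 slots while inventory calls continue unaffected.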