Node Circuit Breaker
by Ben Nadel (on Google+)
This is a Node.js implementation of the Circuit Breaker pattern as popularized in Michael T. Nygard's book Release It! The Circuit Breaker is intended to proxy the consumption of upstream resources such that failures in the upstream resource propagate to the current system in a predictable manner. To be clear, the Circuit Breaker doesn't prevent failures; rather, it helps your application manage failures proactively, failing fast and / or providing fallback values when applicable.
The Circuit Breaker proxies the consumption of upstream resources; but, it does not have intimate knowledge of the upstream resource. As such, the scope of the Circuit Breaker can be as coarse or as granular as you think is appropriate. For example, you can have one Circuit Breaker that represents an entire upstream resource. Or, you can create an individual Circuit Breaker for each method in an upstream resource. The more granular your Circuit Breakers, the less likely you are to get false positives.
Default Usage
Each Circuit Breaker is a composition of several objects that work together to provide the tracking and the fail-fast functionality. Fortunately, you don't have to know about this unless you are building custom implementations. All you have to do is ask the Circuit Breaker Factory for an instance with the given settings.
The easiest way to create a Circuit Breaker is to create one with no settings at all. Doing so will create a Circuit Breaker with "good" defaults:
var CircuitBreakerFactory = require( "@bennadel/circuit-breaker" ).CircuitBreakerFactory;
var circuitBreaker = CircuitBreakerFactory.create();
// Invoke as closure.
circuitBreaker.execute(
function() {
return( upstreamResource.load() );
}
);
// Invoke as closure with context and arguments.
circuitBreaker.executeInContext(
upstreamResource,
function( param1, param2 ) {
return( this.load( param1, param2 ) );
},
[ "arg1", "arg2" ]
);
// Invoke as method on an object.
circuitBreaker.executeMethod( upstreamResource, "load", [ "arg1", "arg2" ] );
As you can see, there are three ways to run commands through a Circuit Breaker:
execute( command [, fallback ] )
executeInContext( context, command [, args [, fallback ] ] )
executeMethod( context, methodName [, args [, fallback ] ] )
Each execute* method returns a Promise that resolves if the execution was successful, or rejects if the execution threw an error (or was bypassed based on the state of the Circuit Breaker). The underlying method / function that is being invoked should return a Promise or a synchronous value. Or, it can omit a return value if none is needed.
Configuration Usage
The .create() method of the Circuit Breaker Factory works without any arguments; but, you can provide a hash of settings that will be used to generate the Circuit Breaker. Every one of the following settings is optional:
id
- The unique identifier of the underlying state instance, which is used for
requestTimeout
- The time (in milliseconds) that a pending request is allowed to hang
volumeThreshold
- The number of requests that have to be completed (within the
failureThreshold
- The percent (in whole numbers) of failures that can occur in the
activeThreshold
- The number of concurrent requests that can hang (ie, not complete)
isFailure
- The function that determines if the given failure is an error; or, if
fallback
- The global fallback to be used for all executions in the Circuit Breaker
monitor
- The monitor -- Function or instance -- for external logging (ex, StatsD logging).
bucketCount
- The number of buckets to be used to collect rolling stats in the
bucketDuration
- The duration (in milliseconds) of each bucket within the rolling
NOTE: The duration of the rolling metrics window will be bucketCount * bucketDuration. This is also the amount of time that the Circuit Breaker will remain opened after failing before allowing a "health check" request to execute.
var CircuitBreakerFactory = require( "@bennadel/circuit-breaker" ).CircuitBreakerFactory;
var circuitBreaker = CircuitBreakerFactory.create({
id: "Remote API",
requestTimeout: 5000,
volumeThreshold: 10,
failureThreshold: 10, // Percent (as in 1 failure in 10 responses trips the circuit).
activeThreshold: 50,
isFailure: function( error ) {
return( ! is404( error ) );
},
fallback: { /* Fallback value. */ },
monitor: function( eventType, eventData ) {
console.log( eventType, eventData );
},
bucketCount: 30,
bucketDuration: 1000
});
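To make the NOTE above concrete, here is a small sketch of the arithmetic behind these settings. The helper names (getWindowDuration, isOverFailureThreshold) are hypothetical and are not part of the package. The comparison logic is an assumption inferred from the inline comment that 1 failure in 10 responses trips the circuit; the package itself may evaluate the thresholds differently.

```javascript
// Hypothetical helpers for illustration; NOT part of the package API.
var settings = {
	volumeThreshold: 10,
	failureThreshold: 10, // Percent.
	bucketCount: 30,
	bucketDuration: 1000 // Milliseconds.
};

// The rolling metrics window spans all buckets: bucketCount * bucketDuration.
function getWindowDuration( settings ) {
	return( settings.bucketCount * settings.bucketDuration );
}

// ASSUMPTION: the circuit trips once enough requests have completed in the
// window and the failure percentage has reached the failure threshold.
function isOverFailureThreshold( completedCount, failureCount, settings ) {
	if ( completedCount < settings.volumeThreshold ) {
		return( false );
	}
	var failurePercent = ( failureCount * 100 / completedCount );
	return( failurePercent >= settings.failureThreshold );
}

console.log( getWindowDuration( settings ) ); // 30000
console.log( isOverFailureThreshold( 10, 1, settings ) ); // true
console.log( isOverFailureThreshold( 5, 5, settings ) ); // false (too few requests)
```

With the settings shown, the window (and the time the Circuit Breaker stays opened before a health check) works out to 30 seconds.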
Fallback Values
The primary goal of the Circuit Breaker is to "fail fast" if the upstream resource appears to be unhealthy. However, the secondary goal of the Circuit Breaker is to provide a better user experience. That means that if a meaningful fallback value can be provided in the case of error, the Circuit Breaker will facilitate this approach.
The fallback value can be a Function, a Promise, or any static value. If it's a Function, it should return either a Promise or a static value. Fallback values can be defined when the Circuit Breaker is created:
var CircuitBreakerFactory = require( "@bennadel/circuit-breaker" ).CircuitBreakerFactory;
var circuitBreaker = CircuitBreakerFactory.create({
id: "Remote API",
fallback: { /* Fallback value. */ }
});
But, they can also be provided at the time of execution (regardless of whether or not a global fallback value was provided):
var CircuitBreakerFactory = require( "@bennadel/circuit-breaker" ).CircuitBreakerFactory;
var circuitBreaker = CircuitBreakerFactory.create({
id: "Remote API",
fallback: { /* Fallback value. */ }
});
circuitBreaker
.execute(
function() {
throw( new Error( "Network Error" ) );
},
{ /* Local fallback value. */ }
)
.then(
function( result ) {
console.log( result ); // Will be LOCAL fallback value.
}
)
;
If the fallback value is a Function and the execution was provided with a context and arguments, the same context and arguments will be used to invoke the Fallback.
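As a rough illustration of that last point, the sketch below shows how a Function fallback could be invoked with the same context and arguments as the original execution, while a Promise or static value is passed through untouched. The evaluateFallback helper is hypothetical; it is not the package's internal code, only a minimal model of the behavior described above.

```javascript
// Hypothetical helper for illustration; the package's internals may differ.
function evaluateFallback( fallback, context, args ) {
	// A Function fallback receives the same context and arguments as the
	// original execution.
	if ( typeof fallback === "function" ) {
		return( fallback.apply( context, args || [] ) );
	}
	// A Promise or a static value is passed through as-is.
	return( fallback );
}

var upstreamResource = {
	name: "Remote API"
};

var result = evaluateFallback(
	function( param1, param2 ) {
		return( this.name + " fallback for " + param1 + ", " + param2 );
	},
	upstreamResource,
	[ "arg1", "arg2" ]
);

console.log( result ); // "Remote API fallback for arg1, arg2"
console.log( evaluateFallback( { cached: true } ).cached ); // true
```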
Circuit Breakers Are Scary -- What If I Get It Wrong?
To be honest, it can be scary - the idea of putting something into production that will purposefully block calls to proxied systems. If you pick an error threshold that's too low, you may start blocking requests too quickly. If you pick an active threshold that's too high, you may clobber the upstream resource.
Luckily, you don't have to dive right into the deep end. Instead, you can deploy a passive Circuit Breaker that will log all of the traffic but will never open, no matter how unhealthy the upstream resource becomes. This way, you can spend some time passively gathering metrics about your API usage (including counts, durations, and errors) before switching over to an active Circuit Breaker with tailored settings.
Since this is a passive Circuit Breaker (that never opens), there are fewer settings:
id
- The unique identifier of the underlying state instance, which is used for
isFailure
- The function that determines if the given failure is an error; or, if
fallback
- The global fallback to be used for all executions in the Circuit Breaker
monitor
- The monitor -- Function or instance -- for external logging (ex, StatsD
var CircuitBreakerFactory = require( "@bennadel/circuit-breaker" ).CircuitBreakerFactory;
var circuitBreaker = CircuitBreakerFactory.createPassive({
id: "Remote API",
monitor: function logEvent( eventType, eventData ) {
// Log statsD metrics about count and duration.
// Log errors.
}
});
// This error will result in a rejected promise; but, the Circuit Breaker will always
// remain closed, allowing requests to be executed.
circuitBreaker.execute(
function() {
throw( new Error( "Network Error" ) );
}
);
Once you've had a chance to monitor your Circuit Breakers, you can start replacing your .createPassive() factory calls with .create() factory calls, using settings that you know correspond to the collected baseline of metrics. And, you can sleep well at night.
Logging And Monitoring
By default, the Circuit Breaker quietly discards all internal events. However, you will probably want to log Errors and record StatsD metrics in your application. To do this, you can provide a logging Function as the monitor argument:
var CircuitBreakerFactory = require( "@bennadel/circuit-breaker" ).CircuitBreakerFactory;
var circuitBreaker = CircuitBreakerFactory.create({
id: "Remote API",
monitor: function logEvent( eventType, eventData ) {
console.log( eventType, eventData );
}
});
This logging Function will be called with the following eventType values:
closed - passing eventData properties { stateSnapshot }
execute - passing eventData properties { stateSnapshot }
emit - passing eventData properties { stateSnapshot }
failure - passing eventData properties { stateSnapshot, duration, error }
fallbackEmit - passing eventData properties { stateSnapshot }
fallbackFailure - passing eventData properties { stateSnapshot, error }
fallbackMissing - passing eventData properties { stateSnapshot }
fallbackSuccess - passing eventData properties { stateSnapshot }
opened - passing eventData properties { stateSnapshot }
shortCircuited - passing eventData properties { stateSnapshot, error }
success - passing eventData properties { stateSnapshot, duration }
timeout - passing eventData properties { stateSnapshot, duration, error }
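As one possible way to consume these events, here is a sketch of a logging Function that tallies counts per eventType and sums the durations that accompany success, failure, and timeout events. The createTallyMonitor factory and its tally property are hypothetical names introduced for illustration; the events below are simulated by hand rather than emitted by a real Circuit Breaker.

```javascript
// Hypothetical monitor factory for illustration; NOT part of the package.
function createTallyMonitor() {
	var tally = {
		counts: {},
		totalDuration: 0
	};

	function logEvent( eventType, eventData ) {
		tally.counts[ eventType ] = ( ( tally.counts[ eventType ] || 0 ) + 1 );

		// Only success, failure, and timeout events carry a duration.
		if ( typeof eventData.duration === "number" ) {
			tally.totalDuration += eventData.duration;
		}
	}

	// Expose the tally so that it can be inspected or flushed elsewhere.
	logEvent.tally = tally;

	return( logEvent );
}

var monitor = createTallyMonitor();

// Simulate the kind of events that a Circuit Breaker would pass in.
monitor( "execute", { stateSnapshot: {} } );
monitor( "success", { stateSnapshot: {}, duration: 120 } );
monitor( "timeout", { stateSnapshot: {}, duration: 5000, error: new Error( "Timeout" ) } );

console.log( monitor.tally.counts ); // { execute: 1, success: 1, timeout: 1 }
console.log( monitor.tally.totalDuration ); // 5120
```

A Function like this could then be passed as the monitor setting when creating the Circuit Breaker.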
Under the hood, this is actually using your logEvent() Function to complete a concrete implementation of the AbstractLoggingMonitor. If you don't provide a Function, you can provide an instance of a Class that extends either the Monitor class or the AbstractLoggingMonitor class. If you extend the AbstractLoggingMonitor base class, you only have to override the logEvent() method:
var AbstractLoggingMonitor = require( "@bennadel/circuit-breaker" ).AbstractLoggingMonitor;
var CircuitBreakerFactory = require( "@bennadel/circuit-breaker" ).CircuitBreakerFactory;
class MyMonitor extends AbstractLoggingMonitor {
constructor( statsD ) {
super();
this._statsD = statsD;
}
logEvent( eventType, eventData ) {
this._statsD.increment( `circuit-breaker.${ eventType }` );
}
}
// ....
var circuitBreaker = CircuitBreakerFactory.create({
id: "Remote API",
monitor: new MyMonitor( stats )
});
However, if you extend the Monitor class, you can override any of the log* methods:
var CircuitBreakerFactory = require( "@bennadel/circuit-breaker" ).CircuitBreakerFactory;
var Monitor = require( "@bennadel/circuit-breaker" ).Monitor;
class MyMonitor extends Monitor {
logClosed( stateSnapshot ) {
/* ... */
}
logOpened( stateSnapshot ) {
/* ... */
}
}
// ....
var circuitBreaker = CircuitBreakerFactory.create({
id: "Remote API",
monitor: new MyMonitor()
});
The Monitor class provides the following default, no-op (No Operation) methods, which means you only have to override the ones that are meaningful to your application:
logClosed( stateSnapshot )
-- I log the point at which the Circuit Breaker state
logExecute( stateSnapshot )
-- I log the point at which the execution is accepted by
logEmit( stateSnapshot )
-- I log the point at which the request has entered the
logFailure( stateSnapshot, duration, error )
-- I log the point at which the
logFallbackEmit( stateSnapshot )
-- I log the point at which a non-successful
logFallbackFailure( stateSnapshot, error )
-- I log the point at which an existing
logFallbackMissing( stateSnapshot )
-- I log the point at which a failed execution
logFallbackSuccess( stateSnapshot )
-- I log the point at which a fallback value has
logOpened( stateSnapshot )
-- I log the point at which the Circuit Breaker state
logShortCircuited( stateSnapshot, error )
-- I log the point at which an execution
logSuccess( stateSnapshot, duration )
-- I log the point at which an execution has
logTimeout( stateSnapshot, duration, error )
-- I log the point at which a long-
The stateSnapshot object passed to the Monitor methods (and to the AbstractLoggingMonitor logEvent() method) contains identification and metric information about the State being used to power the Circuit Breaker. Since one Circuit Breaker can share state with another Circuit Breaker, there's not too much sense in identifying the Circuit Breakers themselves; as such, the State becomes the meaningful information for logging and monitoring. Each stateSnapshot provided by the default implementation uses the following structure:
{
"id": "Circuit Breaker for API",
"closed": true,
"settings": {
"requestTimeout": 0,
"volumeThreshold": 0,
"failureThreshold": 0,
"activeThreshold": 0
},
"metrics": {
"emit": 0,
"execute": 0,
"success": 0,
"failure": 0,
"timeout": 0
},
"totalMetrics": {
"emit": 0,
"execute": 0,
"success": 0,
"failure": 0,
"timeout": 0
},
"current": {
"activeRequestCount": 0
}
}
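As a sketch of how a monitor might turn this structure into a single log line, consider the helper below. The describeSnapshot name is hypothetical. It also assumes that the metrics block holds the counts to summarize and that timeouts should be counted alongside failures; neither assumption is confirmed by the structure itself.

```javascript
// Hypothetical helper for illustration; NOT part of the package.
// ASSUMPTION: timeouts are counted together with failures.
function describeSnapshot( snapshot ) {
	var metrics = snapshot.metrics;
	var completed = ( metrics.success + metrics.failure + metrics.timeout );
	var failed = ( metrics.failure + metrics.timeout );
	// Guard against dividing by zero when nothing has completed yet.
	var percent = completed ? Math.round( failed * 100 / completed ) : 0;

	return(
		snapshot.id +
		" [" + ( snapshot.closed ? "closed" : "opened" ) + "] " +
		failed + "/" + completed + " failed (" + percent + "%), " +
		snapshot.current.activeRequestCount + " active"
	);
}

// Sample data in the shape shown above (values are made up).
var sampleSnapshot = {
	id: "Remote API",
	closed: true,
	metrics: { emit: 23, execute: 23, success: 18, failure: 1, timeout: 1 },
	current: { activeRequestCount: 3 }
};

console.log( describeSnapshot( sampleSnapshot ) );
// "Remote API [closed] 2/20 failed (10%), 3 active"
```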
Building Your Own State Implementation
The Circuit Breaker is designed to be a composite of several different classes all working together to accomplish one goal. The reason for this composition was to allow custom implementations to be designed if desired. Ideally, if you want a custom implementation, the only class you should have to provide is the State class. The CircuitBreaker class manages the control-flow; but, it uses the State implementation to power that control-flow. If you want to provide your own State implementation, you have to provide a class that exposes the following methods:
canPerformHealthCheck()
getSnapshot()
isOpened()
isClosed()
getTimeout()
trackExecute()
trackEmit()
trackFailure( duration, error )
trackFallbackEmit()
trackFallbackFailure( error )
trackFallbackMissing()
trackFallbackSuccess()
trackShortCircuited( error )
trackSuccess( duration )
trackTimeout( duration, error )
What you do inside these methods is completely up to you. But, they have to exist since the CircuitBreaker is going to call them. The general control-flow for the CircuitBreaker follows this plan:
- Top-level execute*() method is called.
- Call trackEmit().
- Check to see if isOpened().
- If opened:
  - Check to see if canPerformHealthCheck().
  - If can perform health check, proceed to execution.
- If can execute command:
  - Call trackExecute().
  - Setup timeout timer using getTimeout().
  - Invoke underlying command.
- On resolution:
  - Call trackSuccess().
- On rejection:
  - Check type of error:
    - If OpenError call trackShortCircuited().
    - If TimeoutError call trackTimeout().
    - Otherwise call trackFailure().
  - Call trackFallbackEmit().
  - Check to see if a fallback was provided (locally or globally).
  - If fallback was provided:
    - Execute fallback.
    - On resolution:
      - Call trackFallbackSuccess().
    - On rejection:
      - Call trackFallbackFailure().
  - If no fallback was provided:
    - Call trackFallbackMissing().
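The control-flow above can be approximated with the following sketch. This is not the package's source code: the run, runFallback, and createRecordingState functions are hypothetical, the error objects are plain Error instances rather than the package's OpenError / TimeoutError classes, and rejecting with the original error when no fallback exists is an assumption. The recording State only exists to show the order in which the hooks fire.

```javascript
// Simplified, hypothetical sketch of the control-flow; NOT the package's source.
function run( state, command, fallback ) {
	state.trackEmit();
	// Short-circuit when opened, unless a health check is allowed through.
	if ( state.isOpened() && ! state.canPerformHealthCheck() ) {
		var openError = new Error( "Circuit Breaker is open." );
		state.trackShortCircuited( openError );
		return( runFallback( state, fallback, openError ) );
	}
	state.trackExecute();
	var startedAt = Date.now();
	var timedOut = false;
	var timer = null;
	var timeout = new Promise( function( resolve, reject ) {
		timer = setTimeout(
			function() {
				timedOut = true;
				reject( new Error( "Request timed out." ) );
			},
			state.getTimeout()
		);
	});
	// The executor catches synchronous throws and adopts returned Promises.
	var execution = new Promise( function( resolve ) {
		resolve( command() );
	});
	return( Promise.race( [ execution, timeout ] ).then(
		function( result ) {
			clearTimeout( timer );
			state.trackSuccess( Date.now() - startedAt );
			return( result );
		},
		function( error ) {
			clearTimeout( timer );
			var duration = ( Date.now() - startedAt );
			if ( timedOut ) {
				state.trackTimeout( duration, error );
			} else {
				state.trackFailure( duration, error );
			}
			return( runFallback( state, fallback, error ) );
		}
	) );
}

function runFallback( state, fallback, error ) {
	state.trackFallbackEmit();
	if ( fallback === undefined ) {
		state.trackFallbackMissing();
		return( Promise.reject( error ) ); // ASSUMPTION: original error propagates.
	}
	return( new Promise( function( resolve ) {
		resolve( ( typeof fallback === "function" ) ? fallback() : fallback );
	}).then(
		function( value ) {
			state.trackFallbackSuccess();
			return( value );
		},
		function( fallbackError ) {
			state.trackFallbackFailure( fallbackError );
			throw( fallbackError );
		}
	) );
}

// A State stand-in that never opens and records the name of each track* call.
function createRecordingState() {
	var state = {
		calls: [],
		isOpened: function() { return( false ); },
		canPerformHealthCheck: function() { return( false ); },
		getTimeout: function() { return( 1000 ); }
	};
	[
		"trackEmit", "trackExecute", "trackSuccess", "trackFailure", "trackTimeout",
		"trackShortCircuited", "trackFallbackEmit", "trackFallbackSuccess",
		"trackFallbackFailure", "trackFallbackMissing"
	].forEach( function( name ) {
		state[ name ] = function() { state.calls.push( name ); };
	});
	return( state );
}

var recordingState = createRecordingState();
run( recordingState, function() { throw( new Error( "Network Error" ) ); }, "fallback value" )
	.then( function( result ) {
		console.log( result ); // "fallback value"
		console.log( recordingState.calls.join( ", " ) );
		// trackEmit, trackExecute, trackFailure, trackFallbackEmit, trackFallbackSuccess
	});
```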
Once you have a custom State implementation, you can construct a CircuitBreaker:
var CircuitBreaker = require( "@bennadel/circuit-breaker" ).CircuitBreaker;
var state = new CustomStateImplementation();
// The second argument (globalFallback) is optional.
var circuitBreaker = new CircuitBreaker( state, globalFallback );
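For a sense of what a custom State might look like, here is a minimal sketch that exposes every method listed earlier, counts events in memory, and never opens. The CountingState class, its constructor arguments, and the reduced snapshot shape are all hypothetical; a production State would also need real open / close logic.

```javascript
// Hypothetical State implementation for illustration; NOT part of the package.
// It counts events in memory and always reports itself as closed.
class CountingState {

	constructor( id, requestTimeout ) {
		this._id = id;
		this._requestTimeout = requestTimeout;
		this._counts = { emit: 0, execute: 0, success: 0, failure: 0, timeout: 0 };
	}

	canPerformHealthCheck() { return( false ); }
	isOpened() { return( false ); }
	isClosed() { return( true ); }
	getTimeout() { return( this._requestTimeout ); }

	// ASSUMPTION: a reduced version of the default snapshot structure.
	getSnapshot() {
		return({
			id: this._id,
			closed: true,
			metrics: Object.assign( {}, this._counts )
		});
	}

	trackEmit() { this._counts.emit++; }
	trackExecute() { this._counts.execute++; }
	trackSuccess( duration ) { this._counts.success++; }
	trackFailure( duration, error ) { this._counts.failure++; }
	trackTimeout( duration, error ) { this._counts.timeout++; }

	// These hooks must exist for the CircuitBreaker, but can be no-ops.
	trackFallbackEmit() {}
	trackFallbackFailure( error ) {}
	trackFallbackMissing() {}
	trackFallbackSuccess() {}
	trackShortCircuited( error ) {}

}

var countingState = new CountingState( "Remote API", 5000 );

// Simulate the calls that a CircuitBreaker would make for one successful request.
countingState.trackEmit();
countingState.trackExecute();
countingState.trackSuccess( 42 );

console.log( countingState.getSnapshot().metrics );
// { emit: 1, execute: 1, success: 1, failure: 0, timeout: 0 }
```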
Guarantees Around Synchronous State Tracking
Since the Circuit Breaker generates and returns Promises around the execution of black-boxed commands, many of the methods on the State instance will be invoked asynchronously. However, the following series of methods are guaranteed to be invoked synchronously within the same tick of the Node.js event loop:
trackEmit()
isOpened()
canPerformHealthCheck() - called only if isOpened() returns true.
trackExecute()
Since Node.js executes your JavaScript on a single thread, you can assume that these four methods will be called without any race conditions.
All Metrics Should Be Stored In-Memory
If you are building your own State implementation, you may be tempted to share metrics across different Node.js processes (or machines). For example, you may be tempted to store metrics in a shared Redis instance that can be consumed by every instance of a Circuit Breaker that proxies a single resource. DO NOT DO THIS. Not only does the Circuit Breaker expect State methods to run synchronously; but, trying to share state offers no real value-add. Since the failure-tracking is based on percentages, sharing state won't make the percentages more accurate. In fact, sharing state across processes could lead to false-positives if a particular process or machine is having issues (such as configuration issues that don't affect other processes or machines).
Package Exports
This Circuit Breaker package exports the following public members:
OpenError
StateError
TimeoutError
Metrics
AbstractLoggingMonitor
Monitor
State
CircuitBreaker
CircuitBreakerFactory