Bake (Code) Resiliency into Your Application

Kuldeep Gulati
5 min read · Jan 10, 2020

This article is primarily focused on applications developed in Java (preferably version 11, otherwise at least version 8) with the Spring Boot (2.2.x) framework, with the assumption that readers are familiar with microservices resiliency patterns.

The foundation of application resiliency is often found in the application code, where fault tolerance is baked in during design and coding.

Let's start building resiliency from the ground up by baking the basic but crucial patterns into the application.

Resilience4j provides out-of-the-box integration with Spring Boot (which is the de facto standard for implementing services these days).

The patterns discussed here are as follows:

a. Rate Limiter

b. Retry

c. Bulkhead

d. Circuit Breaker

e. Timeouts

Let's get into the implementation details of these patterns.

Rate Limiter Design Pattern

The Rate Limiter design pattern is essentially an implementation of the 'Throttling' pattern, where access to a service/API is restricted to a certain number of requests during a bounded period of time.

In simple terms, it's about maintaining the transactions-per-second (TPS) rate of an application. If the calculated TPS of an API is 1,000 transactions/second and it is bombarded with 2,000 transactions/second, back pressure is created by rejecting the excess requests, which preserves the application's performance.

How to implement the Rate Limiter?

This is ideally suited to the 'front' of the service, meaning the REST controller layer of the service offered. To begin with, annotate the method with the RateLimiter annotation and declare the fallback method right in the same class:

@RateLimiter(name = "ratelimiter_instance_as_configured")

@RateLimiter(name = "ratelimiter_instance", fallbackMethod = "fallback_method")

private <return_type_as_of_API_method> fallback_method(Exception ex) {
    // populate the custom exception | HTTP status code | error message
    // note: the fallback must take the same arguments as the protected method, plus the thrown exception
}
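
Putting it together, here is a minimal sketch of a controller method guarded by the rate limiter, assuming the resilience4j-spring-boot2 starter is on the classpath; the controller class, the endpoint and the fallback body are illustrative:

import io.github.resilience4j.ratelimiter.RequestNotPermitted;
import io.github.resilience4j.ratelimiter.annotation.RateLimiter;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class PortfolioController {

    @GetMapping("/portfolio/summary")
    @RateLimiter(name = "ratelimiter_instance", fallbackMethod = "rateLimitFallback")
    public ResponseEntity<String> getSummary() {
        return ResponseEntity.ok(loadSummary()); // normal path
    }

    // Fallback: same signature plus the thrown exception; invoked when the limit is exceeded
    private ResponseEntity<String> rateLimitFallback(RequestNotPermitted ex) {
        return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS)
                .body("Too many requests, please retry later");
    }

    private String loadSummary() {
        return "portfolio summary"; // placeholder for the real service call
    }
}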

Note: technically this pattern can also be implemented at the plug points (which you will see in the default implementations), but you would run into ambiguity with Bulkheads if they are implemented at the same service layer as the RateLimiter, since the RateLimiter takes precedence over the Bulkhead.

Note: check out the YAML file for the configuration provided for the RateLimiter and its instances.
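
The note above refers to the YAML-driven configuration; if you prefer (or need) to build an instance programmatically instead, a minimal sketch with illustrative limits looks like this:

import java.time.Duration;
import io.github.resilience4j.ratelimiter.RateLimiter;
import io.github.resilience4j.ratelimiter.RateLimiterConfig;
import io.github.resilience4j.ratelimiter.RateLimiterRegistry;

class RateLimiterSetup {
    RateLimiter buildRateLimiter() {
        RateLimiterConfig config = RateLimiterConfig.custom()
                .limitForPeriod(1000)                      // max calls allowed per refresh period
                .limitRefreshPeriod(Duration.ofSeconds(1)) // the refresh period, i.e. roughly 1000 TPS
                .timeoutDuration(Duration.ofMillis(25))    // how long a call waits for a permit before being rejected
                .build();
        return RateLimiterRegistry.of(config).rateLimiter("ratelimiter_instance");
    }
}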

Note: plug points are the service-layer points where the service connects with other APIs | services | the DAO layer.

Retry Design Pattern

Retries have been made easy by the Resilience4j library.

Where to Implement Retries?

At plug points, retries can be configured conditionally in case of failures | exceptions | timeouts.

How to implement Retries?

You have the option to annotate either the class or the method to enable retries, and both work. My advice is to implement retries at the method level, for better clarity in the code about which services/methods retries have been requested for; plus, per the requirement, the retries on a method can use a specific instance configuration (a minimal sketch follows).
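
As a sketch of method-level retries at a plug point, assuming the resilience4j-spring-boot2 starter; the service class, the downstream call and the 'retry_instance' name are illustrative:

import io.github.resilience4j.retry.annotation.Retry;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class PricingService {

    private final RestTemplate restTemplate = new RestTemplate();

    // Retried per the 'retry_instance' configuration (attempts, wait duration, exceptions)
    @Retry(name = "retry_instance", fallbackMethod = "priceFallback")
    public String fetchPrice(String symbol) {
        return restTemplate.getForObject("http://pricing/api/price/" + symbol, String.class);
    }

    // Invoked once the retries are exhausted
    private String priceFallback(String symbol, Exception ex) {
        return "N/A";
    }
}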

Words of Wisdom: retries should not be configured for every exception; instead, configure which exceptions a retry is attempted for vs. not.

For example:

retryExceptions: # the default exception predicate retries all exceptions
  - org.springframework.web.client.HttpServerErrorException
ignoreExceptions:
  - com.ers.portfolio.exceptions.PortfolioException

Exponential Retries: one can configure exponential retries through configuration. An exponential retry waits progressively longer between retries for consecutive failures, until it reaches the maximum retry limit (defined in the config file).

@Retry(name = "retry_instance_as_configured")

The drawback of implementing exponential retries is that the HTTP request could time out by the time the exponential retries reach their limit; for this reason, do the math so that the retries complete before the timeout.

The author's recommendation is to avoid configuring exponential retries until you have to.
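
If you do have to, here is a minimal sketch of exponential backoff built programmatically; the initial interval, multiplier and attempt count are illustrative, and the same settings can also be expressed in the YAML configuration:

import io.github.resilience4j.core.IntervalFunction;
import io.github.resilience4j.retry.Retry;
import io.github.resilience4j.retry.RetryConfig;
import io.github.resilience4j.retry.RetryRegistry;
import org.springframework.web.client.HttpServerErrorException;

class ExponentialRetrySetup {
    Retry buildRetry() {
        RetryConfig config = RetryConfig.custom()
                // keep the total wait (0.5s + 1s = 1.5s here) below the caller's HTTP timeout
                .maxAttempts(3)
                .intervalFunction(IntervalFunction.ofExponentialBackoff(500, 2.0))
                .retryExceptions(HttpServerErrorException.class)
                .build();
        return RetryRegistry.of(config).retry("retry_instance_as_configured");
    }
}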

Bulkhead Design Pattern

This is the most straightforward way you will see to implement the Bulkhead pattern. Let's get into the where and how.

Where to implement the Bulkhead?

This is something you would want to implement at the service layers / plug points, to avoid overloading the underlying resources and to insulate the overall application from problems in one trouble area of the application, i.e., a method.

How to implement the Bulkhead?

There are two Bulkhead implementations provided at the method level:

- SemaphoreBulkhead, which uses semaphores.

- FixedThreadPoolBulkhead, which uses a bounded queue and a fixed thread pool.

For most cases the semaphore implementation is sufficient; those who understand the concept of thread pools and have a valid use case can go for the thread-pool Bulkhead implementation (a sketch of both follows the annotations below).

@Bulkhead(name = "Bulkhead_instance_as_configured")

@Bulkhead(name = "Bulkhead_instance_as_configured", fallbackMethod = "fallback_method")
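
As a sketch of both variants on one service (the class, the methods and the reuse of the 'Bulkhead_instance_as_configured' name are illustrative); note that a thread-pool bulkhead method has to return a CompletableFuture:

import java.util.concurrent.CompletableFuture;
import io.github.resilience4j.bulkhead.annotation.Bulkhead;
import org.springframework.stereotype.Service;

@Service
public class ReportService {

    // Semaphore bulkhead (the default): caps the number of concurrent calls into this method
    @Bulkhead(name = "Bulkhead_instance_as_configured")
    public String buildSummary() {
        return "summary";
    }

    // Thread-pool bulkhead: runs the call on a bounded, dedicated thread pool
    @Bulkhead(name = "Bulkhead_instance_as_configured", type = Bulkhead.Type.THREADPOOL)
    public CompletableFuture<String> buildReportAsync() {
        return CompletableFuture.completedFuture("report");
    }
}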

Circuit Breaker Design Pattern

It's a double-edged-sword pattern; remember that, and we will come back to why in a moment.

Implementation using Resilience4j

Circuit breakers protect the system against transient failures of subcomponent services; they decouple the components that are suffering delays or timeouts to prevent cascading failures.

How to implement the Circuit Breaker?

Circuit breakers can ideally be implemented both at the class level and at the method level, depending on whether you need a global implementation (a class-level annotation applies to all of its methods) or a localized one (a specific method); the circuit breaker configuration can be defined at either level based on the need.

@CircuitBreaker(name = "<Declared_Instance>", fallbackMethod = "readFromCache")

Note: the fallback method gives you an opportunity to provide some default behavior in case of a temporary glitch. For example, you want to know the current value of a security to calculate the NAV; due to a temporary glitch the current value of the security is not available, but it can still be read from a cache (if there is one), and the best possible NAV can be calculated and returned for the request. The cache would be accessed from the defined fallback method, which returns the best available value of the security.
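
A minimal sketch of that idea; the service, the in-memory cache and the 'pricing_cb' instance name are all illustrative:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

@Service
public class SecurityPriceService {

    private final RestTemplate restTemplate = new RestTemplate();
    private final Map<String, Double> priceCache = new ConcurrentHashMap<>();

    @CircuitBreaker(name = "pricing_cb", fallbackMethod = "readFromCache")
    public Double currentValue(String securityId) {
        Double price = restTemplate.getForObject("http://pricing/api/price/" + securityId, Double.class);
        priceCache.put(securityId, price); // remember the last known good value
        return price;
    }

    // Fallback: serve the last cached value so the NAV can still be calculated
    private Double readFromCache(String securityId, Throwable t) {
        return priceCache.getOrDefault(securityId, 0.0);
    }
}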

Implementation using Spring Cloud Circuit Breaker

Remember, Spring Cloud Circuit Breaker supports a couple of underlying implementations, and Resilience4j is one of them.

One big plus that comes with both Spring and Resilience4j is that timeouts can be baked in right with the circuit breaker configuration.

Remember, I mentioned above that it's a double-edged sword: that is because timeouts can be baked into the configuration, and, if it is implemented at the controller layer, it also acts as a master switch. That last point is food for thought.

Let's get into the where and how.

I like implementing it both ways in an application:

'Spring Cloud Circuit Breaker' at the REST controller layer,

'Resilience4j Circuit Breaker' at the service layer / plug points.

Spring Cloud Circuit Breaker implementation:

Wrap the underlying service calls in the circuit breaker:

CustomCircuitBreakerInstance.run(…..)

Take a look at the class 'PortfolioSummaryController' in the sample code for reference.
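
A minimal sketch of that wrapping, assuming the spring-cloud-starter-circuitbreaker-resilience4j dependency; the controller, the downstream call and the 'summary_cb' instance name are illustrative:

import org.springframework.cloud.client.circuitbreaker.CircuitBreakerFactory;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
public class SummaryController {

    private final CircuitBreakerFactory circuitBreakerFactory;
    private final RestTemplate restTemplate = new RestTemplate();

    public SummaryController(CircuitBreakerFactory circuitBreakerFactory) {
        this.circuitBreakerFactory = circuitBreakerFactory;
    }

    @GetMapping("/summary")
    public String summary() {
        // run() takes the protected call plus a fallback for when the breaker is open or the call fails
        return circuitBreakerFactory.create("summary_cb")
                .run(() -> restTemplate.getForObject("http://portfolio/api/summary", String.class),
                     throwable -> "summary temporarily unavailable");
    }
}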

Timeout Design Pattern

Ideally, as mentioned above, timeouts should be baked in with the circuit breaker implementation and should act as your master switch in case the underlying services/resources are slow to respond.

If you are not implementing the circuit breaker, here is one way timeouts can be configured using plain Java.

How?

Wrap the service request with configured timeouts:

The implementation below times the call out after waiting for 1 second; the caller's future.get() then throws java.util.concurrent.ExecutionException:

CompletableFuture<Integer> future = CompletableFuture.supplyAsync(this::computeLogic).orTimeout(1, TimeUnit.SECONDS);

The implementation below provides a default value after the timeout:

CompletableFuture<Integer> future = CompletableFuture.supplyAsync(this::computeLogic).completeOnTimeout(10, 1, TimeUnit.SECONDS);
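
A short usage sketch of the first variant, showing how the timeout surfaces to the caller (computeLogic stands in for the slow call; note that orTimeout and completeOnTimeout need Java 9+, so this is for the Java 11 option mentioned at the top):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

class TimeoutExample {
    Integer computeLogic() {
        return 42; // placeholder for the slow downstream call
    }

    Integer callWithTimeout() throws InterruptedException {
        CompletableFuture<Integer> future =
                CompletableFuture.supplyAsync(this::computeLogic).orTimeout(1, TimeUnit.SECONDS);
        try {
            return future.get(); // ExecutionException wraps the TimeoutException if the deadline is hit
        } catch (ExecutionException e) {
            return -1; // map the timeout to whatever default/error your API contract needs
        }
    }
}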

Let's talk about 'Logging' and 'Alerts'

Resilience4j inherently provides support for Micrometer (https://micrometer.io/) to capture these incidents and make them available for alerting and logging via one of the telemetry solutions implemented in your organization.
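
With the Spring Boot starter much of this wiring is typically auto-configured, but as a minimal sketch of binding Resilience4j metrics to a Micrometer registry by hand (assuming the resilience4j-micrometer module; the registries would normally be injected):

import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
import io.github.resilience4j.micrometer.tagged.TaggedCircuitBreakerMetrics;
import io.github.resilience4j.micrometer.tagged.TaggedRetryMetrics;
import io.github.resilience4j.retry.RetryRegistry;
import io.micrometer.core.instrument.MeterRegistry;

class ResilienceMetricsBinder {
    void bind(MeterRegistry meterRegistry,
              CircuitBreakerRegistry circuitBreakerRegistry,
              RetryRegistry retryRegistry) {
        // Publishes resilience4j.circuitbreaker.* and resilience4j.retry.* metrics to the meter registry
        TaggedCircuitBreakerMetrics.ofCircuitBreakerRegistry(circuitBreakerRegistry).bindTo(meterRegistry);
        TaggedRetryMetrics.ofRetryRegistry(retryRegistry).bindTo(meterRegistry);
    }
}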

Sample Implementation

https://github.com/scalablepuppets/Resiliency4JDemo


Kuldeep Gulati

Hands-on architect and speaker with expertise in building resilient microservices.