Introducing ASP.NET HealthChecks.Extensions - Conditional Health Checks

Conditional Health Checks


ASP.NET Core offers the Health Checks Middleware to check the health state of your .NET Core API. However, it is not always desirable to run all the registered Health Checks in every context.

For example, in environments like the development one, some dependencies might not be available. Other dependencies might be used later during the application lifetime, maybe when their configuration is finished, or after a feature flag is enabled.

It would be nice to decide when to run a health check. One possible way is to write the condition in the Startup class, as below:
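A minimal sketch of such a startup-time condition, assuming a hypothetical `Redis:Enabled` setting and a hypothetical `Redis:ConnectionString` key (neither name comes from the package; `AddRedis` is the registration method from AspNetCore.Diagnostics.HealthChecks):

```csharp
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public Startup(IConfiguration configuration) => Configuration = configuration;

    public IConfiguration Configuration { get; }

    public void ConfigureServices(IServiceCollection services)
    {
        var healthChecks = services.AddHealthChecks();

        // "Redis:Enabled" and "Redis:ConnectionString" are hypothetical keys.
        // The condition is read exactly once, while the services are configured.
        if (Configuration.GetValue<bool>("Redis:Enabled"))
        {
            healthChecks.AddRedis(Configuration["Redis:ConnectionString"]);
        }
    }
}
```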

The Redis health check is added only when the configuration setting has the expected value. If the configuration changes during the application lifetime, the only way to re-add the Redis health check is to restart the API so that the ConfigureServices method is called again.

Another good reason to switch on or off a health check is when a feature uses one or maybe more dependencies. If that feature is not yet enabled then there is no reason to check the health of those dependencies and to report an overall unhealthy status although all the other features are just fine. Once the feature is enabled, then the health checks should report as expected. This cannot be implemented in the Startup class like above, as runtime changes aren't detected and re-evaluated.

If you roll your own health checks, then you could write an if statement and that's it, but there are many implementations already and their code cannot be changed whenever we want. To reuse existing Health Checks implementations like AspNetCore.Diagnostics.HealthChecks or Microsoft.Extensions.Diagnostics.HealthChecks.EntityFrameworkCore, something else must be done. Luckily there is a way to decorate them with our own code, and for this I created HealthChecks.Extensions.

Conditional Health Checks are part of this package and let you decide when a health check should actually run. They encapsulate a boolean result that determines whether the health check proceeds. If the health check does not proceed, a specific tag is added to its tags list and the reported health result is Healthy.
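To make the decoration idea concrete, here is a rough sketch built only on the Microsoft.Extensions.Diagnostics.HealthChecks abstractions. It is not the package's actual implementation; the class name `ConditionalHealthCheckSketch` and the hard-coded tag are illustrative assumptions.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Illustrative decorator: wraps any existing IHealthCheck with a condition.
public sealed class ConditionalHealthCheckSketch : IHealthCheck
{
    private const string NotCheckedTag = "NotChecked";

    private readonly IHealthCheck _inner;
    private readonly Func<CancellationToken, Task<bool>> _condition;

    public ConditionalHealthCheckSketch(
        IHealthCheck inner,
        Func<CancellationToken, Task<bool>> condition)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _condition = condition ?? throw new ArgumentNullException(nameof(condition));
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        // The condition is evaluated on every health check request.
        if (await _condition(cancellationToken))
        {
            // The check runs for real, so a tag left by an earlier skip is cleared.
            context.Registration.Tags.Remove(NotCheckedTag);
            return await _inner.CheckHealthAsync(context, cancellationToken);
        }

        // Skipped: mark the registration and report Healthy.
        // Note: this sketch mutates the shared tag set without any synchronization.
        context.Registration.Tags.Add(NotCheckedTag);
        return HealthCheckResult.Healthy();
    }
}
```

Because the decorator only depends on the IHealthCheck interface, the wrapped check can come from any library and its code stays untouched.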

With Conditional Health Checks the condition could be expressed in several ways: as a boolean value, as a predicate, or using a strongly typed policy. The checking condition could be associated with one or more Health Checks.

The simplest use case is for a dependency that is available in a certain environment. Environments don't change during the application lifetime, so we can use Conditional Health Checks to enable a Health Check with a boolean value like this:
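A sketch of what this registration might look like, reconstructed from the description that follows. The `HealthChecks.Extensions` namespace, the exact shape of the `CheckOnlyWhen` overload, and the `Redis:ConnectionString` key are assumptions to verify against the package:

```csharp
using HealthChecks.Extensions; // assumed namespace for CheckOnlyWhen and Registrations
using Microsoft.AspNetCore.Hosting;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

public class Startup
{
    public Startup(IConfiguration configuration, IWebHostEnvironment environment)
    {
        Configuration = configuration;
        Environment = environment;
    }

    public IConfiguration Configuration { get; }
    public IWebHostEnvironment Environment { get; }

    public void ConfigureServices(IServiceCollection services)
    {
        services.AddHealthChecks()
            // Registered by AspNetCore.Diagnostics.HealthChecks.
            .AddRedis(Configuration["Redis:ConnectionString"]) // hypothetical key
            // Assumed overload: (registration name, bool condition).
            .CheckOnlyWhen(Registrations.Redis, !Environment.IsDevelopment());
    }
}
```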


The Redis health check is added with AddRedis (from AspNetCore.Diagnostics.HealthChecks). CheckOnlyWhen (from HealthChecks.Extensions) configures the Redis health check to run only in environments other than development. CheckOnlyWhen locates the Redis registration through the Registrations.Redis string, which is the name AspNetCore.Diagnostics.HealthChecks gives its Redis health check. It finds the registration by this name and decorates it with the given condition: if the condition is true the actual Redis health check runs; otherwise it is skipped.

A different use case is switching a health check on or off when the configuration changes.
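The configuration-driven case could be sketched as below. The predicate overload of `CheckOnlyWhen` and the namespace are assumptions based on the prose; `AddCheck` with a delegate is the standard ASP.NET Core API, and the trivial `MyHealthCheck` body is only a placeholder.

```csharp
using HealthChecks.Extensions; // assumed namespace for CheckOnlyWhen
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public class Startup
{
    public Startup(IConfiguration configuration) => Configuration = configuration;

    public IConfiguration Configuration { get; }

    public void ConfigureServices(IServiceCollection services)
    {
        services.AddHealthChecks()
            // Placeholder check registered under the name "MyHealthCheck".
            .AddCheck("MyHealthCheck", () => HealthCheckResult.Healthy())
            // Assumed overload: (registration name, Func<bool> predicate).
            // The lambda reads the configuration each time it is invoked, so a
            // reloaded setting is picked up without restarting the API.
            .CheckOnlyWhen("MyHealthCheck",
                () => !string.IsNullOrEmpty(Configuration["MyHealthCheck:Setting"]));
    }
}
```

This relies on the configuration provider supporting reload, as the JSON file provider does when `reloadOnChange` is enabled.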


The health check on MyHealthCheck happens only when the value of the configuration setting MyHealthCheck:Setting is updated with an actual string. The API doesn't have to be restarted, as the predicate is evaluated every time before the actual execution of the health check code.

Another use case is when the predicate result is provided by a service. A service that resolves feature flags, answering whether a given flag is set or not, can be used by CheckOnlyWhen, as it has an overload to support this scenario:
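A hedged sketch of that overload follows. The `(sp, context, token)` parameter order is inferred from the paragraph below, and `IFeatureFlags.IsSet` and the flag name `AFeatureDependingOnRedis` are hypothetical names used for illustration only.

```csharp
using System.Threading.Tasks;
using HealthChecks.Extensions; // assumed namespace for CheckOnlyWhen and Registrations
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical feature flag service; its implementation must be registered too.
public interface IFeatureFlags
{
    Task<bool> IsSet(string featureFlag);
}

public class Startup
{
    public Startup(IConfiguration configuration) => Configuration = configuration;

    public IConfiguration Configuration { get; }

    public void ConfigureServices(IServiceCollection services)
    {
        services.AddHealthChecks()
            .AddRedis(Configuration["Redis:ConnectionString"]) // hypothetical key
            // Assumed overload: an async predicate receiving the ServiceProvider,
            // the HealthCheckContext and a CancellationToken.
            .CheckOnlyWhen(Registrations.Redis, async (sp, context, token) =>
            {
                var featureFlags = sp.GetRequiredService<IFeatureFlags>();
                // A real service would usually accept the token as well.
                return await featureFlags.IsSet("AFeatureDependingOnRedis");
            });
    }
}
```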

This is possible because the ServiceProvider is accessible via the sp argument. The HealthCheckContext of the current health check can also be used, along with an optional CancellationToken to pass downstream, a practice that I strongly suggest.

The conditions can be complex and often need to be reused across different health check configurations. For this scenario, consider using strongly typed policies. A strongly typed policy is a class that implements the IConditionalHealthCheckPolicy interface.

The policies can be designed as any other class, with no restrictions. They can have collaborators, as all are resolved by the ServiceProvider itself. Even more, the policies can have extra arguments in their constructor.

Reusing the same feature flags scenario, one could implement a solution like the following:
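A minimal feature flags service could be sketched like this. The method name `IsSet` and the class name `FeatureFlags` are assumptions; only the `IFeatureFlags` name appears in the text.

```csharp
using System.Threading.Tasks;

// Hypothetical contract: answers whether a named feature flag is set.
public interface IFeatureFlags
{
    Task<bool> IsSet(string featureFlag);
}

// Deliberately naive implementation: every flag is reported as disabled.
public class FeatureFlags : IFeatureFlags
{
    public Task<bool> IsSet(string featureFlag) => Task.FromResult(false);
}
```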

This feature flags implementation is not very smart, as all flags are disabled for now. But what matters here is that we can use it to create a conditional health check policy:
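Such a policy might look like the sketch below. The `Evaluate` signature is an assumption (the text only names the method), as are the namespace and the `IFeatureFlags.IsSet` method, so the interface shape should be checked against the package before use.

```csharp
using System;
using System.Threading.Tasks;
using HealthChecks.Extensions; // assumed namespace for IConditionalHealthCheckPolicy
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical feature flag service, repeated here so the sketch stands alone.
public interface IFeatureFlags
{
    Task<bool> IsSet(string featureFlag);
}

public class FeatureFlagsPolicy : IConditionalHealthCheckPolicy
{
    private readonly IFeatureFlags _featureFlags;

    // "name" is supplied at registration time; IFeatureFlags comes from the container.
    public FeatureFlagsPolicy(string name, IFeatureFlags featureFlags)
    {
        Name = name ?? throw new ArgumentNullException(nameof(name));
        _featureFlags = featureFlags ?? throw new ArgumentNullException(nameof(featureFlags));
    }

    public string Name { get; }

    // Assumed signature: the health check runs only when this returns true.
    public Task<bool> Evaluate(HealthCheckContext context) => _featureFlags.IsSet(Name);
}
```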

The FeatureFlagsPolicy's constructor receives a flag name and the dependencies required to fulfill its functionality. The name argument is used by the IFeatureFlags collaborator to see whether the corresponding flag is set, and the Evaluate method returns this result.

What's left is to register the FeatureFlagsPolicy to work with the desired health check.
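A sketch of that registration, assuming a generic `CheckOnlyWhen<TPolicy>` overload that takes the registration name plus the `conditionalHealthCheckPolicyCtorArgs` argument mentioned below. `IFeatureFlags`, `FeatureFlags` and `FeatureFlagsPolicy` are the types described in the text, and the flag name is hypothetical.

```csharp
using HealthChecks.Extensions; // assumed namespace for CheckOnlyWhen and Registrations
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public Startup(IConfiguration configuration) => Configuration = configuration;

    public IConfiguration Configuration { get; }

    public void ConfigureServices(IServiceCollection services)
    {
        // The policy's collaborators must be registered in the container.
        services.AddSingleton<IFeatureFlags, FeatureFlags>();

        services.AddHealthChecks()
            .AddRedis(Configuration["Redis:ConnectionString"]) // hypothetical key
            // The flag name cannot be resolved from the container,
            // so it is passed as an extra constructor argument.
            .CheckOnlyWhen<FeatureFlagsPolicy>(
                Registrations.Redis,
                conditionalHealthCheckPolicyCtorArgs: "AFeatureDependingOnRedis");
    }
}
```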

This works because FeatureFlagsPolicy objects are created with the help of the ServiceProvider, so all their dependencies must first be registered in the services collection. What the ServiceProvider cannot resolve is the name of the flag, which is why the developer supplies it through the conditionalHealthCheckPolicyCtorArgs argument. The experience is very similar to registering a health check itself; actually, this feature is shamelessly borrowed from here. There are also overloads that let you create the policy yourself, and they pass you the ServiceProvider and the HealthCheckContext.

There is also the option to associate multiple health checks with one single condition:
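One way this could look, assuming an overload that accepts a collection of registration names and a `Registrations.RabbitMQ` constant. Both are assumptions, as are the connection string keys, and the `AddRabbitMQ(string)` overload exists only in some versions of AspNetCore.Diagnostics.HealthChecks. `FeatureFlagsPolicy` is the policy type described earlier.

```csharp
using HealthChecks.Extensions; // assumed namespace for CheckOnlyWhen and Registrations
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public Startup(IConfiguration configuration) => Configuration = configuration;

    public IConfiguration Configuration { get; }

    public void ConfigureServices(IServiceCollection services)
    {
        services.AddHealthChecks()
            .AddRedis(Configuration["Redis:ConnectionString"])       // hypothetical key
            .AddRabbitMQ(Configuration["RabbitMQ:ConnectionString"]) // hypothetical key
            // One policy instance configuration shared by both registrations.
            .CheckOnlyWhen<FeatureFlagsPolicy>(
                new[] { Registrations.Redis, Registrations.RabbitMQ },
                conditionalHealthCheckPolicyCtorArgs: "BlackFridayPromotion");
    }
}
```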

Both the Redis and RabbitMQ health checks are executed only when the BlackFridayPromotion is active.

If you override your own ResponseWriter, like I did in the samples project supporting this package, in order to write all the necessary details when you access the /health, then a possible result could be the following one:

Redis was not run here, and we can tell because the Redis entry contains the NotChecked tag. This is useful for observability, since other systems can make different decisions based on it. The health check status, as I mentioned before, is Healthy by default, but both the status and the tag can be configured to suit your needs. This is done by providing an options argument with the CheckOnlyWhen call:

When the RabbitMQ health check is disabled, the tags list contains the NotActive tag and the status is HealthStatus.Degraded.

Health checks are an essential feature for observability, but I feel they should also be non-intrusive, non-invasive, and flexible. Currently, there is no built-in support for switching ASP.NET Core health checks on and off, but Conditional Health Checks from HealthChecks.Extensions now offer a flexible way to add more value in this direction.
