Learning by doing - Implementing Redis distributed cache health check for ASP.NET Core

The final release of ASP.NET Core 2.2 is getting closer, and I've started devoting some time to getting familiar with the new features and changes. One of the new features, which extends ASP.NET Core diagnostics capabilities, is health checks. Health checks aim to give an external monitor (for example a container orchestrator) a way to quickly determine the condition of an application. This is a valuable and long-awaited feature. It has also been changing significantly from preview to preview. I wanted to grasp the current state of this feature, and I couldn't think of a better way to do it than in the context of a concrete scenario.

Setting the stage

One of the external resources my applications often depend on is Redis. Most of the time this is a result of using a distributed cache.

public class Startup
{
    public IConfiguration Configuration { get; }

    public Startup(IConfiguration configuration)
    {
        Configuration = configuration;
    }

    public void ConfigureServices(IServiceCollection services)
    {
        services.AddDistributedRedisCache(options =>
        {
            options.Configuration = Configuration["DistributedRedisCache:Configuration"];
            options.InstanceName = Configuration["DistributedRedisCache:InstanceName"];
        });

        ...
    }

    ...
}

To make sure that an application using a Redis-backed distributed cache works as it's supposed to, two things should be checked. One is the presence of configuration, and the second is the availability of the Redis instance. Those are the two health checks I want to create.

In general, health checks are represented by HealthCheckRegistration, which requires either an instance of a health check implementation or a factory of such instances. Working with HealthCheckRegistration all the time would result in a lot of repetitive boilerplate code. This is probably why there are two sets of extensions for registering health checks: HealthChecksBuilderDelegateExtensions and HealthChecksBuilderAddCheckExtensions. Those extensions divide health checks into two types.
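To make the boilerplate concrete, here is a minimal sketch of what registering checks directly through HealthCheckRegistration might look like, once with an instance and once with a factory. The AlwaysPassingHealthCheck class, the AddSampleHealthChecks method and the check names are hypothetical. The sketch also assumes constructor overloads that take a name, an instance or factory, a failure status and tags, plus an Add method on the builder; since the API is still changing between previews, the exact shape may differ.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check used only for illustration.
public class AlwaysPassingHealthCheck : IHealthCheck
{
    public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context,
        CancellationToken cancellationToken = default(CancellationToken))
    {
        // Passed() is the factory method name used throughout this post (preview API).
        return Task.FromResult(HealthCheckResult.Passed());
    }
}

public static class ManualRegistrationSketch
{
    // Hypothetical extension method; the names "sample-instance" and
    // "sample-factory" are placeholders.
    public static IServiceCollection AddSampleHealthChecks(this IServiceCollection services)
    {
        services.AddHealthChecks()
            // Assumed overload: a single instance is reused for every check run.
            .Add(new HealthCheckRegistration(
                "sample-instance",
                new AlwaysPassingHealthCheck(),
                failureStatus: HealthStatus.Unhealthy,
                tags: new string[0]))
            // Assumed overload: the factory is asked for an instance instead.
            .Add(new HealthCheckRegistration(
                "sample-factory",
                serviceProvider => new AlwaysPassingHealthCheck(),
                failureStatus: HealthStatus.Unhealthy,
                tags: new string[0]));

        return services;
    }
}
```

Every registration has to repeat the name, failure status and tags plumbing, which is the repetition the extension methods hide.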

"Lambda" health checks

The simpler type is the "lambda" check. To register such a check, one needs to call AddCheck (or AddAsyncCheck for the asynchronous version), which takes a function as a parameter. This function needs to return a HealthCheckResult. An instance of HealthCheckResult carries a true or false value (which maps to passed or failed) and optionally a description, an exception and additional data. The value is used to determine the status of the health check, while the optional properties end up in logs.

This type of health check is perfect for small pieces of logic; in my case, for checking whether the configuration is present.

public class Startup
{
    ...

    public void ConfigureServices(IServiceCollection services)
    {
        ...

        services.AddHealthChecks()
            .AddCheck("redis-configuration", () => 
                (String.IsNullOrWhiteSpace(Configuration["DistributedRedisCache:Configuration"])
                 || String.IsNullOrWhiteSpace(Configuration["DistributedRedisCache:InstanceName"]))
                ? HealthCheckResult.Failed("Missing Redis distributed cache configuration!")
                : HealthCheckResult.Passed()
            );

        ...
    }

    ...
}
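The asynchronous variant can be sketched in a similar way. The example below is only an illustration and not part of the scenario in this post: the AddRedisDnsCheck method, the "redis-dns" name and the "redis.example.local" host are hypothetical, and it assumes an AddAsyncCheck overload that accepts a parameterless function returning Task<HealthCheckResult>.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class AsyncLambdaCheckSketch
{
    // "redis.example.local" is a placeholder host, not a value from this post.
    public static IServiceCollection AddRedisDnsCheck(this IServiceCollection services,
        string redisHost = "redis.example.local")
    {
        services.AddHealthChecks()
            .AddAsyncCheck("redis-dns", async () =>
            {
                try
                {
                    // Asynchronous work is awaited directly inside the lambda.
                    IPAddress[] addresses = await Dns.GetHostAddressesAsync(redisHost);

                    return (addresses.Length > 0)
                        ? HealthCheckResult.Passed()
                        : HealthCheckResult.Failed("Redis host did not resolve to any address!");
                }
                catch (SocketException)
                {
                    // Name resolution errors are reported as a failed check.
                    return HealthCheckResult.Failed("Redis host name resolution failed!");
                }
            });

        return services;
    }
}
```

Even in this form the logic stays small and self-contained, which is what makes a lambda a reasonable fit.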

But often health checks can't be that simple. They may have complex logic, they may require services available from DI, or (worst-case scenario) they may need to manage state. In all those cases it's better to use the second type of check.

"Concrete Class" health checks

An alternative way to represent a health check is through a class implementing the IHealthCheck interface. This gives a lot more flexibility, starting with access to DI. This ability is crucial for the second health check I wanted to implement: the Redis instance connection check. Checking the connection requires acquiring RedisCacheOptions and initializing a ConnectionMultiplexer based on those options. If the connection is created with the AbortOnConnectFail property set to false, ConnectionMultiplexer.IsConnected can be used at any time to get the current status. The code below shows exactly that.

public class RedisCacheHealthCheck : IHealthCheck
{
    private readonly RedisCacheOptions _options;
    private readonly ConnectionMultiplexer _redis;

    public RedisCacheHealthCheck(IOptions<RedisCacheOptions> optionsAccessor)
    {
        ...

        _options = optionsAccessor.Value;
        _redis = ConnectionMultiplexer.Connect(GetConnectionOptions());
    }

    public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context,
        CancellationToken cancellationToken = default(CancellationToken))
    {
        return Task.FromResult(_redis.IsConnected
            ? HealthCheckResult.Passed()
            : HealthCheckResult.Failed("Redis connection not working!")
        );
    }

    private ConfigurationOptions GetConnectionOptions()
    {
        ConfigurationOptions redisConnectionOptions = (_options.ConfigurationOptions != null)
            ? ConfigurationOptions.Parse(_options.ConfigurationOptions.ToString())
            : ConfigurationOptions.Parse(_options.Configuration);

        redisConnectionOptions.AbortOnConnectFail = false;

        return redisConnectionOptions;
    }
}

A health check in the form of a class can be registered with a call to AddCheck<T>.

public class Startup
{
    ...

    public void ConfigureServices(IServiceCollection services)
    {
        ...

        services.AddHealthChecks()
            .AddCheck("redis-configuration", ...)
            .AddCheck<RedisCacheHealthCheck>("redis-connection");

        ...
    }

    ...
}

What about the state? Attempting to monitor a Redis connection calls for some state management. ConnectionMultiplexer is designed to be shared and reused. It will react properly to a failing connection and will restore it when possible. It would be good not to create a new one every time. Unfortunately, the current implementation does exactly that. Registering a health check with AddCheck<T> makes it transient, so a new ConnectionMultiplexer is created with every instance of the check. There is no obvious way to make a health check a singleton (which is probably a good thing, as it makes it harder to commit typical singleton mistakes). It can be done by manually creating a HealthCheckRegistration. One can also play with registering the IHealthCheck implementation directly with DI. But I would suggest a different approach: moving state management to a dedicated service and keeping the health check transient. This keeps the implementation clean and the responsibilities separated. In the case of the Redis connection check, this means extracting the ConnectionMultiplexer creation.

public interface IRedisCacheHealthCheckConnection
{
    bool IsConnected { get; }
}

public class RedisCacheHealthCheckConnection : IRedisCacheHealthCheckConnection
{
    private readonly RedisCacheOptions _options;
    private readonly ConnectionMultiplexer _redis;

    public bool IsConnected => _redis.IsConnected;

    public RedisCacheHealthCheckConnection(IOptions<RedisCacheOptions> optionsAccessor)
    {
        ...

        _options = optionsAccessor.Value;
        _redis = ConnectionMultiplexer.Connect(GetConnectionOptions());
    }

    private ConfigurationOptions GetConnectionOptions()
    {
        ...
    }
}

Now the health check can be refactored to use the new service.

public class RedisCacheHealthCheck : IHealthCheck
{
    private readonly IRedisCacheHealthCheckConnection _redisConnection;

    public RedisCacheHealthCheck(IRedisCacheHealthCheckConnection redisConnection)
    {
        _redisConnection = redisConnection ?? throw new ArgumentNullException(nameof(redisConnection));
    }

    public Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context,
        CancellationToken cancellationToken = default(CancellationToken))
    {
        return Task.FromResult(_redisConnection.IsConnected
            ? HealthCheckResult.Passed()
            : HealthCheckResult.Failed("Redis connection not working!")
        );
    }
}

The last remaining thing is adding the new service to DI as a singleton.

public class Startup
{
    ...

    public void ConfigureServices(IServiceCollection services)
    {
        ...

        services.AddSingleton<IRedisCacheHealthCheckConnection, RedisCacheHealthCheckConnection>();

        services.AddHealthChecks()
            .AddCheck("redis-configuration", ...)
            .AddCheck<RedisCacheHealthCheck>("redis-connection");

        ...
    }

    ...
}

The health checks are now in place. What remains is an endpoint that an external monitor can use to determine the application's state.

Exposing health checks

Setting up an endpoint is as simple as calling UseHealthChecks and providing a path.

public class Startup
{
    ...

    public void Configure(IApplicationBuilder app, IHostingEnvironment env)
    {
        ...

        app.UseHealthChecks("/healthcheck");

        ...
    }
}

But on many occasions we may want health checks to be divided into groups. The most typical example would be separating them into liveness and readiness checks. ASP.NET Core makes this easy as well. One of the optional parameters for health check registration is an enumeration of tags. Those tags can later be used to filter health checks when configuring an endpoint.

public class Startup
{
    ...

    public void ConfigureServices(IServiceCollection services)
    {
        ...

        services.AddHealthChecks()
            .AddCheck("redis-configuration", ..., tags: new[] { "liveness" })
            .AddCheck<RedisCacheHealthCheck>("redis-connection", tags: new[] { "readiness" });
    }

    public void Configure(IApplicationBuilder app, IHostingEnvironment env)
    {
        ...

        app.UseHealthChecks("/healthcheck-liveness", new HealthCheckOptions
        {
            Predicate = (check) => check.Tags.Contains("liveness")
        });

        app.UseHealthChecks("/healthcheck-readiness", new HealthCheckOptions
        {
            Predicate = (check) => check.Tags.Contains("readiness")
        });

        ...
    }
}

Anything else?

Not all health checks have the same meaning. By default, returning a false value results in an unhealthy status, but that may not be desired. When registering a health check, the meaning of failure can be redefined. Considering the scenario in this post, it's possible for an application that uses a Redis-backed distributed cache to keep working when the Redis instance is unavailable. If that's the case, a failure of the connection check shouldn't mean the status is unhealthy; more likely the status should be considered degraded.

public class Startup
{
    ...

    public void ConfigureServices(IServiceCollection services)
    {
        ...

        services.AddHealthChecks()
            .AddCheck("redis-configuration", ..., tags: new[] { "liveness" })
            .AddCheck<RedisCacheHealthCheck>("redis-connection",
                failureStatus: HealthStatus.Degraded, tags: new[] { "readiness" });
    }

    ...
}

This change will be reflected in the value returned by the endpoint.

So, for a theoretically simple mechanism, health checks provide a lot of flexibility. I can see them being very useful in a number of scenarios.