feat: add command and pool stat Prometheus metrics for Redis cache

João Pereira requested to merge redis-cache-metrics into redis-cache-api

This is the last MR for #461 (closed). It adds Prometheus metrics for Redis commands and (application-side) connection pool stats.

The command metrics are collected by https://github.com/globocom/go-redis-prometheus. It does all we need and is well tested, so there is no need to reinvent the wheel.

The connection pool stats use a custom metrics collector. It will live in the registry codebase for now, but is likely to be extracted to LabKit once stable, as happened with the SQL stats collector (labkit!90 (merged)).
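For reviewers who want a feel for what the pool stats collector produces, here is a minimal, dependency-free sketch of how a pool stats snapshot maps onto the gauge series sampled below. It is not the collector in this MR: `PoolStats`, `poolGauges`, and `RenderPoolStats` are hypothetical names, the struct only mirrors the six values shown in the sample, and a real collector would presumably implement the Prometheus client's collector interface rather than format text by hand.

```go
package main

import (
	"fmt"
	"strings"
)

// PoolStats is a hypothetical stand-in for a snapshot of the Redis
// client's connection pool counters (an assumption for illustration).
type PoolStats struct {
	Hits, Misses, Timeouts            uint32
	TotalConns, IdleConns, StaleConns uint32
}

// gauge describes one exported series and how to read it from a snapshot.
type gauge struct {
	name  string
	help  string
	value func(PoolStats) uint32
}

// poolGauges lists the series in the same order as the sample output.
var poolGauges = []gauge{
	{"hits", "The number of times a free connection was found in the pool.",
		func(s PoolStats) uint32 { return s.Hits }},
	{"idle_conns", "The number of idle connections in the pool.",
		func(s PoolStats) uint32 { return s.IdleConns }},
	{"misses", "The number of times a free connection was not found in the pool.",
		func(s PoolStats) uint32 { return s.Misses }},
	{"stale_conns", "The number of stale connections removed from the pool.",
		func(s PoolStats) uint32 { return s.StaleConns }},
	{"timeouts", "The number of times a wait timeout occurred.",
		func(s PoolStats) uint32 { return s.Timeouts }},
	{"total_conns", "The total number of connections in the pool.",
		func(s PoolStats) uint32 { return s.TotalConns }},
}

// RenderPoolStats formats one snapshot as gauge lines in the Prometheus
// text exposition format. Label escaping is simplified: %q is only
// adequate for plain instance names such as "cache".
func RenderPoolStats(instance string, s PoolStats) string {
	var b strings.Builder
	for _, g := range poolGauges {
		name := "registry_redis_pool_stats_" + g.name
		fmt.Fprintf(&b, "# HELP %s %s\n", name, g.help)
		fmt.Fprintf(&b, "# TYPE %s gauge\n", name)
		fmt.Fprintf(&b, "%s{instance=%q} %d\n", name, instance, g.value(s))
	}
	return b.String()
}
```

Calling `RenderPoolStats("cache", PoolStats{Hits: 11, Misses: 1, TotalConns: 1, IdleConns: 1})` yields the six gauge series with the same names, help strings, and label as in the sample.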

Here is a sample of all collected metrics:

# HELP registry_redis_pool_stats_hits The number of times a free connection was found in the pool.
# TYPE registry_redis_pool_stats_hits gauge
registry_redis_pool_stats_hits{instance="cache"} 11
# HELP registry_redis_pool_stats_idle_conns The number of idle connections in the pool.
# TYPE registry_redis_pool_stats_idle_conns gauge
registry_redis_pool_stats_idle_conns{instance="cache"} 1
# HELP registry_redis_pool_stats_misses The number of times a free connection was not found in the pool.
# TYPE registry_redis_pool_stats_misses gauge
registry_redis_pool_stats_misses{instance="cache"} 1
# HELP registry_redis_pool_stats_stale_conns The number of stale connections removed from the pool.
# TYPE registry_redis_pool_stats_stale_conns gauge
registry_redis_pool_stats_stale_conns{instance="cache"} 0
# HELP registry_redis_pool_stats_timeouts The number of times a wait timeout occurred.
# TYPE registry_redis_pool_stats_timeouts gauge
registry_redis_pool_stats_timeouts{instance="cache"} 0
# HELP registry_redis_pool_stats_total_conns The total number of connections in the pool.
# TYPE registry_redis_pool_stats_total_conns gauge
registry_redis_pool_stats_total_conns{instance="cache"} 1
# HELP registry_redis_single_commands Histogram of single Redis commands
# TYPE registry_redis_single_commands histogram
registry_redis_single_commands_bucket{command="get",instance="cache",le="0.001"} 5
registry_redis_single_commands_bucket{command="get",instance="cache",le="0.005"} 6
registry_redis_single_commands_bucket{command="get",instance="cache",le="0.01"} 8
registry_redis_single_commands_bucket{command="get",instance="cache",le="0.025"} 9
registry_redis_single_commands_bucket{command="get",instance="cache",le="0.05"} 9
registry_redis_single_commands_bucket{command="get",instance="cache",le="0.1"} 9
registry_redis_single_commands_bucket{command="get",instance="cache",le="0.25"} 9
registry_redis_single_commands_bucket{command="get",instance="cache",le="0.5"} 9
registry_redis_single_commands_bucket{command="get",instance="cache",le="1"} 9
registry_redis_single_commands_bucket{command="get",instance="cache",le="+Inf"} 9
registry_redis_single_commands_sum{command="get",instance="cache"} 0.039420858
registry_redis_single_commands_count{command="get",instance="cache"} 9
registry_redis_single_commands_bucket{command="ping",instance="cache",le="0.001"} 0
registry_redis_single_commands_bucket{command="ping",instance="cache",le="0.005"} 0
registry_redis_single_commands_bucket{command="ping",instance="cache",le="0.01"} 1
registry_redis_single_commands_bucket{command="ping",instance="cache",le="0.025"} 1
registry_redis_single_commands_bucket{command="ping",instance="cache",le="0.05"} 1
registry_redis_single_commands_bucket{command="ping",instance="cache",le="0.1"} 1
registry_redis_single_commands_bucket{command="ping",instance="cache",le="0.25"} 1
registry_redis_single_commands_bucket{command="ping",instance="cache",le="0.5"} 1
registry_redis_single_commands_bucket{command="ping",instance="cache",le="1"} 1
registry_redis_single_commands_bucket{command="ping",instance="cache",le="+Inf"} 1
registry_redis_single_commands_sum{command="ping",instance="cache"} 0.007097752
registry_redis_single_commands_count{command="ping",instance="cache"} 1
registry_redis_single_commands_bucket{command="set",instance="cache",le="0.001"} 1
registry_redis_single_commands_bucket{command="set",instance="cache",le="0.005"} 2
registry_redis_single_commands_bucket{command="set",instance="cache",le="0.01"} 2
registry_redis_single_commands_bucket{command="set",instance="cache",le="0.025"} 2
registry_redis_single_commands_bucket{command="set",instance="cache",le="0.05"} 2
registry_redis_single_commands_bucket{command="set",instance="cache",le="0.1"} 2
registry_redis_single_commands_bucket{command="set",instance="cache",le="0.25"} 2
registry_redis_single_commands_bucket{command="set",instance="cache",le="0.5"} 2
registry_redis_single_commands_bucket{command="set",instance="cache",le="1"} 2
registry_redis_single_commands_bucket{command="set",instance="cache",le="+Inf"} 2
registry_redis_single_commands_sum{command="set",instance="cache"} 0.00176013
registry_redis_single_commands_count{command="set",instance="cache"} 2
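A note on reading the histogram series above: the `le` buckets are cumulative, so each bucket counts every observation at or below its upper bound (in seconds), and the `+Inf` bucket always equals `_count`. The sketch below illustrates that counting rule with the standard library only. `SampleBounds` copies the `le` values from the sample; `CumulativeCounts` is a hypothetical helper, not code from this MR or from go-redis-prometheus.

```go
package main

import "math"

// SampleBounds holds the bucket upper bounds (in seconds) that appear
// as `le` label values in the sample output.
var SampleBounds = []float64{0.001, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, math.Inf(1)}

// CumulativeCounts returns, for each upper bound, the number of
// observations less than or equal to that bound. This is the value a
// Prometheus histogram reports in the corresponding `_bucket` series.
func CumulativeCounts(bounds, observations []float64) []uint64 {
	counts := make([]uint64, len(bounds))
	for _, o := range observations {
		for i, le := range bounds {
			if o <= le {
				counts[i]++
			}
		}
	}
	return counts
}
```

For example, two hypothetical `set` durations of 0.0007 s and 0.00106013 s (illustrative values, the individual durations are not part of the sample) give a count of 1 for `le="0.001"` and 2 for every larger bucket, which matches the shape of the `set` series. Dividing `_sum` by `_count` gives the mean latency: for `get`, 0.039420858 / 9 is roughly 4.4 ms.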

Related to #461 (closed).

Marked as a draft as this is still missing unit tests.

Edited by João Pereira