Using Consul for global state, deployment events, etc.
@nolith raised this. GitLab.com already runs Consul across the cluster, but we only use it for limited purposes.

The proposal is to use Consul to store key-value pairs describing infrastructure events, such as canary drains and rolling deployments. Using `consul_exporter`, which is already deployed across the fleet, we could expose selected values to Prometheus through the `kv.filter` and `kv.prefix` configuration options: https://github.com/prometheus/consul_exporter#keyvalue-checks

With this state available, we could improve the accuracy of our alerts, for example by disabling alerts on canary nodes while the canary is drained, or by allowing a slightly elevated error rate during deployments.
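As a rough illustration of the write side, here is a minimal Python sketch of how a deployment tool might record an event flag through Consul's KV HTTP API (`PUT /v1/kv/<key>`). The `infra-events` prefix, the `canary-drained` key name, and the helper functions are hypothetical and not part of any existing tooling. The agent address assumes a local Consul agent on the default port. The value is stored as `1` or `0` on the assumption that the exporter only exposes numeric values.

```python
import urllib.parse
import urllib.request

# Assumption: a local Consul agent listening on the default HTTP port.
CONSUL_ADDR = "http://127.0.0.1:8500"
# Hypothetical prefix; it would need to match the exporter's kv.prefix setting.
KEY_PREFIX = "infra-events"


def build_event_request(event, active, consul_addr=CONSUL_ADDR):
    """Build (but do not send) a PUT request that sets an event flag in Consul KV."""
    # Store 1/0 rather than free text so the value can be read as a number.
    value = b"1" if active else b"0"
    key = f"{KEY_PREFIX}/{urllib.parse.quote(event)}"
    return urllib.request.Request(f"{consul_addr}/v1/kv/{key}", data=value, method="PUT")


def set_event(event, active, consul_addr=CONSUL_ADDR):
    """Send the flag to Consul; the KV API answers with the body `true` on success."""
    request = build_event_request(event, active, consul_addr)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read().strip() == b"true"
```

A drain script could then call `set_event("canary-drained", True)` before draining and `set_event("canary-drained", False)` afterwards. If the exporter publishes the key as a gauge (we believe the metric is `consul_catalog_kv` with a `key` label, but this should be checked against the exporter README), alert rules could reference that series to suppress or relax canary alerts while the flag is set.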
issue