Geo: Unhealthy shards should be flagged as a Prometheus metric
I was scratching my head over why this was happening:
It turns out that nfs-09 and nfs-12 on GPRD were down, and our scheduler was doing the right thing and not trying to sync repositories on those shards:
irb(main):038:0> p = Project.find(4662084)
irb(main):039:0> Gitlab::Geo::ShardHealthCache.healthy_shard?(p.repository_storage)
=> false
We should export a Prometheus metric reporting the health of each Gitaly shard.
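As a rough sketch of what such a metric could look like, the snippet below renders one gauge sample per shard in the Prometheus text exposition format, with 1 for healthy and 0 for unhealthy. The metric name `geo_shard_healthy`, the `shard` label, and the `shard_health_metrics` helper are all hypothetical and not existing GitLab code. How the shard-to-health hash gets populated is left open here, and shard names are assumed not to need label escaping.

```ruby
# Hypothetical metric name; not an existing GitLab metric.
METRIC_NAME = 'geo_shard_healthy'

# Takes a hash of shard name => boolean health status and returns
# the samples in the Prometheus text exposition format.
def shard_health_metrics(shard_health)
  lines = [
    "# HELP #{METRIC_NAME} Whether the shard is healthy (1) or not (0)",
    "# TYPE #{METRIC_NAME} gauge"
  ]

  # Sort by shard name so the output is stable between scrapes.
  shard_health.sort_by { |shard, _healthy| shard }.each do |shard, healthy|
    lines << %(#{METRIC_NAME}{shard="#{shard}"} #{healthy ? 1 : 0})
  end

  lines.join("\n") + "\n"
end

# Example input mirroring the outage above; nfs-01 is a made-up healthy shard.
puts shard_health_metrics('nfs-09' => false, 'nfs-12' => false, 'nfs-01' => true)
```

With a gauge like this, an alert on a value of 0 for any `shard` label would have surfaced the nfs-09 and nfs-12 outage directly, without needing a Rails console.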