Discussion: how to determine if a gitaly node is down

Based on a discussion with professional services, the MVP for praefect failover cannot be a manual process in which an SRE has to change a config file and restart praefect. The first decision we need to make, then, is: what does it mean for a gitaly node to be down?

Proposal:

A gitaly node is down if it fails to respond to a healthcheck a configurable number of times.

The reason for this approach is that it matches what some customers are already doing with a DNS load balancer in front of a master and a slave gitaly node. When the master node doesn't respond to a healthcheck a configured number of times, the DNS load balancer automatically switches over to the slave.
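
To make the proposal concrete, here is a minimal sketch of the counting rule in Go. Everything in it is hypothetical and not taken from praefect: the `nodeHealth` type, the `newNodeHealth` and `record` names, and in particular the assumption that the failures must be consecutive, so that a single successful healthcheck resets the count. The proposal itself does not say whether failures are counted consecutively or within a time window.

```go
package main

import "sync"

// nodeHealth tracks healthcheck results for a single gitaly node.
// Hypothetical type for illustration; not part of praefect.
type nodeHealth struct {
	mu        sync.Mutex
	threshold int // configurable number of failed healthchecks before the node counts as down
	failures  int // failed healthchecks since the last success
}

// newNodeHealth returns a tracker that reports the node as down after
// threshold failed healthchecks. Values below 1 are treated as 1.
func newNodeHealth(threshold int) *nodeHealth {
	if threshold < 1 {
		threshold = 1
	}
	return &nodeHealth{threshold: threshold}
}

// record stores the outcome of one healthcheck and reports whether the
// node should now be considered down.
func (n *nodeHealth) record(ok bool) bool {
	n.mu.Lock()
	defer n.mu.Unlock()

	if ok {
		// Assumption: one successful healthcheck resets the failure count.
		n.failures = 0
	} else {
		n.failures++
	}
	return n.failures >= n.threshold
}
```

With `newNodeHealth(3)`, two failures followed by a success leave the node up, and it is only reported down after three failures in a row. How the healthcheck itself is performed, and what praefect does once a node is reported down, are deliberately left out of this sketch.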

Edited by John Cai