Geo: Redirect/proxy to primary if repo is out-of-date
Problem
Imagine you work on an active repo on a secondary, and due to a Sidekiq restart, the repo sync lease is orphaned, like #5267. After a couple of hours, you pull master, locally merge a branch into it, and push. But the push is rejected because your master is not up-to-date. You are confused for a bit, ask around, and discover that the repo is no longer syncing on the secondary. This happened once before.
So you set origin to the primary, fetch, hard reset master, merge the branch, push, and never think about it again, except to make sure that you always clone directly from the primary.
This is "workaround acceptable", but it undermines Category:Geo Replication.
Proposal 1: Acceptable latency
Automatically redirect/proxy to the primary if the repo is out-of-date beyond a threshold. This extends the idea behind !27072 (merged): things keep working even if something is not quite ideal.
- We need to know how out-of-date any particular repo is, which may be addressed in #197147
- Redirect/proxy to primary if repo_lag > acceptable_repo_lag
- Later: Make acceptable_repo_lag configurable
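The decision in Proposal 1 could be sketched roughly as follows. This is only an illustration, not Geo's actual code: the function name, the 10-minute default, and the choice to proxy when the lag is unknown are all assumptions made for the example. Only `repo_lag` and `acceptable_repo_lag` come from the proposal itself.

```python
from datetime import timedelta
from typing import Optional

# Hypothetical default; the proposal does not specify a value.
ACCEPTABLE_REPO_LAG = timedelta(minutes=10)


def should_proxy_to_primary(
    repo_lag: Optional[timedelta],
    acceptable_repo_lag: timedelta = ACCEPTABLE_REPO_LAG,
) -> bool:
    """Return True if the request should be redirected/proxied to the primary.

    repo_lag is how far the secondary's copy is behind the primary,
    or None if the lag is not known (assumption: treat unknown as stale).
    """
    if repo_lag is None:
        return True
    # Matches the rule from the proposal: repo_lag > acceptable_repo_lag
    return repo_lag > acceptable_repo_lag


# A repo that stopped syncing a couple of hours ago is proxied;
# one that is 30 seconds behind is served by the secondary.
print(should_proxy_to_primary(timedelta(hours=2)))
print(should_proxy_to_primary(timedelta(seconds=30)))
```

Passing `acceptable_repo_lag` as a parameter leaves room for the later step of making it configurable.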
Proposal 2: Strict
Automatically redirect/proxy to the primary if the repo is out-of-date at all.
- On every Git pull, ask the primary for the latest SHA. If it does not match the secondary's SHA, redirect/proxy.
Downside: For busy repos and/or high-latency connections, the secondary may rarely be fully caught up, which could make secondaries useless.
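The strict check could look something like the sketch below. It is a simplified model under stated assumptions: `route_git_pull`, `secondary_refs`, and `fetch_primary_sha` are hypothetical names, and the call to the primary is represented by a plain callable rather than any real Geo or Gitaly API.

```python
from typing import Callable, Mapping, Optional


def route_git_pull(
    ref: str,
    secondary_refs: Mapping[str, str],
    fetch_primary_sha: Callable[[str], Optional[str]],
) -> str:
    """Decide which node should serve a pull of `ref`.

    secondary_refs maps ref names to the SHAs the secondary currently has.
    fetch_primary_sha is a hypothetical callable that asks the primary for
    the latest SHA of a ref (one network round trip per pull).
    """
    primary_sha = fetch_primary_sha(ref)
    secondary_sha = secondary_refs.get(ref)
    # Any mismatch means the secondary is out-of-date for this ref.
    return "secondary" if secondary_sha == primary_sha else "primary"


# Example with made-up SHAs: the secondary is one commit behind on master.
primary = {"refs/heads/master": "bbb222"}
secondary = {"refs/heads/master": "aaa111"}
print(route_git_pull("refs/heads/master", secondary, primary.get))
```

The sketch also makes the cost visible: every pull waits on a request to the primary before the secondary can answer.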
Question: Is it possible for a secondary to fetch from the primary while serving a Git request?