Export a Prometheus metric for replication queue depth
We currently measure how long it takes to execute a replication job and how many jobs are in flight at the time of the scrape. We also measure the queueing delay, that is, how long a job waits before it starts executing. From these metrics alone, it is difficult to see how well we are keeping up with the queue and how many jobs are still waiting.
To get a better idea, we should expose a Prometheus metric for the replication queue depth.
cc @pks-t