Make mirroring more stable
While discussing the mirroring feature on Slack, @pcarranza and I arrived at a few hypotheses that might explain why mirrors are not working reliably right now.
- Available capacity was dropping below 0, which should never happen, and this causes the query's `LIMIT` to throw an error.
- There seems to be a race condition that makes us schedule more mirrors than the max capacity. Both problems could be addressed by using a lease/semaphore to guard access to the resource.
- The stuck-mirror check currently looks for mirrors that are 20 minutes old or more. We should raise that threshold a bit.
- The duration in https://gitlab.com/gitlab-org/gitlab-ee/blob/master/app/models/project_mirror_data.rb#L43 relies on `last_update_started_at` being set, which might not be the case if the mirror never got to run in the first place (it was only scheduled). We should add a lower bound to prevent that line from returning 0.
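
To make the capacity and lower-bound ideas above a bit more concrete, here is a rough sketch in plain Ruby. Everything in it is hypothetical (`MirrorCapacity`, `reserve`, `release`, `elapsed_minutes`, the 1-minute floor) and it is not the current code. In particular, the in-process `Mutex` is only a stand-in: a real fix would need a lease shared across processes.

```ruby
# Hypothetical sketch, not the actual gitlab-ee implementation.
class MirrorCapacity
  def initialize(max_capacity)
    @max_capacity = max_capacity
    @scheduled = 0
    @lock = Mutex.new # stand-in for a cross-process lease/semaphore
  end

  # Never negative, so the value is always safe to pass to a query LIMIT.
  def available
    @lock.synchronize { available_unlocked }
  end

  # Atomically reserve up to `wanted` slots and return how many were granted.
  # Checking and updating under the same lock is what closes the race.
  def reserve(wanted)
    @lock.synchronize do
      granted = [[wanted, 0].max, available_unlocked].min
      @scheduled += granted
      granted
    end
  end

  # Give slots back once mirrors finish (or are found stuck).
  def release(count)
    @lock.synchronize { @scheduled = [@scheduled - count, 0].max }
  end

  private

  # Caller must already hold @lock (Ruby's Mutex is not reentrant).
  def available_unlocked
    [@max_capacity - @scheduled, 0].max
  end
end

# Duration in minutes with a lower bound: falls back to `floor` when the
# mirror was only scheduled and never started, and never returns 0.
def elapsed_minutes(last_update_started_at, now: Time.now, floor: 1)
  return floor if last_update_started_at.nil?

  [((now - last_update_started_at) / 60.0).ceil, floor].max
end
```

With something like this, the scheduler would call `reserve(batch_size)` and only enqueue the number of mirrors it was granted, instead of reading the capacity and scheduling in two separate steps.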
/cc @pcarranza @DouweM Any thoughts?