Make mirroring more stable

While discussing the mirroring feature on Slack, @pcarranza and I arrived at a few hypotheses that might explain why mirrors are not working reliably at the moment.

  • Available capacity was dropping below 0, which should never happen, and the negative value causes the query LIMIT to throw an error.
  • There seems to be a race condition that makes us schedule more mirrors than the max capacity. This could be prevented by using a lease/semaphore to guard access to the resource.
  • The stuck-mirror check currently looks for mirrors that are 20 minutes old or more. We should raise that threshold somewhat.
  • The duration in https://gitlab.com/gitlab-org/gitlab-ee/blob/master/app/models/project_mirror_data.rb#L43 relies on last_update_started_at being set, which may not be the case if the mirror never got to run in the first place (it was only scheduled). We should add a lower bound to prevent that line from returning 0.
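
To make the capacity, race-condition and lower-bound points concrete, here is a rough Ruby sketch. Every name in it (`MirrorScheduler`, `schedule`, `update_duration`, the `minimum` default of 1) is hypothetical and is not the actual gitlab-ee code, and the in-process `Mutex` is only a stand-in for whatever cross-process lease we would really use.

```ruby
# Hypothetical sketch: class and method names are illustrative only,
# not the real gitlab-ee implementation.
class MirrorScheduler
  attr_reader :scheduled

  def initialize(max_capacity)
    @max_capacity = max_capacity
    @scheduled = []
    @lease = Mutex.new # stand-in for a real cross-process lease
  end

  # Clamp at 0 so the value is always safe to pass to a query LIMIT.
  def available_capacity
    [@max_capacity - @scheduled.size, 0].max
  end

  # Read the capacity and schedule under the same lock, so concurrent
  # callers cannot both see free slots and overshoot the maximum.
  def schedule(mirror_ids)
    @lease.synchronize do
      batch = mirror_ids.first(available_capacity)
      @scheduled.concat(batch)
      batch
    end
  end
end

# Fall back to a lower bound when the mirror was scheduled but never
# started (started_at is nil) or when the computed duration is too small.
def update_duration(started_at, now, minimum: 1)
  return minimum if started_at.nil?

  [now - started_at, minimum].max
end
```

With a capacity of 2, `MirrorScheduler.new(2).schedule([1, 2, 3])` would only take the first two ids, and any later call gets an empty batch until capacity frees up.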

/cc @pcarranza @DouweM Any thoughts?
