Geo: Re-introduce the `Redownload` behaviour as a fallback to recover from repeated failed clone/fetch sync requests

Problem to solve

Geo used to be able to automatically Redownload a repository if the clone/fetch request failed multiple times. The Redownload functionality used tar to archive the repository and transferred the archive to the secondary Geo site. This was removed in 16.0 via an MR.

Recently, we've seen some clone/fetch requests for repositories on the primary site take a long time due to the large number of keeparound refs in these repositories. Git spends so long processing the request that various timeouts along the call path expire before Git completes it. We've also noticed that subsequent requests to fetch changes take significantly less time to process. Reintroducing the Redownload behaviour will address the initial clone/fetch problems, because the first transfer would go through the tar archive rather than through Git.

Proposal

Reintroduce the Redownload functionality. The functionality kicks in after 3 consecutive failed attempts to clone/mirror a repository.

Each time the functionality kicks in, an easily identifiable log message should be emitted so it's possible to detect this behaviour for troubleshooting purposes.
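
To make the proposed behaviour concrete, here is a minimal sketch of the fallback decision. It is not Geo's actual implementation: the function `sync_repository`, the `fetch` and `redownload` callables, the logger name, and the log message format are all hypothetical. Only the threshold of 3 consecutive failures and the requirement for an identifiable log message come from the proposal above.

```python
import logging

# Hypothetical logger name; the real Geo logger is not specified in this issue.
logger = logging.getLogger("geo.repository_sync")

# From the proposal: fall back after 3 consecutive failed attempts.
REDOWNLOAD_AFTER_FAILURES = 3


def sync_repository(repo_id, consecutive_failures, fetch, redownload):
    """Run one sync attempt and return the updated consecutive-failure count.

    `fetch` and `redownload` are hypothetical callables that take a
    repository id and raise an exception when the transfer fails.
    """
    use_redownload = consecutive_failures >= REDOWNLOAD_AFTER_FAILURES
    if use_redownload:
        # A fixed, greppable prefix makes the fallback easy to find in logs.
        logger.warning(
            "Geo redownload fallback triggered: repo_id=%s consecutive_failures=%d",
            repo_id,
            consecutive_failures,
        )

    strategy = redownload if use_redownload else fetch
    try:
        strategy(repo_id)
    except Exception:
        logger.exception("Geo repository sync failed: repo_id=%s", repo_id)
        return consecutive_failures + 1

    # Any successful sync resets the counter.
    return 0
```

In this sketch, the caller stores the returned count between attempts, so the fourth attempt after three failures is the first one to use the redownload path.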

A redownload button should also be added to the project repositories replica view page next to the resync and reverify buttons for individual projects.

Intended users

Sidney (Systems Administrator)

Feature Usage Metrics

TBD

Does this feature require an audit event?

TBD