Replace the repository download cache safety check
Next steps based on #1255 (comment 84326152):
- Document in gitlab.yml and the gitlab.rb template that the download cache directory is periodically emptied by GitLab. Administrators should be careful when changing the default, because GitLab can end up deleting files they do not want deleted.
- Tweak the find command in RepositoryArchiveCacheWorker to exclude *.git directories
- Remove the initializer check that tries to compare repository storage paths with the download cache directory
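As a rough illustration of the second step, here is a minimal Ruby sketch of a cleanup scan that never descends into *.git directories. It is not the actual RepositoryArchiveCacheWorker code: the helper name `expired_cache_files` and its parameters are hypothetical, and it uses Ruby's standard `Find` module rather than shelling out to find.

```ruby
require 'find'

# Hypothetical helper, not GitLab's worker code.
# Returns regular files under cache_dir whose mtime is older than
# max_age_seconds, skipping any directory whose name ends in ".git".
def expired_cache_files(cache_dir, max_age_seconds, now: Time.now)
  expired = []
  Find.find(cache_dir) do |path|
    stat = File.lstat(path)
    if stat.directory?
      # Do not descend into anything that looks like a bare repository.
      Find.prune if File.basename(path).end_with?('.git')
      next
    end
    # Only plain files are candidates; symlinks are left alone.
    expired << path if stat.file? && now - stat.mtime > max_age_seconds
  end
  expired
end
```

Pruning on the directory name is only a safety net: it protects repositories that follow the *.git naming convention, and nothing else that happens to live under a misconfigured cache directory.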
Original issue text:
In config/initializers/1_settings.rb we have a validation that checks that repository_downloads_path does not contain any of the repository storage paths. We have had this check since https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/5285. It was added because all files under repository_downloads_path get deleted after some time, while repository storage paths contain data that should not be deleted at all (namely Git repositories stored in GitLab).
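For readers unfamiliar with the check, the idea can be sketched roughly as follows. This is not the code from 1_settings.rb; the method names `path_inside?` and `unsafe_downloads_path?` are hypothetical, and the sketch compares expanded paths only, without resolving symlinks.

```ruby
require 'pathname'

# Hypothetical helpers, not the actual initializer code.
# True when child is the same directory as parent or lies somewhere below it.
def path_inside?(child, parent)
  child = Pathname.new(child).expand_path
  parent = Pathname.new(parent).expand_path
  child.ascend { |ancestor| return true if ancestor == parent }
  false
end

# True when any repository storage path would be swept up by cache cleanup.
def unsafe_downloads_path?(downloads_path, storage_paths)
  storage_paths.any? { |storage| path_inside?(storage, downloads_path) }
end
```

A check like this only works when the process running it knows every storage path.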
The defaults of these two settings do not overlap. But somebody on a self-hosted installation made the unfortunate mistake of configuring repository_downloads_path so that it overlapped with repository storage, which led to deleted repositories on their GitLab instance (data loss).
Once the Gitaly migration project is done, this check becomes impossible. The check happens in gitlab-ce, but gitlab-ce will not 'know' what the repository storage paths used by Gitaly are. We have to remove it one way or another. In this issue I would like to discuss what comes in its place, if anything.
cc @grzesiek @dbalexandre (who were involved in creating the check)