gprd has many artifact files on disk that are not present in azure

I noticed this while working on #308.

Geo was tracking more than 6 million rows in ci_job_artifacts that had been migrated to object storage on the primary. We expected https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/4689 to remove these files from disk, but because we had to manually clear the job_artifact_registry table, that cleanup won't happen.

shared/artifacts is currently 18 TiB in size on gprd. Almost none of it is still referenced; nearly all of the artifacts are in object storage instead.

We should remove every file in shared/artifacts and clear the job_artifact_registry table again. This should happen before we do a planned failover of GitLab.com to gprd, so we can resync the few artifacts we do care about.
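For the file removal step, a rough sketch of what a dry-run-first cleanup could look like is below. It is only an illustration: the helper names `summarize_tree` and `remove_files` are made up for this example, the root path must be supplied by whoever runs it, and it does not touch the job_artifact_registry table. The idea is to report how many files and bytes are on disk, list what would be deleted, and only delete when `dry_run=False` is passed explicitly.

```python
import os
import tempfile


def summarize_tree(root):
    """Return (file_count, total_bytes) for regular files under root."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            file_count += 1
            total_bytes += os.path.getsize(path)
    return file_count, total_bytes


def remove_files(root, dry_run=True):
    """Collect every non-directory entry under root and return the paths.

    Nothing is deleted unless dry_run is False. Directories are left in place.
    """
    affected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            affected.append(path)
            if not dry_run:
                os.remove(path)
    return affected


if __name__ == "__main__":
    # Demonstration against a throwaway directory, not a real artifacts path.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "aa", "bb"))
        with open(os.path.join(root, "aa", "bb", "artifact.zip"), "wb") as f:
            f.write(b"example")
        print("before:", summarize_tree(root))
        print("would remove:", remove_files(root, dry_run=True))
        remove_files(root, dry_run=False)
        print("after:", summarize_tree(root))
```

Running the summary before and after gives a quick check that the disk usage actually dropped, and the dry run leaves a list of paths that can be reviewed before anything is deleted.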

/cc @ayufan @andrewn @jramsay

Edited Apr 16, 2018 by Nick Thomas