Hashed Storage attachments migration: exclude files in object storage as they are all hashed
The current Hashed Storage migration script considers all uploads associated with a legacy project. It should instead consider only local uploads: the ones in object storage are always hashed and therefore don't need to be migrated.
Here is an example of the output:
{
"severity": "INFO",
"time": "2019-11-06T17:21:37.257Z",
"class": "HashedStorage::ProjectMigrateWorker",
"retry": 3,
"queue": "hashed_storage:hashed_storage_project_migrate",
"queue_namespace": "hashed_storage",
"jid": "f44445bc1610fc617d7d1ca8",
"created_at": "2019-11-06T17:19:46.492987Z",
"correlation_id": "d3883b1a-76bd-461d-b468-781f0bee9b68",
"enqueued_at": "2019-11-06T17:21:37.255980Z",
"error_message": "No such file or directory @ rb_file_s_rename - (/opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/4c/15/4c15f47afe7f817fd559e12ddbc276f4930c5822f2049088d6f6605bec7cea56, /opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/4c/15/4c15f47afe7f817fd559e12ddbc276f4930c5822f2049088d6f6605bec7cea56)",
"error_class": "Errno::ENOENT",
"failed_at": 1573060786.5137305,
"retry_count": 1,
"retried_at": 1573060857.646291,
"pid": 2552,
"message": "HashedStorage::ProjectMigrateWorker JID-f44445bc1610fc617d7d1ca8: start",
"job_status": "start",
"scheduling_latency_s": 0.0013
}
{
"severity": "INFO",
"time": "2019-11-06T17:21:37.263Z",
"message": "Skipped attachments move from '/opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/4c/15/4c15f47afe7f817fd559e12ddbc276f4930c5822f2049088d6f6605bec7cea56' to '/opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/4c/15/4c15f47afe7f817fd559e12ddbc276f4930c5822f2049088d6f6605bec7cea56', source path doesn't exist or is not a directory (PROJECT_ID=297)"
}
The fix is simple: we need to consider the correct store on the uploads model when filtering.
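As a rough illustration of the filtering idea, here is a minimal plain-Ruby sketch. It is not the actual change: `Store`, `Upload`, and `uploads_needing_migration` are hypothetical stand-ins, and the numeric store values are an assumption about how local and remote stores are distinguished. The paths are made up.

```ruby
# Hypothetical stand-in for the store identifiers on the uploads model.
# The numeric values are an assumption made for this sketch.
module Store
  LOCAL = 1   # file lives on the local filesystem
  REMOTE = 2  # file lives in object storage (always hashed)
end

# Hypothetical, simplified upload record: just a path and a store.
Upload = Struct.new(:path, :store, keyword_init: true)

# Keep only uploads stored locally; uploads in object storage are
# already hashed, so there is nothing to move for them.
def uploads_needing_migration(uploads)
  uploads.select { |upload| upload.store == Store::LOCAL }
end

uploads = [
  Upload.new(path: "uploads/some-group/some-project/abc123/screenshot.png", store: Store::LOCAL),
  Upload.new(path: "@hashed/4c/15/example/def456/report.pdf", store: Store::REMOTE)
]

candidates = uploads_needing_migration(uploads)
puts candidates.map(&:path)
```

In a Rails model the same idea would more likely be expressed as a query condition on the store column than as in-memory filtering, so that remote uploads are never loaded in the first place.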
Manual test / validation:
- Enable uploads to Object Storage
- Disable Hashed Storage
- Restart GitLab
- Run gitlab-rake gitlab:storage:hashed_attachments and save the count to be compared later
- Create a new project
- Create a new issue
- Upload something to that new issue
- Run gitlab-rake gitlab:storage:legacy_attachments (should return 0)
- Run gitlab-rake gitlab:storage:list_legacy_attachments (should be empty)
- Run gitlab-rake gitlab:storage:list_hashed_attachments (should include the recently uploaded file)
- Run gitlab-rake gitlab:storage:hashed_attachments (count should include the file, compare with previous run)
- Run gitlab-rake gitlab:storage:migrate_to_hashed (log should not include failure)
- Check in rails console if Project.storage_version == 2
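
The count comparisons in the checklist above can be written down as a small check. This is only a sketch: `attachment_count_failures` is a hypothetical helper, not part of GitLab, and the counts have to be read manually from the rake task output. The numbers in the example are made up.

```ruby
# Hypothetical helper: encodes the expectations from the manual checklist
# and returns a list of human-readable failures (empty when all pass).
def attachment_count_failures(hashed_before:, hashed_after:, legacy_after:)
  failures = []

  # gitlab:storage:legacy_attachments should return 0 after the upload.
  unless legacy_after.zero?
    failures << "legacy attachments count should be 0, got #{legacy_after}"
  end

  # gitlab:storage:hashed_attachments should now include the new file.
  unless hashed_after > hashed_before
    failures << "hashed attachments count did not grow (#{hashed_before} -> #{hashed_after})"
  end

  failures
end

# Illustrative numbers only.
failures = attachment_count_failures(hashed_before: 41, hashed_after: 42, legacy_after: 0)
puts failures.empty? ? "all checks passed" : failures
```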
Edited by Gabriel Mazetto