Hashed Storage attachments migration: exclude files in object storage as they are all hashed

The current Hashed Storage migration script considers all uploads associated with a legacy project. It should instead consider only local uploads, because uploads in object storage are always hashed and therefore don't need to be migrated.

Here is an example of the output:

{
  "severity": "INFO",
  "time": "2019-11-06T17:21:37.257Z",
  "class": "HashedStorage::ProjectMigrateWorker",
  "retry": 3,
  "queue": "hashed_storage:hashed_storage_project_migrate",
  "queue_namespace": "hashed_storage",
  "jid": "f44445bc1610fc617d7d1ca8",
  "created_at": "2019-11-06T17:19:46.492987Z",
  "correlation_id": "d3883b1a-76bd-461d-b468-781f0bee9b68",
  "enqueued_at": "2019-11-06T17:21:37.255980Z",
  "error_message": "No such file or directory @ rb_file_s_rename - (/opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/4c/15/4c15f47afe7f817fd559e12ddbc276f4930c5822f2049088d6f6605bec7cea56, /opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/4c/15/4c15f47afe7f817fd559e12ddbc276f4930c5822f2049088d6f6605bec7cea56)",
  "error_class": "Errno::ENOENT",
  "failed_at": 1573060786.5137305,
  "retry_count": 1,
  "retried_at": 1573060857.646291,
  "pid": 2552,
  "message": "HashedStorage::ProjectMigrateWorker JID-f44445bc1610fc617d7d1ca8: start",
  "job_status": "start",
  "scheduling_latency_s": 0.0013
}
{
  "severity": "INFO",
  "time": "2019-11-06T17:21:37.263Z",
  "message": "Skipped attachments move from '/opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/4c/15/4c15f47afe7f817fd559e12ddbc276f4930c5822f2049088d6f6605bec7cea56' to '/opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/4c/15/4c15f47afe7f817fd559e12ddbc276f4930c5822f2049088d6f6605bec7cea56', source path doesn't exist or is not a directory (PROJECT_ID=297)"
}
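
One detail in the log above is easy to miss: the source and target of the failed rename appear to be the same `@hashed` path, which is consistent with the upload already being hashed. A minimal sketch in plain Ruby to confirm this, with the error message copied from the log and joined onto one line (the variable names are illustrative only):

```ruby
# Error message copied from the log entry above.
error_message = 'No such file or directory @ rb_file_s_rename - ' \
  '(/opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/4c/15/4c15f47afe7f817fd559e12ddbc276f4930c5822f2049088d6f6605bec7cea56, ' \
  '/opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/4c/15/4c15f47afe7f817fd559e12ddbc276f4930c5822f2049088d6f6605bec7cea56)'

# Extract the two comma-separated paths between the parentheses.
source, target = error_message[/\((.*)\)\z/, 1].split(', ')

# True when the migration tried to move a directory onto itself.
same_path = (source == target)
puts "source == target: #{same_path}"
```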

The fix is simple: when filtering, we need to take the store on the uploads model into account so that only local uploads are considered.
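
As a rough illustration of the idea, here is a minimal plain-Ruby sketch of filtering uploads by store. The `Store` module, its numeric values, the `Upload` struct, and `uploads_needing_migration` are hypothetical stand-ins for this example, not the actual GitLab classes or the code in this change. In a Rails application the same filter would more likely be expressed as a query condition on the model.

```ruby
# Hypothetical stand-ins for illustration; not GitLab's actual classes.
module Store
  LOCAL  = 1 # assumed value: file lives on the local filesystem
  REMOTE = 2 # assumed value: file lives in object storage
end

Upload = Struct.new(:path, :store)

# Keep only uploads stored locally. Uploads in object storage are
# already hashed, so they are excluded from the migration candidates.
def uploads_needing_migration(uploads)
  uploads.select { |upload| upload.store == Store::LOCAL }
end

uploads = [
  Upload.new("uploads/group/project/avatar.png", Store::LOCAL),
  Upload.new("@hashed/4c/15/example/file.png", Store::REMOTE)
]

candidates = uploads_needing_migration(uploads)
puts candidates.map(&:path)
```

With this filter in place, a project whose uploads all live in object storage yields no migration candidates, so the attachments move is never attempted for it.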


Manual test / validation:

  1. Enable uploads to Object Storage
  2. Disable Hashed Storage
  3. Restart GitLab
  4. Run gitlab-rake gitlab:storage:hashed_attachments and save the count to be compared later
  5. Create a new project
  6. Create a new issue
  7. Upload something to that new issue
  8. Run gitlab-rake gitlab:storage:legacy_attachments (should return 0)
  9. Run gitlab-rake gitlab:storage:list_legacy_attachments (should be empty)
  10. Run gitlab-rake gitlab:storage:list_hashed_attachments (should include the recently uploaded file)
  11. Run gitlab-rake gitlab:storage:hashed_attachments (count should include the file, compare with previous run)
  12. Run gitlab-rake gitlab:storage:migrate_to_hashed (the log should not include any failures)
  13. Check in the Rails console that the new project's storage_version == 2
Edited by Gabriel Mazetto