
Hashed Storage attachments migration: exclude files in object storage as they are all hashed already

What does this MR do?

  • Adds test coverage for the existing migration behavior.
  • Adds one optimization: the migration is skipped when the project has already been migrated. This was already the behavior, but the check now happens much earlier in the process.
  • Excludes files in Object Storage from our queries. Those files are already hashed, so nothing needs to be done for them.
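
The early exit described above can be illustrated with a minimal plain-Ruby sketch. It is not the code from this MR: `ProjectRecord`, `attachments_already_hashed?` and `migrate_attachments` are hypothetical names, and the threshold of 2 is taken from the `storage_version >= 2` condition in the queries below. Setting the version to 2 after a successful run is likewise an assumption.

```ruby
# Hypothetical stand-in for a project row; only storage_version matters here.
ProjectRecord = Struct.new(:id, :storage_version, keyword_init: true)

# Threshold borrowed from the queries in this MR (storage_version >= 2).
ATTACHMENTS_HASHED_VERSION = 2

# True when the project's attachments are already on hashed storage.
# A NULL (nil) storage_version counts as legacy.
def attachments_already_hashed?(project)
  !project.storage_version.nil? &&
    project.storage_version >= ATTACHMENTS_HASHED_VERSION
end

# Returns :skipped before doing any work when the project is already migrated.
# Otherwise yields to the (hypothetical) migration step, then records the
# new version and returns :migrated.
def migrate_attachments(project)
  return :skipped if attachments_already_hashed?(project)

  yield project if block_given?
  project.storage_version = ATTACHMENTS_HASHED_VERSION
  :migrated
end
```

Placing the guard at the top means an already migrated project never reaches the block that would move files.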

Database queries

puts Gitlab::HashedStorage::RakeHelper.hashed_attachments_relation.to_sql

SELECT "uploads".*
FROM "uploads"
         INNER JOIN "projects"
                    ON "uploads"."model_type" = 'Project' AND "uploads"."model_id" = "projects"."id" AND
                       "uploads"."store" = 1
WHERE "projects"."storage_version" >= 2;

Plan: https://explain.depesz.com/s/n3Zb

puts Gitlab::HashedStorage::RakeHelper.legacy_attachments_relation.to_sql

SELECT "uploads".*
FROM "uploads"
         JOIN projects
              ON (uploads.model_type = 'Project' AND uploads.model_id = projects.id AND uploads.store = 1)
WHERE (projects.storage_version < 2 OR projects.storage_version IS NULL)

Plan: https://explain.depesz.com/s/7oR0
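
Taken together, the two relations split project uploads with `store = 1` by the project's `storage_version`. The sketch below mirrors the same predicates over in-memory records in plain Ruby, purely as an illustration. The `Upload` and `Project` structs and the helper names are hypothetical, and reading `store = 1` as "kept on local disk, not in Object Storage" is an assumption based on the MR description.

```ruby
# Hypothetical in-memory stand-ins for rows of "uploads" and "projects".
Upload  = Struct.new(:id, :model_type, :model_id, :store, keyword_init: true)
Project = Struct.new(:id, :storage_version, keyword_init: true)

LOCAL_STORE = 1                # assumption: store = 1 means local disk
HASHED_ATTACHMENTS_VERSION = 2 # threshold used by both queries

# Mirrors the INNER JOIN: locally stored uploads that belong to an
# existing project, returned as [upload, project] pairs.
def local_project_uploads(uploads, projects)
  by_id = projects.to_h { |p| [p.id, p] }
  uploads.filter_map do |u|
    next unless u.model_type == 'Project' && u.store == LOCAL_STORE

    project = by_id[u.model_id]
    [u, project] if project
  end
end

# WHERE projects.storage_version >= 2
def hashed_attachments(uploads, projects)
  local_project_uploads(uploads, projects)
    .select { |_, p| !p.storage_version.nil? && p.storage_version >= HASHED_ATTACHMENTS_VERSION }
    .map(&:first)
end

# WHERE projects.storage_version < 2 OR projects.storage_version IS NULL
def legacy_attachments(uploads, projects)
  local_project_uploads(uploads, projects)
    .select { |_, p| p.storage_version.nil? || p.storage_version < HASHED_ATTACHMENTS_VERSION }
    .map(&:first)
end
```

Every upload that passes the join lands in exactly one of the two sets. Uploads with any other `store` value, or attached to a model other than `Project`, appear in neither.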

Conformity

Related to #35789 (closed)

Edited by Mayra Cabrera
