What does this MR do?

When "Sync object storage" is disabled (which is the default):

  • Preexisting package files in object storage will no longer be backfilled
  • Package files in object storage (and especially their registry records) will be removed automatically, by the same worker that does the backfill: RegistryConsistencyWorker

E.g. on staging, the Package Files failures would go away.
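
The two behaviors above can be illustrated with a small in-memory sketch. This is not the actual GitLab implementation: `PackageFile`, `replicable_files`, and `registry_changes` are hypothetical names, and the value 2 for object storage is an assumption (the queries in this description only confirm that local files have `file_store` = 1).

```ruby
# Simplified, in-memory sketch; not GitLab's actual implementation.
LOCAL_STORE  = 1 # matches "file_store" = 1 in the queries in this description
REMOTE_STORE = 2 # assumed value for files kept in object storage

# Hypothetical stand-in for a row of packages_package_files.
PackageFile = Struct.new(:id, :file_store)

# Package files the secondary should replicate.
def replicable_files(files, sync_object_storage:)
  return files if sync_object_storage

  files.select { |file| file.file_store == LOCAL_STORE }
end

# Returns [ids needing a new registry record, registry ids to remove].
def registry_changes(files, registry_ids, sync_object_storage:)
  wanted_ids = replicable_files(files, sync_object_storage: sync_object_storage).map(&:id)

  [wanted_ids - registry_ids, registry_ids - wanted_ids]
end

files = [
  PackageFile.new(1, LOCAL_STORE),
  PackageFile.new(2, REMOTE_STORE),
  PackageFile.new(3, REMOTE_STORE)
]

# File 2 already has a registry record from before this change; files 1 and 3 have none.
to_create, to_remove = registry_changes(files, [2], sync_object_storage: false)
puts "create: #{to_create.inspect}, remove: #{to_remove.inspect}"
```

With syncing disabled, only the local file 1 is backfilled, and the existing registry record for the object-stored file 2 is marked for removal. File 3 is never picked up.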

Also adds an index on the file_store column of packages_package_files to speed up these queries when "Sync object storage" is disabled.

Part of #224634 (closed)

Database queries

With sync object storage disabled:

SELECT "packages_package_files".* FROM "packages_package_files" WHERE "packages_package_files"."file_store" = 1

With selective sync by namespace and sync object storage disabled:

WITH "restricted_packages" 
AS (
  SELECT "packages_packages"."id" 
  FROM "packages_packages" 
  WHERE "packages_packages"."project_id" 
  IN (
    SELECT "projects"."id" 
    FROM "projects" 
    WHERE "projects"."namespace_id" 
    IN (
      WITH RECURSIVE "base_and_descendants" 
      AS ((
          SELECT "geo_node_namespace_links"."namespace_id" 
          AS id 
          FROM "geo_node_namespace_links" 
          WHERE "geo_node_namespace_links"."geo_node_id" = 100109)
          UNION (
          SELECT "namespaces"."id" 
          FROM "namespaces", "base_and_descendants" 
          WHERE "namespaces"."parent_id" = "base_and_descendants"."id")) 
      SELECT "id" 
      FROM "base_and_descendants" 
      AS "namespaces"))) 
SELECT "packages_package_files".* 
FROM "restricted_packages" 
INNER JOIN "packages_package_files" 
ON "restricted_packages"."id" = "packages_package_files"."package_id" 
WHERE "packages_package_files"."file_store" = 1

Plan with index added:

Plan on namespaces with a lot of packages (both have at least one project with 15k or more package files); the index had not yet been added for this explain:



➜  gitlab git:(mk/backfill-local-only-by-default) ✗ bin/rake db:migrate 
== 20200715202659 AddIndexOnPackageFilesFileStore: migrating ==================
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:packages_package_files, :file_store, {:algorithm=>:concurrently})
   -> 0.0027s
-- add_index(:packages_package_files, :file_store, {:algorithm=>:concurrently})
   -> 0.0063s
== 20200715202659 AddIndexOnPackageFilesFileStore: migrated (0.0094s) =========


➜  gitlab git:(mk/backfill-local-only-by-default) bin/rake db:rollback  
== 20200715202659 AddIndexOnPackageFilesFileStore: reverting ==================
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:packages_package_files, :file_store, {:algorithm=>:concurrently})
   -> 0.0032s
-- remove_index(:packages_package_files, {:algorithm=>:concurrently, :column=>:file_store})
   -> 0.0030s
== 20200715202659 AddIndexOnPackageFilesFileStore: reverted (0.0065s) =========





The Package files progress bar would instead say "Nothing to sync", and the Total in the popover would be 0.

