Geo: Fix package file backfill with sync object storage disabled
What does this MR do?
When "Sync object storage" is disabled (which is the default):
- Preexisting package files are no longer backfilled if they are in object storage
- Package files in object storage (and especially their registry records) are automatically removed by the same worker that performs the backfill: RegistryConsistencyWorker
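The behavior described above can be sketched in plain Ruby. This is a simplified, hypothetical model, not the actual RegistryConsistencyWorker code: the PackageFile struct, the replicables and consistency_pass helpers, and the REMOTE_STORE value are assumptions made for illustration. Only the file_store = 1 filter is taken from the queries in this description.

```ruby
# Hypothetical, simplified model of the consistency check (not GitLab's classes).
LOCAL_STORE  = 1 # matches the "file_store" = 1 filter used in the queries in this MR
REMOTE_STORE = 2 # assumed value for object storage, for illustration only

PackageFile = Struct.new(:id, :file_store)

# Returns the package files that are eligible for replication.
# With "Sync object storage" disabled, only locally stored files qualify.
def replicables(package_files, sync_object_storage:)
  return package_files if sync_object_storage

  package_files.select { |file| file.file_store == LOCAL_STORE }
end

# One pass of the check: registry records to create (backfill)
# and registry records to remove because the file is no longer eligible.
def consistency_pass(package_files, registry_ids, sync_object_storage:)
  eligible_ids = replicables(package_files, sync_object_storage: sync_object_storage).map(&:id)

  {
    create: eligible_ids - registry_ids,
    remove: registry_ids - eligible_ids
  }
end

files = [
  PackageFile.new(1, LOCAL_STORE),
  PackageFile.new(2, REMOTE_STORE),
  PackageFile.new(3, LOCAL_STORE)
]

# Registry records already exist for files 1 and 2.
result = consistency_pass(files, [1, 2], sync_object_storage: false)
puts result.inspect
```

In this sketch, file 3 is backfilled and the registry record for the object-stored file 2 is removed, which is the effect this MR aims for when "Sync object storage" is disabled.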
E.g. on staging, the Package Files failures would go away.
Also adds an index on the file_store column to speed up these queries when "Sync object storage" is disabled.
Part of #224634 (closed)
Database queries
with sync object storage disabled
SELECT "packages_package_files".* FROM "packages_package_files" WHERE "packages_package_files"."file_store" = 1
with selective sync by namespace and sync object storage disabled
WITH "restricted_packages" AS (
  SELECT "packages_packages"."id"
  FROM "packages_packages"
  WHERE "packages_packages"."project_id" IN (
    SELECT "projects"."id"
    FROM "projects"
    WHERE "projects"."namespace_id" IN (
      WITH RECURSIVE "base_and_descendants" AS (
        (
          SELECT "geo_node_namespace_links"."namespace_id" AS id
          FROM "geo_node_namespace_links"
          WHERE "geo_node_namespace_links"."geo_node_id" = 100109
        )
        UNION
        (
          SELECT "namespaces"."id"
          FROM "namespaces", "base_and_descendants"
          WHERE "namespaces"."parent_id" = "base_and_descendants"."id"
        )
      )
      SELECT "id"
      FROM "base_and_descendants" AS "namespaces"
    )
  )
)
SELECT "packages_package_files".*
FROM "restricted_packages"
INNER JOIN "packages_package_files"
  ON "restricted_packages"."id" = "packages_package_files"."package_id"
WHERE "packages_package_files"."file_store" = 1
Plan with index added: https://explain.depesz.com/s/Gvv5
Plan on namespaces with a lot of packages (they both have at least one project with 15k package files or more): https://explain.depesz.com/s/zAX8 (index not added in this explain)
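To make the selective-sync query easier to follow, here is a rough in-memory equivalent in Ruby. It is only an illustration: the hashes standing in for tables and the helper names base_and_descendants and restricted_local_package_files are hypothetical, and the real filtering happens in SQL as shown above.

```ruby
require 'set'

# Mirrors the recursive CTE: start from the namespaces linked to the Geo node,
# then repeatedly add namespaces whose parent is already in the set.
# parent_of maps namespace id => parent namespace id (nil for top level).
def base_and_descendants(selected_namespace_ids, parent_of)
  result = Set.new(selected_namespace_ids)
  frontier = selected_namespace_ids.dup

  until frontier.empty?
    children = parent_of.select do |id, parent_id|
      frontier.include?(parent_id) && !result.include?(id)
    end.keys
    result.merge(children)
    frontier = children
  end

  result
end

# Mirrors the outer query: packages in projects of the selected namespaces,
# joined to their package files, keeping only file_store = 1.
def restricted_local_package_files(selected_namespace_ids, parent_of:, projects:, packages:, package_files:)
  namespace_ids = base_and_descendants(selected_namespace_ids, parent_of)
  project_ids = projects.select { |p| namespace_ids.include?(p[:namespace_id]) }.map { |p| p[:id] }
  package_ids = packages.select { |p| project_ids.include?(p[:project_id]) }.map { |p| p[:id] }

  package_files.select do |file|
    package_ids.include?(file[:package_id]) && file[:file_store] == 1
  end
end

# Namespace 1 is selected; 2 and 3 are nested under it; 4 is unrelated.
parent_of = { 1 => nil, 2 => 1, 3 => 2, 4 => nil }
projects = [{ id: 10, namespace_id: 3 }, { id: 11, namespace_id: 4 }]
packages = [{ id: 100, project_id: 10 }, { id: 101, project_id: 11 }]
package_files = [
  { id: 1000, package_id: 100, file_store: 1 },
  { id: 1001, package_id: 100, file_store: 2 },
  { id: 1002, package_id: 101, file_store: 1 }
]

selected = restricted_local_package_files(
  [1], parent_of: parent_of, projects: projects, packages: packages, package_files: package_files
)
puts selected.map { |file| file[:id] }.inspect
```

Only package file 1000 is selected here: 1001 has a different file_store value, and 1002 belongs to a project outside the selected namespace tree.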
Migration
Up:
➜ gitlab git:(mk/backfill-local-only-by-default) ✗ bin/rake db:migrate
== 20200715202659 AddIndexOnPackageFilesFileStore: migrating ==================
-- transaction_open?()
-> 0.0000s
-- index_exists?(:packages_package_files, :file_store, {:algorithm=>:concurrently})
-> 0.0027s
-- add_index(:packages_package_files, :file_store, {:algorithm=>:concurrently})
-> 0.0063s
== 20200715202659 AddIndexOnPackageFilesFileStore: migrated (0.0094s) =========
Down:
➜ gitlab git:(mk/backfill-local-only-by-default) bin/rake db:rollback
== 20200715202659 AddIndexOnPackageFilesFileStore: reverting ==================
-- transaction_open?()
-> 0.0000s
-- index_exists?(:packages_package_files, :file_store, {:algorithm=>:concurrently})
-> 0.0032s
-- remove_index(:packages_package_files, {:algorithm=>:concurrently, :column=>:file_store})
-> 0.0030s
== 20200715202659 AddIndexOnPackageFilesFileStore: reverted (0.0065s) =========
Screenshots
Before
After
The Package files progress bar would instead say "Nothing to sync", and the Total in the popover would be 0.
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
Edited by Michael Kozono
