Geo: Fix package file backfill with sync object storage disabled
What does this MR do?
When "Sync object storage" is disabled (which is the default):
- Preexisting package files will no longer be backfilled if they are object stored
- Package files (and especially their registry records) will be automatically removed if they are object stored (the same worker that backfills does this: RegistryConsistencyWorker)
E.g. on staging, the Package Files failures would go away.
Also adds an index on the file_store column to improve these queries when "Sync object storage" is disabled.
Part of #224634 (closed)
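The behavior described above can be illustrated with a small, framework-free sketch. It is not the code in this MR: `PackageFile`, `replicables`, and `registry_changes` are hypothetical names, and the value 2 for object storage is an assumption (only `file_store = 1` appears in the queries in this description).

```ruby
# Hypothetical sketch only; not the actual Geo implementation.
LOCAL_STORE  = 1 # matches "file_store" = 1 in the queries in this description
REMOTE_STORE = 2 # assumed value for object stored files

PackageFile = Struct.new(:id, :file_store)

# Files a secondary should hold. With "Sync object storage" disabled,
# only locally stored files qualify.
def replicables(package_files, sync_object_storage:)
  return package_files if sync_object_storage

  package_files.select { |file| file.file_store == LOCAL_STORE }
end

# Compares wanted files with existing registry records, roughly as a
# consistency check might: missing records are created (backfill) and
# records for files that no longer qualify are removed.
def registry_changes(package_files, registry_ids, sync_object_storage:)
  wanted = replicables(package_files, sync_object_storage: sync_object_storage).map(&:id)

  { create: wanted - registry_ids, remove: registry_ids - wanted }
end

files = [
  PackageFile.new(1, LOCAL_STORE),
  PackageFile.new(2, REMOTE_STORE),
  PackageFile.new(3, LOCAL_STORE)
]

# Registry records already exist for files 1 and 2.
changes = registry_changes(files, [1, 2], sync_object_storage: false)
# changes => { create: [3], remove: [2] }
```

In this sketch, file 2 is object stored, so its registry record is dropped and it is never backfilled, while the local file 3 is still picked up.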
Database queries
With sync object storage disabled:
SELECT "packages_package_files".* FROM "packages_package_files" WHERE "packages_package_files"."file_store" = 1
With selective sync by namespace and sync object storage disabled:
WITH "restricted_packages" AS (
  SELECT "packages_packages"."id"
  FROM "packages_packages"
  WHERE "packages_packages"."project_id" IN (
    SELECT "projects"."id"
    FROM "projects"
    WHERE "projects"."namespace_id" IN (
      WITH RECURSIVE "base_and_descendants" AS (
        (
          SELECT "geo_node_namespace_links"."namespace_id" AS id
          FROM "geo_node_namespace_links"
          WHERE "geo_node_namespace_links"."geo_node_id" = 100109
        )
        UNION
        (
          SELECT "namespaces"."id"
          FROM "namespaces", "base_and_descendants"
          WHERE "namespaces"."parent_id" = "base_and_descendants"."id"
        )
      )
      SELECT "id"
      FROM "base_and_descendants" AS "namespaces"
    )
  )
)
SELECT "packages_package_files".*
FROM "restricted_packages"
INNER JOIN "packages_package_files"
  ON "restricted_packages"."id" = "packages_package_files"."package_id"
WHERE "packages_package_files"."file_store" = 1
Plan with index added: https://explain.depesz.com/s/Gvv5
Plan on namespaces with a lot of packages (they both have at least one project with 15k package files or more): https://explain.depesz.com/s/zAX8 (index not added in this explain)
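To make the selective sync query easier to follow, here is a rough in-memory equivalent in plain Ruby. It is only a sketch: the hashes stand in for the tables, and `base_and_descendants` and `restricted_local_file_ids` are hypothetical helpers named after the CTEs, not methods from the codebase.

```ruby
require "set"

# Mirrors the recursive CTE: start from the namespaces linked to the Geo
# node, then keep adding child namespaces until no new ones are found.
# `namespaces` maps namespace id => parent_id (nil for top level).
def base_and_descendants(linked_ids, namespaces)
  found = Set.new(linked_ids)
  frontier = linked_ids

  until frontier.empty?
    frontier = namespaces.select do |id, parent_id|
      frontier.include?(parent_id) && !found.include?(id)
    end.keys
    found.merge(frontier)
  end

  found
end

# Mirrors the outer query: packages in projects under those namespaces,
# joined to their package files, keeping only locally stored files.
def restricted_local_file_ids(linked_ids, namespaces:, projects:, packages:, package_files:)
  namespace_ids = base_and_descendants(linked_ids, namespaces)
  project_ids = projects.select { |_id, namespace_id| namespace_ids.include?(namespace_id) }.keys
  package_ids = packages.select { |_id, project_id| project_ids.include?(project_id) }.keys

  package_files
    .select { |file| package_ids.include?(file[:package_id]) && file[:file_store] == 1 } # 1 = local, as in the SQL
    .map { |file| file[:id] }
end

# Made-up sample data for illustration.
namespaces = { 1 => nil, 2 => 1, 3 => 2, 4 => nil } # namespace id => parent_id
projects   = { 10 => 3, 11 => 4 }                   # project id => namespace_id
packages   = { 100 => 10, 101 => 11 }               # package id => project_id
package_files = [
  { id: 1000, package_id: 100, file_store: 1 },
  { id: 1001, package_id: 100, file_store: 2 },
  { id: 1002, package_id: 101, file_store: 1 }
]

ids = restricted_local_file_ids([1], namespaces: namespaces, projects: projects,
                                     packages: packages, package_files: package_files)
# ids => [1000]
```

Only file 1000 is returned: file 1001 is object stored, and file 1002 belongs to a namespace outside the selected tree.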
Migration
Up:
➜ gitlab git:(mk/backfill-local-only-by-default) ✗ bin/rake db:migrate
== 20200715202659 AddIndexOnPackageFilesFileStore: migrating ==================
-- transaction_open?()
-> 0.0000s
-- index_exists?(:packages_package_files, :file_store, {:algorithm=>:concurrently})
-> 0.0027s
-- add_index(:packages_package_files, :file_store, {:algorithm=>:concurrently})
-> 0.0063s
== 20200715202659 AddIndexOnPackageFilesFileStore: migrated (0.0094s) =========
Down:
➜ gitlab git:(mk/backfill-local-only-by-default) bin/rake db:rollback
== 20200715202659 AddIndexOnPackageFilesFileStore: reverting ==================
-- transaction_open?()
-> 0.0000s
-- index_exists?(:packages_package_files, :file_store, {:algorithm=>:concurrently})
-> 0.0032s
-- remove_index(:packages_package_files, {:algorithm=>:concurrently, :column=>:file_store})
-> 0.0030s
== 20200715202659 AddIndexOnPackageFilesFileStore: reverted (0.0065s) =========
Screenshots
Before
After
The Package files progress bar would instead say "Nothing to sync", and the Total in the popover would be 0.
Does this MR meet the acceptance criteria?
Conformity
- Changelog entry
- Documentation (if required)
- Code review guidelines
- Merge request performance guidelines
- Style guides
- Database guides
- Separation of EE specific content
Availability and Testing
- Review and add/update tests for this feature/bug. Consider all test levels. See the Test Planning Process.
Edited by Michael Kozono