Skip to content

Geo: Automatically verify Packages on secondaries

Michael Kozono requested to merge mk/secondary-side-verification into master

What does this MR do?

A few commits were extracted to !51242 (merged) to make this one more reviewable, so this MR is branched off that one until it's in master.

This MR verifies synced PackageFileRegistry records, which are pending verification or have failed verification, on Geo secondaries, in the background.

"verification_succeeded" on the secondary means that a checksum was calculated and it matches the primary's stored checksum.

Resolves #13981 (closed)

Database migration output

Up:

➜  gitlab git:(mk/secondary-side-verification) ✗ bin/rake geo:db:migrate        
== 20201208025254 AddVerificationFieldsToPackageFileRegistry: migrating =======
-- add_column(:package_file_registry, :verification_state, :integer, {:default=>0, :limit=>2, :null=>false})
   -> 0.0033s
-- add_column(:package_file_registry, :verification_started_at, :datetime_with_timezone)
   -> 0.0007s
== 20201208025254 AddVerificationFieldsToPackageFileRegistry: migrated (0.0041s) 

== 20201208031224 AddVerificationIndexesToPackageFileRegistry: migrating ======
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:package_file_registry, :verified_at, {:where=>"(state = 2 AND verification_state = 0)", :order=>{:verified_at=>"ASC NULLS FIRST"}, :name=>"package_file_registry_pending_verification", :algorithm=>:concurrently})
   -> 0.0024s
-- add_index(:package_file_registry, :verified_at, {:where=>"(state = 2 AND verification_state = 0)", :order=>{:verified_at=>"ASC NULLS FIRST"}, :name=>"package_file_registry_pending_verification", :algorithm=>:concurrently})
   -> 0.0032s
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:package_file_registry, :verification_retry_at, {:where=>"(state = 2 AND verification_state = 3)", :order=>{:verification_retry_at=>"ASC NULLS FIRST"}, :name=>"package_file_registry_failed_verification", :algorithm=>:concurrently})
   -> 0.0020s
-- add_index(:package_file_registry, :verification_retry_at, {:where=>"(state = 2 AND verification_state = 3)", :order=>{:verification_retry_at=>"ASC NULLS FIRST"}, :name=>"package_file_registry_failed_verification", :algorithm=>:concurrently})
   -> 0.0018s
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:package_file_registry, :verification_state, {:where=>"(state = 2 AND (verification_state IN (0, 3)))", :name=>"package_file_registry_needs_verification", :algorithm=>:concurrently})
   -> 0.0024s
-- add_index(:package_file_registry, :verification_state, {:where=>"(state = 2 AND (verification_state IN (0, 3)))", :name=>"package_file_registry_needs_verification", :algorithm=>:concurrently})
   -> 0.0028s
== 20201208031224 AddVerificationIndexesToPackageFileRegistry: migrated (0.0179s) 

Down:

➜  gitlab git:(mk/secondary-side-verification) ✗ bin/rake geo:db:rollback STEP=2                    
== 20201208031224 AddVerificationIndexesToPackageFileRegistry: reverting ======
-- transaction_open?()
   -> 0.0000s
-- indexes(:package_file_registry)
   -> 0.0036s
-- remove_index(:package_file_registry, {:algorithm=>:concurrently, :name=>"package_file_registry_pending_verification"})
   -> 0.0016s
-- transaction_open?()
   -> 0.0000s
-- indexes(:package_file_registry)
   -> 0.0023s
-- remove_index(:package_file_registry, {:algorithm=>:concurrently, :name=>"package_file_registry_failed_verification"})
   -> 0.0007s
-- transaction_open?()
   -> 0.0000s
-- indexes(:package_file_registry)
   -> 0.0017s
-- remove_index(:package_file_registry, {:algorithm=>:concurrently, :name=>"package_file_registry_needs_verification"})
   -> 0.0007s
== 20201208031224 AddVerificationIndexesToPackageFileRegistry: reverted (0.0115s) 

== 20201208025254 AddVerificationFieldsToPackageFileRegistry: reverting =======
-- remove_column(:package_file_registry, :verification_started_at, :datetime_with_timezone)
   -> 0.0010s
-- remove_column(:package_file_registry, :verification_state, :integer, {:default=>0, :limit=>2, :null=>false})
   -> 0.0010s
== 20201208025254 AddVerificationFieldsToPackageFileRegistry: reverted (0.0056s) 

Query plans

Methodology for generating 2.1M rows of test data https://gitlab.com/-/snippets/2057444

PackageFileRegistry.verification_pending_batch(batch_size: 10)

UPDATE 
  package_file_registry 
SET 
  "verification_state" = 1, 
  "verification_started_at" = NOW() 
WHERE 
  package_file_id IN (
    SELECT 
      "package_file_registry"."package_file_id" 
    FROM 
      "package_file_registry" 
    WHERE 
      (
        "package_file_registry"."verification_state" IN (0)
      ) 
      AND (
        "package_file_registry"."state" IN (2)
      ) 
    ORDER BY 
      verified_at ASC NULLS FIRST 
    LIMIT 
      10
  ) RETURNING package_file_id;
                                                                                                            QUERY PLAN                                                                                                             
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on package_file_registry  (cost=5.30..89.57 rows=10 width=1202) (actual time=2.256..2.394 rows=10 loops=1)
   ->  Nested Loop  (cost=5.30..89.57 rows=10 width=1202) (actual time=1.224..1.256 rows=10 loops=1)
         ->  HashAggregate  (cost=4.87..4.97 rows=10 width=32) (actual time=0.758..0.762 rows=10 loops=1)
               Group Key: "ANY_subquery".package_file_id
               ->  Subquery Scan on "ANY_subquery"  (cost=0.42..4.85 rows=10 width=32) (actual time=0.716..0.735 rows=10 loops=1)
                     ->  Limit  (cost=0.42..4.75 rows=10 width=12) (actual time=0.660..0.676 rows=10 loops=1)
                           ->  Index Scan using package_file_registry_pending_verification on package_file_registry package_file_registry_1  (cost=0.42..86600.83 rows=200141 width=12) (actual time=0.659..0.674 rows=10 loops=1)
         ->  Index Scan using index_package_file_registry_on_repository_id on package_file_registry  (cost=0.43..8.45 rows=1 width=1164) (actual time=0.048..0.048 rows=1 loops=10)
               Index Cond: (package_file_id = "ANY_subquery".package_file_id)
 Planning Time: 1.826 ms
 Execution Time: 2.580 ms
(11 rows)

PackageFileRegistry.verification_failed_batch(batch_size: 10)

UPDATE 
  package_file_registry 
SET 
  "verification_state" = 1, 
  "verification_started_at" = NOW() 
WHERE 
  package_file_id IN (
    SELECT 
      "package_file_registry"."package_file_id" 
    FROM 
      "package_file_registry" 
    WHERE 
      (
        "package_file_registry"."verification_state" IN (3)
      ) 
      AND (
        "package_file_registry"."state" IN (2)
      ) 
    ORDER BY 
      verification_retry_at ASC NULLS FIRST 
    LIMIT 
      10
  ) RETURNING package_file_id;
                                                                                                            QUERY PLAN                                                                                                            
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Update on package_file_registry  (cost=5.37..89.64 rows=10 width=1202) (actual time=1.500..4.135 rows=10 loops=1)
   ->  Nested Loop  (cost=5.37..89.64 rows=10 width=1202) (actual time=0.624..0.829 rows=10 loops=1)
         ->  HashAggregate  (cost=4.94..5.04 rows=10 width=32) (actual time=0.195..0.206 rows=10 loops=1)
               Group Key: "ANY_subquery".package_file_id
               ->  Subquery Scan on "ANY_subquery"  (cost=0.42..4.92 rows=10 width=32) (actual time=0.060..0.183 rows=10 loops=1)
                     ->  Limit  (cost=0.42..4.82 rows=10 width=12) (actual time=0.054..0.171 rows=10 loops=1)
                           ->  Index Scan using package_file_registry_failed_verification on package_file_registry package_file_registry_1  (cost=0.42..86539.65 rows=196903 width=12) (actual time=0.053..0.167 rows=10 loops=1)
         ->  Index Scan using index_package_file_registry_on_repository_id on package_file_registry  (cost=0.43..8.45 rows=1 width=1164) (actual time=0.059..0.059 rows=1 loops=10)
               Index Cond: (package_file_id = "ANY_subquery".package_file_id)
 Planning Time: 0.383 ms
 Execution Time: 4.231 ms
(11 rows)

PackageFileRegistry.needs_verification_count(limit: 1000)

SELECT 
  COUNT(*) 
FROM 
  (
    SELECT 
      1 AS one 
    FROM 
      "package_file_registry" 
    WHERE 
      (
        "package_file_registry"."verification_state" IN (0, 3)
      ) 
      AND (
        "package_file_registry"."state" IN (2)
      ) 
    LIMIT 
      1000
  ) subquery_for_count;
                                                              QUERY PLAN                                                               
---------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=142.72..142.73 rows=1 width=8) (actual time=0.565..0.566 rows=1 loops=1)
   ->  Limit  (cost=0.00..130.22 rows=1000 width=4) (actual time=0.088..0.475 rows=1000 loops=1)
         ->  Seq Scan on package_file_registry  (cost=0.00..51704.25 rows=397044 width=4) (actual time=0.087..0.335 rows=1000 loops=1)
               Filter: ((verification_state = ANY ('{0,3}'::integer[])) AND (state = 2))
               Rows Removed by Filter: 446
 Planning Time: 0.414 ms
 Execution Time: 0.610 ms
(7 rows)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Edited by Michael Kozono

Merge request reports