
Geo - Start to backfill the new verification details tables

Problem

We need to split the verification details into separate tables for the four replicables that store their verification state in the same table as the replicable itself. With these changes, the Geo::VerificationStateBackfillWorker, which iterates over each replicable's table to backfill the corresponding verification state table, will also clean up records when the selective sync scope changes to a new set of organizations.
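
The backfill-plus-cleanup behavior described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the worker's actual implementation: `ReplicableRow`, `plan_backfill`, and the idea that scope is decided by an `organization_id` on each row are all assumptions made for the example, and the real worker operates on database tables.

```ruby
require 'set'

# Hypothetical stand-in for a row of a replicable's table.
# None of these names come from GitLab's codebase.
ReplicableRow = Struct.new(:id, :organization_id)

# Given the rows of the replicable table, the ids already present in the
# (hypothetical) verification state table, and the organizations currently
# selected for sync, return which state records to create and which to delete.
def plan_backfill(replicable_rows, state_ids, synced_organization_ids)
  in_scope_ids = replicable_rows
    .select { |row| synced_organization_ids.include?(row.organization_id) }
    .map(&:id)
    .to_set

  existing = state_ids.to_set

  {
    create: (in_scope_ids - existing).to_a.sort, # in scope, no state record yet
    delete: (existing - in_scope_ids).to_a.sort  # state record left out of scope
  }
end

rows = [
  ReplicableRow.new(1, 10),
  ReplicableRow.new(2, 10),
  ReplicableRow.new(3, 20)
]

# Selective sync now covers only organization 10; state records exist for 1 and 3.
plan = plan_backfill(rows, [1, 3], Set[10])
```

Here the plan creates a state record for row 2 and removes the one for row 3, whose organization is no longer in the sync scope.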

Replicable Models using the same table to store the verification state:

  • Model: Ci::PipelineArtifact
    • Immutable
    • Schema: gitlab_ci
    • Table: ci_pipeline_artifacts
  • Model: Packages::PackageFile
    • Immutable
  • Model: SnippetRepository
    • Mutable
    • Schema: gitlab_main
    • Table: snippet_repositories
  • Model: Terraform::StateVersion
    • Immutable
    • Schema: gitlab_main
    • Table: terraform_state_versions

Proposal

  1. Create the four verification state tables in regular schema migrations - #516947 (closed)

  2. Update the application so that, for the immutable data types (Ci::PipelineArtifact, Packages::PackageFile, Terraform::StateVersion), Geo's usual verification processes backfill the separate tables by reading the checksum from the original column instead of recalculating it. This avoids resource usage (CPU, network, etc.) for immutable data types during the backfill phase.

  3. Let Geo's usual verification processes backfill the separate tables for mutable data types (SnippetRepository). There is only one, and its table is not typically large. This avoids reintroducing bugs we experienced in the past, such as #387980 (closed).
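
As a rough illustration of steps 2 and 3, the sketch below reuses a stored checksum for immutable types and recalculates it for mutable ones. Everything here is hypothetical: `Record`, `IMMUTABLE`, and `checksum_for_backfill` are names invented for the example, and SHA-256 over an in-memory string is only a placeholder for whatever checksum Geo actually computes.

```ruby
require 'digest'

# Hypothetical stand-in for a replicable record. `verification_checksum`
# mimics a checksum already stored in the original table's column.
Record = Struct.new(:id, :content, :verification_checksum)

# Assumption for illustration: mirrors the immutable/mutable split
# described in the proposal.
IMMUTABLE = {
  'Ci::PipelineArtifact' => true,
  'Packages::PackageFile' => true,
  'Terraform::StateVersion' => true,
  'SnippetRepository' => false
}.freeze

# Returns the checksum to write into the separate verification state table,
# along with whether it had to be recalculated.
def checksum_for_backfill(model_name, record)
  if IMMUTABLE.fetch(model_name) && record.verification_checksum
    # Immutable data cannot have changed, so the stored checksum is reused.
    { checksum: record.verification_checksum, recalculated: false }
  else
    # Mutable data (or a missing checksum) goes through a fresh calculation.
    { checksum: Digest::SHA256.hexdigest(record.content), recalculated: true }
  end
end

artifact = Record.new(1, 'artifact bytes', 'abc123')
snippet  = Record.new(2, 'snippet repo state', 'possibly-stale')

artifact_result = checksum_for_backfill('Ci::PipelineArtifact', artifact)
snippet_result  = checksum_for_backfill('SnippetRepository', snippet)
```

In this sketch the artifact keeps its stored checksum without touching its content, while the snippet repository's checksum is recomputed because the stored value may no longer match.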
