Create new Geo verification state table for Terraform::StateVersion

What does this MR do and why?

Geo functionality requires us to track the state of replicated data types - done with separate state tables which include the verification data, like upload_states for our uploads and wiki_repository_states for our wiki_repositories tables for example.

Some models - Ci::PipelineArtifact, Packages::PackageFile, SnippetRepository and Terraform::StateVersion never had the verification attributes split out into separate state tables, so contain verification data in the parent model themselves.

This MR is the first iteration to address this issue by creating the new verification tables and models.

This MR is not intended to change any existing functionality and the tables should remain unused until following MRs where we will backfill these new tables.

Summary of changes -

  • For TerraformStateVersionState we've created the following:-
    • migration to create the new table, including indexes copying existing state table indexes
    • model
    • db/docs/*.yml entry
    • model spec
    • factory

Relates to #520704 (closed)

Changelog: changed

EE: true

Query plan

Original Query Plan

Query 1: verification_pending_batch

Before - terraform_state_versions table
UPDATE terraform_state_versions 
SET verification_state = 1, verification_started_at = NOW() 
WHERE id IN (
    SELECT id FROM terraform_state_versions 
    WHERE verification_state IN (0) 
    ORDER BY verified_at ASC NULLS FIRST 
    LIMIT 10 
    FOR UPDATE SKIP LOCKED
) RETURNING id;

Query Plan

After - Using separate terraform_state_version_states table
UPDATE terraform_state_version_states 
SET verification_state = 1, verification_started_at = NOW() 
WHERE terraform_state_version_id IN (
    SELECT terraform_state_version_id 
    FROM terraform_state_versions 
    INNER JOIN terraform_state_version_states ON terraform_state_version_states.terraform_state_version_id = terraform_state_versions.id
    WHERE terraform_state_version_states.verification_state = 0 
    ORDER BY terraform_state_version_states.verified_at ASC NULLS FIRST 
    LIMIT 10 
    FOR UPDATE SKIP LOCKED
) RETURNING terraform_state_version_id;

Query Plan

Reference - Using upload_states table
UPDATE upload_states 
SET verification_state = 1, verification_started_at = NOW() 
WHERE upload_id IN (
    SELECT upload_id 
    FROM uploads 
    INNER JOIN upload_states ON upload_states.upload_id = uploads.id
    WHERE upload_states.verification_state = 0 
    ORDER BY upload_states.verified_at ASC NULLS FIRST 
    LIMIT 10 
    FOR UPDATE SKIP LOCKED
) RETURNING upload_id;

Query 2: verification_failed_batch

Before - terraform_state_versions table
UPDATE terraform_state_versions 
SET verification_state = 1, verification_started_at = NOW() 
WHERE id IN (
    SELECT id FROM terraform_state_versions 
    WHERE verification_state IN (3)
    AND (verification_retry_at IS NULL OR verification_retry_at < NOW())
    ORDER BY verification_retry_at ASC NULLS FIRST 
    LIMIT 10 FOR UPDATE SKIP LOCKED
) RETURNING id;

Query Plan

After - Using separate terraform_state_version_states table
UPDATE terraform_state_version_states 
SET verification_state = 1, verification_started_at = NOW() 
WHERE terraform_state_version_id IN (
    SELECT terraform_state_version_id 
    FROM terraform_state_versions 
    INNER JOIN terraform_state_version_states ON terraform_state_version_states.terraform_state_version_id = terraform_state_versions.id
    WHERE terraform_state_version_states.verification_state = 3
    AND (terraform_state_version_states.verification_retry_at IS NULL 
         OR terraform_state_version_states.verification_retry_at < NOW())
    ORDER BY terraform_state_version_states.verification_retry_at ASC NULLS FIRST 
    LIMIT 10 
    FOR UPDATE SKIP LOCKED
) RETURNING terraform_state_version_id;

Query Plan

Reference - Using upload_states table
UPDATE upload_states 
SET verification_state = 1, verification_started_at = NOW() 
WHERE upload_id IN (
    SELECT upload_id 
    FROM uploads 
    INNER JOIN upload_states ON upload_states.upload_id = uploads.id 
    WHERE upload_states.verification_state = 3
    AND (upload_states.verification_retry_at IS NULL 
         OR upload_states.verification_retry_at < NOW())
    ORDER BY upload_states.verification_retry_at ASC NULLS FIRST 
    LIMIT 10 
    FOR UPDATE SKIP LOCKED
) RETURNING upload_id;

Query 3: fail_verification_timeouts

Before - terraform_state_versions table
SELECT FROM terraform_state_versions 
WHERE verification_state = 1 
AND verification_started_at < NOW() - INTERVAL '8 hours'
ORDER BY id ASC LIMIT 1000;

Query Plan

After - Using separate terraform_state_version_states table
SELECT terraform_state_versions.* 
FROM terraform_state_versions 
INNER JOIN terraform_state_version_states ON terraform_state_version_states.terraform_state_version_id = terraform_state_versions.id
WHERE terraform_state_version_states.verification_state = 1 
AND terraform_state_version_states.verification_started_at < NOW() - INTERVAL '8 hours'
ORDER BY terraform_state_versions.id ASC 
LIMIT 1000;

Query Plan

Reference - Using upload_states table
SELECT uploads.* 
FROM uploads 
INNER JOIN upload_states ON upload_states.upload_id = uploads.id
WHERE upload_states.verification_state = 1 
AND upload_states.verification_started_at < NOW() - INTERVAL '8 hours'
ORDER BY uploads.id ASC 
LIMIT 1000;

Follow-up Query plan - with additional query and new index after review

Follow-up Query Plan

Query 2: verification_failed_batch - now after running explain ANALYZE terraform_state_version_states

Before - terraform_state_versions table
explain UPDATE terraform_state_versions 
SET verification_state = 1, verification_started_at = NOW() 
WHERE id IN (
    SELECT id FROM terraform_state_versions 
    WHERE verification_state IN (3)
    AND (verification_retry_at IS NULL OR verification_retry_at < NOW())
    ORDER BY verification_retry_at ASC NULLS FIRST 
    LIMIT 10 FOR UPDATE SKIP LOCKED
) RETURNING id;

Query Plan

After - Using separate terraform_state_version_states table
explain UPDATE terraform_state_version_states 
SET verification_state = 1, verification_started_at = NOW() 
WHERE terraform_state_version_id IN (
    SELECT terraform_state_version_id 
    FROM terraform_state_versions 
    INNER JOIN terraform_state_version_states ON terraform_state_version_states.terraform_state_version_id = terraform_state_versions.id
    WHERE terraform_state_version_states.verification_state = 3
    AND (terraform_state_version_states.verification_retry_at IS NULL 
         OR terraform_state_version_states.verification_retry_at < NOW())
    ORDER BY terraform_state_version_states.verification_retry_at ASC NULLS FIRST 
    LIMIT 10 
    FOR UPDATE SKIP LOCKED
) RETURNING terraform_state_version_id;

Query Plan

Query 3: fail_verification_timeouts - now with new index index_terraform_state_version_states_started_verification

Before - terraform_state_versions table
explain SELECT FROM terraform_state_versions 
WHERE verification_state = 1 
AND verification_started_at < NOW() - INTERVAL '8 hours'
ORDER BY id ASC LIMIT 1000;

Query Plan

After - Using separate terraform_state_version_states table
explain SELECT terraform_state_versions.* 
FROM terraform_state_versions 
INNER JOIN terraform_state_version_states ON terraform_state_version_states.terraform_state_version_id = terraform_state_versions.id
WHERE terraform_state_version_states.verification_state = 1 
AND terraform_state_version_states.verification_started_at < NOW() - INTERVAL '8 hours'
ORDER BY terraform_state_versions.id ASC 
LIMIT 1000;

Query Plan

Query 4: needs_verification_count (additional query)

Before - terraform_state_versions table
-- Current table
explain SELECT COUNT(*) FROM (
  SELECT 1 AS one 
  FROM terraform_state_versions 
  WHERE (
    terraform_state_versions.verification_state IN (0)
    OR (
      terraform_state_versions.verification_state IN (3)
      AND (
        terraform_state_versions.verification_retry_at IS NULL 
        OR terraform_state_versions.verification_retry_at < NOW()
      )
    )
  ) 
  LIMIT 40
) subquery_for_count;

Query Plan

After - Using separate terraform_state_version_states table
-- New state table
explain SELECT COUNT(*) FROM (
  SELECT 1 AS one 
  FROM terraform_state_versions 
  INNER JOIN terraform_state_version_states ON terraform_state_version_states.terraform_state_version_id = terraform_state_versions.id
  WHERE (
    terraform_state_version_states.verification_state = 0 
    OR terraform_state_version_states.verification_state = 3 
    AND (
      terraform_state_version_states.verification_retry_at IS NULL 
      OR terraform_state_version_states.verification_retry_at < NOW()
    )
  )
  LIMIT 40
) subquery_for_count;

Query Plan

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

  1. Check the existence of the new tables by going into the DB console with bundle exec rails dbconsole
  2. Check the definition of each table using:
    • \d terraform_state_version_states

Related to #520704 (closed)

Edited by Scott Murray

Merge request reports

Loading