Verified commit bc4b7118, authored by Shubham Kumar and committed by GitLab

Add and backfill project_id for ci_job_artifact_states

## What does this MR do and why?

Add and backfill project_id for ci_job_artifact_states.

This table has a
[desired sharding key](https://docs.gitlab.com/ee/development/database/multiple_databases.html#define-a-desired_sharding_key-to-automatically-backfill-a-sharding_key)
configured ([view configuration](https://gitlab.com/gitlab-org/gitlab/-/blob/master/db/docs/ci_job_artifact_states.yml)).

This merge request is the first step towards transforming the desired sharding key into a
[sharding key](https://docs.gitlab.com/ee/development/database/multiple_databases.html#defining-a-sharding-key-for-all-cell-local-tables).

This involves the following changes:

- Adding a new column that will serve as the sharding key (along with the relevant asynchronous index).
- Populating the sharding key when new records are created by adding a database function and trigger.
- Scheduling a [batched background migration](https://docs.gitlab.com/ee/development/database/batched_background_migrations.html)
  to set the sharding key for existing records.

Once the background migration has completed, a second merge request will be created to finalize the background
migration, and add a foreign key and not null constraint.

## How to verify

We have assigned a random backend engineer from ~"group::geo" to review these changes. Please review this merge
request from a ~backend perspective. The main things to verify are that the added column and association
match the values specified by the [desired sharding key](https://gitlab.com/gitlab-org/gitlab/-/blob/master/db/docs/ci_job_artifact_states.yml)
configuration, and that backfilling `project_id` from `p_ci_job_artifacts` makes sense in the context of this feature.
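
As a rough aid for that check, the following is a minimal sketch that compares the job arguments queued by this merge request against a dictionary entry. The YAML excerpt is hypothetical: its nesting is an assumption based on the documented `desired_sharding_key` format, and only the table, sharding key, `belongs_to`, and foreign key values are taken from this merge request. The helper names `expected_job_arguments` and `arguments_match?` are illustrative and are not part of GitLab.

```ruby
require 'yaml'

# Hypothetical excerpt of db/docs/ci_job_artifact_states.yml. The nesting is an
# assumption; check the real file before relying on it.
DICTIONARY_EXCERPT = <<~DICT
  table_name: ci_job_artifact_states
  desired_sharding_key:
    project_id:
      backfill_via:
        parent:
          foreign_key: job_artifact_id
          table: p_ci_job_artifacts
          sharding_key: project_id
          belongs_to: job_artifact
DICT

# Job arguments as passed to queue_batched_background_migration in this MR.
JOB_ARGUMENTS = %i[project_id p_ci_job_artifacts project_id job_artifact_id].freeze

# Derives the expected job arguments from the dictionary entry, in the order
# [backfill column, parent table, parent sharding key, foreign key].
def expected_job_arguments(dictionary)
  column, config = dictionary.fetch('desired_sharding_key').first
  parent = config.fetch('backfill_via').fetch('parent')

  [column, parent.fetch('table'), parent.fetch('sharding_key'), parent.fetch('foreign_key')].map(&:to_sym)
end

# Returns true when the queued arguments agree with the dictionary entry.
def arguments_match?(dictionary_yaml, job_arguments)
  expected_job_arguments(YAML.safe_load(dictionary_yaml)) == job_arguments
end
```

Calling `arguments_match?(DICTIONARY_EXCERPT, JOB_ARGUMENTS)` returns `true` for the values in this merge request. A mismatch would suggest that the migration and the dictionary entry have drifted apart.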

When you are finished, please:

1. Trigger the [database testing pipeline](https://docs.gitlab.com/ee/development/database/database_migration_pipeline.html)
   as instructed by Danger.
1. Request a review from the ~backend maintainer and ~database reviewer suggested by Danger.

If you have any questions or concerns, reach out to @tigerwnz or @shubhamkrai.

This merge request was generated by a one-off keep implemented in !143774.

This change was generated by
[gitlab-housekeeper](https://gitlab.com/gitlab-org/gitlab/-/tree/master/gems/gitlab-housekeeper)
using the Keeps::BackfillDesiredShardingKeyLargeTable keep.

To provide feedback on your experience with `gitlab-housekeeper` please create an issue with the
label ~"GitLab Housekeeper" and consider pinging the author of this keep.

Changelog: other
parent 27995886
2 merge requests: !170053 "Security patch upgrade alert: Only expose to admins 17-4", !165940 "Add and backfill project_id for ci_job_artifact_states"
Changes: 184 additions and 1 deletion
---
migration_job_name: BackfillCiJobArtifactStatesProjectId
description: Backfills sharding key `ci_job_artifact_states.project_id` from `p_ci_job_artifacts`.
feature_category: geo_replication
introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/165940
milestone: '17.5'
queued_migration_version: 20240912122440
finalize_after: '2024-10-22'
finalized_by: # version of the migration that finalized this BBM
......@@ -17,3 +17,4 @@ desired_sharding_key:
table: p_ci_job_artifacts
sharding_key: project_id
belongs_to: job_artifact
desired_sharding_key_migration_job_name: BackfillCiJobArtifactStatesProjectId
# frozen_string_literal: true

class AddProjectIdToCiJobArtifactStates < Gitlab::Database::Migration[2.2]
  milestone '17.5'

  def change
    add_column :ci_job_artifact_states, :project_id, :bigint
  end
end
# frozen_string_literal: true

class PrepareIndexCiJobArtifactStatesOnProjectId < Gitlab::Database::Migration[2.2]
  milestone '17.5'
  disable_ddl_transaction!

  INDEX_NAME = 'index_ci_job_artifact_states_on_project_id'

  def up
    prepare_async_index :ci_job_artifact_states, :project_id, name: INDEX_NAME
  end

  def down
    unprepare_async_index :ci_job_artifact_states, INDEX_NAME
  end
end
# frozen_string_literal: true

class AddCiJobArtifactStatesProjectIdTrigger < Gitlab::Database::Migration[2.2]
  milestone '17.5'

  def up
    install_sharding_key_assignment_trigger(
      table: :ci_job_artifact_states,
      sharding_key: :project_id,
      parent_table: :p_ci_job_artifacts,
      parent_sharding_key: :project_id,
      foreign_key: :job_artifact_id
    )
  end

  def down
    remove_sharding_key_assignment_trigger(
      table: :ci_job_artifact_states,
      sharding_key: :project_id,
      parent_table: :p_ci_job_artifacts,
      parent_sharding_key: :project_id,
      foreign_key: :job_artifact_id
    )
  end
end
# frozen_string_literal: true

class QueueBackfillCiJobArtifactStatesProjectId < Gitlab::Database::Migration[2.2]
  milestone '17.5'
  restrict_gitlab_migration gitlab_schema: :gitlab_ci

  MIGRATION = "BackfillCiJobArtifactStatesProjectId"
  DELAY_INTERVAL = 2.minutes
  BATCH_SIZE = 10000
  SUB_BATCH_SIZE = 1000

  def up
    queue_batched_background_migration(
      MIGRATION,
      :ci_job_artifact_states,
      :job_artifact_id,
      :project_id,
      :p_ci_job_artifacts,
      :project_id,
      :job_artifact_id,
      job_interval: DELAY_INTERVAL,
      batch_size: BATCH_SIZE,
      sub_batch_size: SUB_BATCH_SIZE
    )
  end

  def down
    delete_batched_background_migration(
      MIGRATION,
      :ci_job_artifact_states,
      :job_artifact_id,
      [
        :project_id,
        :p_ci_job_artifacts,
        :project_id,
        :job_artifact_id
      ]
    )
  end
end
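
To illustrate how `BATCH_SIZE` and `SUB_BATCH_SIZE` relate, here is a simplified, self-contained sketch that splits a contiguous range of `job_artifact_id` values into batches and sub-batches. It is not the batched background migration framework: the real framework batches over existing rows and may tune batch sizes at runtime, and the helpers `each_range` and `plan` are hypothetical names used only for this illustration.

```ruby
# Sizes copied from the queueing migration above.
BATCH_SIZE = 10_000
SUB_BATCH_SIZE = 1_000

# Yields consecutive inclusive ranges of at most `size` ids between min_id and max_id.
# Returns an Enumerator when called without a block.
def each_range(min_id, max_id, size)
  return enum_for(:each_range, min_id, max_id, size) unless block_given?

  min_id.step(max_id, size) do |start|
    yield start..[start + size - 1, max_id].min
  end
end

# Builds a plan of batches, each split into sub-batches. In this simplified model
# one batch corresponds to one job and each sub-batch to one UPDATE statement.
def plan(min_id, max_id, batch_size: BATCH_SIZE, sub_batch_size: SUB_BATCH_SIZE)
  each_range(min_id, max_id, batch_size).map do |batch|
    { batch: batch, sub_batches: each_range(batch.first, batch.last, sub_batch_size).to_a }
  end
end
```

Under these assumptions, `plan(1, 25_000)` produces three batches, the last of which covers ids 20001 to 25000 and is split into five sub-batches. Keeping sub-batches small limits how many rows a single statement touches.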
63488d8393568e7e93915f351710ba917bee9dae7b73c1bfae0a5a85ce42ec3d
\ No newline at end of file
92484a5cbcf38847998b38a95ce28102252b4ba25fd56cb6b092b00b0e4c48bf
\ No newline at end of file
26df32dffad29ac5ce904b46ea706d36c8f1c15dd76d85a0fa9a19228f6d8c94
\ No newline at end of file
4f8d4d2a2dc5e118366c044b82d477602b9a842c3eaae68072b7e4b3c13be8fd
\ No newline at end of file
......@@ -1833,6 +1833,22 @@ RETURN NEW;
END
$$;
 
CREATE FUNCTION trigger_a465de38164e() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
BEGIN
IF NEW."project_id" IS NULL THEN
  SELECT "project_id"
  INTO NEW."project_id"
  FROM "p_ci_job_artifacts"
  WHERE "p_ci_job_artifacts"."id" = NEW."job_artifact_id";
END IF;

RETURN NEW;

END
$$;
CREATE FUNCTION trigger_a4e4fb2451d9() RETURNS trigger
LANGUAGE plpgsql
AS $$
......@@ -8100,6 +8116,7 @@ CREATE TABLE ci_job_artifact_states (
verification_checksum bytea,
verification_failure text,
partition_id bigint NOT NULL,
project_id bigint,
CONSTRAINT check_df832b66ea CHECK ((char_length(verification_failure) <= 255))
);
 
......@@ -33040,6 +33057,8 @@ CREATE TRIGGER trigger_a1bc7c70cbdf BEFORE INSERT OR UPDATE ON vulnerability_use
 
CREATE TRIGGER trigger_a253cb3cacdf BEFORE INSERT OR UPDATE ON dora_daily_metrics FOR EACH ROW EXECUTE FUNCTION trigger_a253cb3cacdf();
 
CREATE TRIGGER trigger_a465de38164e BEFORE INSERT OR UPDATE ON ci_job_artifact_states FOR EACH ROW EXECUTE FUNCTION trigger_a465de38164e();
CREATE TRIGGER trigger_a4e4fb2451d9 BEFORE INSERT OR UPDATE ON epic_user_mentions FOR EACH ROW EXECUTE FUNCTION trigger_a4e4fb2451d9();
 
CREATE TRIGGER trigger_a7e0fb195210 BEFORE INSERT OR UPDATE ON vulnerability_finding_evidences FOR EACH ROW EXECUTE FUNCTION trigger_a7e0fb195210();
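
The behavior of `trigger_a465de38164e` can be modeled in plain Ruby. The sketch below is an in-memory analogy only: `PARENT_ROWS` is hypothetical sample data standing in for `p_ci_job_artifacts`, and `assign_project_id` is an illustrative name, not a GitLab helper.

```ruby
# In-memory stand-in for p_ci_job_artifacts, keyed by id (hypothetical data).
PARENT_ROWS = {
  1 => { project_id: 42 },
  2 => { project_id: 7 }
}.freeze

# Mirrors the trigger body: fill project_id from the parent row only when the
# incoming row does not already carry one. A missing parent leaves it nil,
# matching SELECT ... INTO assigning NULL when no row is found.
def assign_project_id(new_row, parents = PARENT_ROWS)
  return new_row unless new_row[:project_id].nil?

  parent = parents[new_row[:job_artifact_id]]
  new_row.merge(project_id: parent && parent[:project_id])
end
```

For example, `assign_project_id({ job_artifact_id: 1, project_id: nil })` returns a row whose `project_id` is 42, while a row that already sets `project_id` is returned unchanged. Because the trigger fires on both INSERT and UPDATE, rows written while the backfill is running should also receive the sharding key.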
# frozen_string_literal: true

module Gitlab
  module BackgroundMigration
    class BackfillCiJobArtifactStatesProjectId < BackfillDesiredShardingKeyJob
      operation_name :backfill_ci_job_artifact_states_project_id
      feature_category :geo_replication
    end
  end
end
......@@ -95,7 +95,7 @@
ci_sources_projects: %w[partition_id],
ci_stages: %w[partition_id project_id pipeline_id],
ci_trigger_requests: %w[commit_id],
-ci_job_artifact_states: %w[partition_id],
+ci_job_artifact_states: %w[partition_id project_id],
cluster_providers_aws: %w[security_group_id vpc_id access_key_id],
cluster_providers_gcp: %w[gcp_project_id operation_id],
compliance_management_frameworks: %w[group_id],
......
# frozen_string_literal: true

require 'spec_helper'

RSpec.describe Gitlab::BackgroundMigration::BackfillCiJobArtifactStatesProjectId,
  feature_category: :geo_replication,
  schema: 20240912122437,
  migration: :gitlab_ci do
  include_examples 'desired sharding key backfill job' do
    let(:batch_table) { :ci_job_artifact_states }
    let(:backfill_column) { :project_id }
    let(:batch_column) { :job_artifact_id }
    let(:backfill_via_table) { :p_ci_job_artifacts }
    let(:backfill_via_column) { :project_id }
    let(:backfill_via_foreign_key) { :job_artifact_id }
  end
end
# frozen_string_literal: true

require 'spec_helper'
require_migration!

RSpec.describe QueueBackfillCiJobArtifactStatesProjectId, migration: :gitlab_ci, feature_category: :geo_replication do
  let!(:batched_migration) { described_class::MIGRATION }

  it 'schedules a new batched migration' do
    reversible_migration do |migration|
      migration.before -> {
        expect(batched_migration).not_to have_scheduled_batched_migration
      }

      migration.after -> {
        expect(batched_migration).to have_scheduled_batched_migration(
          table_name: :ci_job_artifact_states,
          column_name: :job_artifact_id,
          interval: described_class::DELAY_INTERVAL,
          batch_size: described_class::BATCH_SIZE,
          sub_batch_size: described_class::SUB_BATCH_SIZE,
          gitlab_schema: :gitlab_ci,
          job_arguments: [
            :project_id,
            :p_ci_job_artifacts,
            :project_id,
            :job_artifact_id
          ]
        )
      }
    end
  end
end