Loose Foreign Keys is falling behind while processing CI deleted records
Right now the partitions on the LFK table on ci looks like this
Partitioned table "public.loose_foreign_keys_deleted_records"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
----------------------------+--------------------------+-----------+----------+----------------------------------------------------------------+----------+-------------+--------------+-------------
id | bigint | | not null | nextval('loose_foreign_keys_deleted_records_id_seq'::regclass) | plain | | |
partition | bigint | | not null | 1321 | plain | | |
primary_key_value | bigint | | not null | | plain | | |
status | smallint | | not null | 1 | plain | | |
created_at | timestamp with time zone | | not null | now() | plain | | |
fully_qualified_table_name | text | | not null | | extended | | |
consume_after | timestamp with time zone | | | now() | plain | | |
cleanup_attempts | smallint | | | 0 | plain | | |
Partition key: LIST (partition)
Indexes:
"loose_foreign_keys_deleted_records_pkey" PRIMARY KEY, btree (partition, id)
"index_loose_foreign_keys_deleted_records_for_partitioned_query" btree (partition, fully_qualified_table_name, consume_after, id) WHERE status = 1
Check constraints:
"check_1a541f3235" CHECK (char_length(fully_qualified_table_name) <= 150)
Partitions: gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1296 FOR VALUES IN ('1296'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1297 FOR VALUES IN ('1297'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1298 FOR VALUES IN ('1298'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1299 FOR VALUES IN ('1299'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1300 FOR VALUES IN ('1300'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1301 FOR VALUES IN ('1301'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1302 FOR VALUES IN ('1302'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1303 FOR VALUES IN ('1303'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1304 FOR VALUES IN ('1304'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1305 FOR VALUES IN ('1305'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1306 FOR VALUES IN ('1306'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1307 FOR VALUES IN ('1307'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1308 FOR VALUES IN ('1308'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1309 FOR VALUES IN ('1309'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1310 FOR VALUES IN ('1310'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1311 FOR VALUES IN ('1311'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1312 FOR VALUES IN ('1312'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1313 FOR VALUES IN ('1313'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1314 FOR VALUES IN ('1314'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1315 FOR VALUES IN ('1315'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1316 FOR VALUES IN ('1316'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1317 FOR VALUES IN ('1317'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1318 FOR VALUES IN ('1318'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1319 FOR VALUES IN ('1319'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1320 FOR VALUES IN ('1320'),
gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_1321 FOR VALUES IN ('1321')
The number of records are close to
gitlabhq_dblab=# select count(*), fully_qualified_table_name from loose_foreign_keys_deleted_records where status = 1 group by fully_qualified_table_name order by count desc;
count | fully_qualified_table_name
-----------+------------------------------
116812985 | public.p_ci_builds
13205 | public.p_ci_pipelines
289 | public.ci_runner_machines
47 | public.ci_runners
44 | public.ci_pipeline_artifacts
4 | public.ci_triggers
It's worth mentioning that we have enabled the turbo mode for CI LFK. https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/feature_flags/ops/loose_foreign_keys_turbo_mode_ci.yml. So the Worker is allowed to spend more time to process the CI records.
Proposed solution
Similar to !200333 (merged), we can create another dedicated worker only to process p_ci_pipelines and p_ci_builds. And make sure that this worker runs on the CI::ApplicationRecord database.
But we also need to add specs to make sure each parent table is processed by 1 worker only. Because workers process delete the records and mark them as processed. Context here
Edited by 🤖 GitLab Bot 🤖