Skip to content

Add migration to cleanup web_hook_logs backfill

What does this MR do?

Related issue: #323675 (closed)

Migration to call the partitioning migration helper that cleans up after the background migration used to backfill the web_hook_logs table.

Summary from comments in #323675 (closed) and &5558 (comment 538648286):

On GitLab.com, 10691 background jobs were completed successfully and 338 background jobs are not marked as completed:

SELECT min(created_at), max(created_at), min(updated_at), max(updated_at) 
FROM background_migration_jobs where arguments ? 'web_hook_logs';

              min              |              max              |             min              |              max
-------------------------------+-------------------------------+------------------------------+-------------------------------
 2021-03-10 19:17:40.001109+00 | 2021-03-10 19:21:40.654249+00 | 2021-03-10 19:17:41.94364+00 | 2021-03-26 03:01:39.943296+00

SELECT status, count(*) FROM background_migration_jobs WHERE arguments ? 'web_hook_logs' group by 1;

 status | count
--------+-------
      0 |   338
      1 | 10691

The last background migration job run 9 hours ago, so we can assume the backfilling migration as completed with 338 jobs failed, not scheduled (lost?) or not marked as completed even though they run.

I can see no errors during the past 7 days in kibana other than a pgbouncer cannot connect to server for a specific job, so there seem to be no errors related to database timeouts.

Checking the db_duration_s of the Background Migration jobs for web_hook_logs in kibana, there are a few jobs that go up to 1 minute of execution time (for 25 batches), but the average time is 7.4 seconds per job and 4.12 seconds of total db execution time.

I think that we can assume that 8 seconds per job is a safe maximum, which means that 338 jobs will require less than an hour to run (= 45 minutes = 2704 seconds), so I am moving forward with submitting this MR that runs finalize_backfilling_partitioned_table in a post deployment migration.

Migration Output

up
$ bundle exec rake db:migrate
== 20210326121537 BackfillCleanupForPartitionedWebHookLogs: migrating =========
-- transaction_open?()
   -> 0.0000s
-- table_exists?("web_hook_logs_part_0c5294f417")
   -> 0.0007s
-- execute("SET statement_timeout TO 0")
   -> 0.0005s
-- execute("VACUUM FREEZE ANALYZE web_hook_logs_part_0c5294f417")
   -> 0.0180s
-- execute("RESET ALL")
   -> 0.0006s
== 20210326121537 BackfillCleanupForPartitionedWebHookLogs: migrated (0.0378s)
down
$ bundle exec rake db:rollback
== 20210326121537 BackfillCleanupForPartitionedWebHookLogs: reverting =========
== 20210326121537 BackfillCleanupForPartitionedWebHookLogs: reverted (0.0000s)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Yannis Roussos

Merge request reports