Skip to content

Deduplicating MR metrics table

Adam Hegyi requested to merge dedup-merge-request-metrics into master

What does this MR do?

Related issue: #214456 (closed)

This MR de-duplicates records in merge_request_metrics table. Merge Request has one Merge Request Metrics record.

We have about 511 records on PRD and it's very unlikely that we'll have more on self-hosted.

There is a small chance to create duplicated merge_request_metrics records when an MR is updated. Relevant code: https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/concerns/issuable.rb#L469

ensure_metrics is triggered via the after_save callback (even after update).

Possible cause:

When we import merge requests, we skip the callbacks to speed up the import process. This means the metrics data is simply not created. Some time later two BG job runs on the MR at the same time and creates the duplicated record. Probably our MR Refresh* jobs.

Note: the same issue occurs in the issue_metrics table.

Checking for duplicates

select count(*) from (select merge_request_id, count(merge_request_id) from merge_request_metrics group by merge_request_id having count(merge_request_id) > 1) merge_request_metrics;

Post Migration steps:

  1. Add a new, temporary unique index on merge_request_id where we have a condition on the last MR id (id > last_mr_metrics_id). This ensures that new records will be unique.
  2. Iterate over the merge_request_metrics table and merge the duplicated records.
  3. Add unique index on merge_request_id.
  4. Remove the temporary index.

Query: (1k batch)

SELECT "merge_request_metrics"."merge_request_id"
FROM "merge_request_metrics"
WHERE "merge_request_metrics"."merge_request_id" IN
    (SELECT "merge_request_metrics"."merge_request_id"
     FROM "merge_request_metrics"
     WHERE "merge_request_metrics"."id" >= 15925818
       AND "merge_request_metrics"."id" < 15926834)
GROUP BY "merge_request_metrics"."merge_request_id"
HAVING (COUNT(merge_request_metrics.merge_request_id) > 1)

Plan

Expected runtime: 2-5 minutes

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Adam Hegyi

Merge request reports