Fix backfill_p_ci_pipelines_trigger_id

What does this MR do and why?

This MR fixes the backfill for p_ci_pipelines.trigger_id:

Facts:

  • all the builds that have trigger_request_id values, their ci_trigger_requests have commit_id values. (see here)

    Click to expand
    select id 
    from ci_trigger_requests 
    join p_ci_builds on ci_trigger_requests.id = p_ci_builds.trigger_id
    where ci_trigger_requests.commit_id is null;
  • all the ci_trigger_requests rows that have commit_id values may or may not have associated builds. (see here)

    Click to expand
    select count(id)
    from ci_trigger_requests 
    where commit_id is not null
    and not exists (
      select id
      from p_ci_builds
      where trigger_request_id = ci_trigger_requests.id
    )
  • there are around 3505 ci_trigger_requests rows have no commit_id values created up between 2016-09-01 08:31:58.683822 and 2017-10-06 12:30:40.787156. (I guess they are related to some old feature)

    Click to expand
    gitlabhq_dblab=# select count(id) from ci_trigger_requests where commit_id is null;
     count
    -------
      3505
    (1 row)
    
    
    gitlabhq_dblab=# select min(created_at) from ci_trigger_requests where commit_id is null;
                min
    ----------------------------
     2016-09-01 08:31:58.683822
    (1 row)
    
    gitlabhq_dblab=# select max(created_at) from ci_trigger_requests where commit_id is null;
                max
    ----------------------------
     2017-10-06 12:30:40.787156
    (1 row)
  • there are 4 ci_trigger_requests rows that have same commit_id and trigger_id but different variables. (this should be invalid data and can be ignored)

    Click to expand
    gitlabhq_dblab=# SELECT commit_id, trigger_id
    gitlabhq_dblab-# FROM ci_trigger_requests
    gitlabhq_dblab-# WHERE commit_id IS NOT NULL
    gitlabhq_dblab-# GROUP BY commit_id, trigger_id
    gitlabhq_dblab-# HAVING count(commit_id) > 1;
     commit_id | trigger_id
    -----------+------------
       2936798 |       3825
    (1 row)
    
    gitlabhq_dblab=# SELECT id, commit_id, trigger_id, project_id, variables FROM ci_trigger_requests where commit_id = 2936798;
      id   | commit_id | trigger_id | project_id |      variables
    -------+-----------+------------+------------+---------------------
     15331 |   2936798 |       3825 |     154793 | ---                +
           |           |            |            | RUN_PUBLISH: 'true'+
           |           |            |            | DRY_RUN: 'true'    +
           |           |            |            |
     15333 |   2936798 |       3825 |     154793 | ---                +
           |           |            |            | RUN_PUBLISH: 'true'+
           |           |            |            | DRY_RUN: 'false'   +
           |           |            |            |
     15334 |   2936798 |       3825 |     154793 | ---                +
           |           |            |            | RUN_PUBLISH: 'true'+
           |           |            |            |
     15335 |   2936798 |       3825 |     154793 | ---                +
           |           |            |            | RUN_PUBLISH: 'true'+
           |           |            |            |
    (4 rows)

NOTE: commit_id above is a Rails (non-database) FK on ci_trigger_requests referring p_ci_pipelines, see https://gitlab.com/gitlab-org/gitlab/-/blob/502767-requeue-backfill-trigger-id-to-p-ci-pipelines/app/models/ci/trigger_request.rb#L6

Solution:

  • Fix previous incorrect backfill by joining p_ci_builds and ci_trigger_requests using commit_id, but set the value using a subquery to pull the correct ci_trigger_requests.trigger_id
  • Join p_ci_pipelines and ci_trigger_requests using commit_id and update p_ci_pipelines.trigger_id

Query plans

References

Please include cross links to any resources that are relevant to this MR. This will give reviewers and future readers helpful context to give an efficient review of the changes introduced.

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

Before After

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

Related to #502767 (closed)

Edited by Tianwen Chen

Merge request reports

Loading