Skip to content

Add cleaup migration for broken epic-issue links

What does this MR do?

Related to #276492

In this security MR we added the ability to recreate epics in a new group when a project is transferred if any issues were linked to a previous group's epic.

Any transfers previous to that MR resulted in inconsistent links that associate issues with epics from a different group. This MR schedules a background migration that will check if the issue's epic belongs to the same group as the issue's project (or an ancestor of this group), and if it doesn't, the EpicIssue link will be deleted.

Migration

Running migrations

20210223175130_delete_inconsistent_epic_issue_links

Migration up
bin/rails db:migrate:up VERSION=20210223175130
== 20210223175130 DeleteInconsistentEpicIssueLinks: migrating =================
== 20210223175130 DeleteInconsistentEpicIssueLinks: migrated (0.1033s) ========
Migration down
bin/rails db:migrate:down VERSION=20210223175130
== 20210223175130 DeleteInconsistentEpicIssueLinks: reverting =================
== 20210223175130 DeleteInconsistentEpicIssueLinks: reverted (0.0000s) ========

Queries

Counts

  • Number of groups to query: ~ 5134 (database-lab result)
SQL
explain SELECT DISTINCT "epics"."group_id" FROM "epics" INNER JOIN "epic_issues" ON "epic_issues"."epic_id" = "epics"."id"
EXPLAIN plan
HashAggregate  (cost=22089.33..22165.96 rows=7663 width=4) (actual time=3958.038..3959.736 rows=5134 loops=1)
   Group Key: epics.group_id
   Buffers: shared hit=164593 read=3121 dirtied=654
   I/O Timings: read=3002.748
   ->  Hash Join  (cost=12006.80..21210.24 rows=351635 width=4) (actual time=3086.802..3839.385 rows=341337 loops=1)
         Hash Cond: (epic_issues.epic_id = epic

database-lab link

  • Number of epic-issues to delete: TBD

Finding invalid links (batched by 50)

  • Querying group ids
SQL
explain SELECT DISTINCT "epics"."group_id" FROM "epics" INNER JOIN "epic_issues" ON "epic_issues"."epic_id" = "epics"."id" LIMIT 50
EXPLAIN plan
  Limit  (cost=0.84..425.29 rows=50 width=4) (actual time=7.476..5890.689 rows=50 loops=1)
   Buffers: shared hit=33258 read=5807 dirtied=429
   I/O Timings: read=5570.940
   ->  Unique  (cost=0.84..66096.31 rows=7786 width=4) (actual time=7.474..5890.650 rows=50 loops=1)
         Buffers: shared hit=33258 read=5807 dirtied=429
         I/O Timings: read=5570.940
         ->  Nested Loop  (cost=0.84..65223.53 rows=349113 width=4) (actual time=7.472..5881.220 rows=43369 loops=1)
               Buffers: shared hit=33258 read=5807 dirtied=429
               I/O Timings: read=5570.940
               ->  Index Scan using index_epics_on_group_id_and_iid on public.epics  (cost=0.42..11165.05 rows=80470 width=8) (actual time=4.100..4138.566 rows=7877 loops=1)
                     Buffers: shared hit=3460 read=4343 dirtied=237
                     I/O Timings: read=4026.017
               ->  Index Only Scan using index_epic_issues_on_epic_id_and_issue_id on public.epic_issues  (cost=0.42..0.59 rows=8 width=4) (actual time=0.128..0.217 rows=6 loops=7877)
                     Index Cond: (epic_issues.epic_id = epics.id)
                     Heap Fetches: 8288
                     Buffers: shared hit=29798 read=1464 dirtied=192
                     I/O Timings: read=1544.923
Summary
Time: 5.893 s
  - planning: 2.619 ms
  - execution: 5.891 s (estimated* for prod: 0.197...5.893 s)
    - I/O read: 5.571 s
    - I/O write: N/A

database-lab link

postgres.ai link

  • Querying group ids Epic_issues
SQL
SELECT epic_issues.* FROM epic_issues
      INNER JOIN epics ON epics.id = epic_issues.epic_id
      INNER JOIN issues ON issues.id = epic_issues.issue_id
      INNER JOIN projects ON projects.id = issues.project_id
      WHERE epics.group_id IN (9950, 9951, 9952, 9953, 9954, 9955, 9956, 9957, 9958, 9959, 9960, 9961, 9962, 9963, 9964, 9965, 9966, 9967, 9968, 9969, 9970, 9971, 9972, 9973, 9974, 9975, 9976, 9977, 9978, 9979, 9980, 9981, 9982, 9983, 9984, 9985, 9986, 9987, 9988, 9989, 9990, 9991, 9992, 9993, 9994, 9995, 9996, 9997, 9998, 9999, 10000)
        AND projects.namespace_id NOT IN (WITH RECURSIVE base_and_descendants AS (
          (SELECT namespaces.* FROM namespaces
            WHERE namespaces.type = 'Group'
            AND namespaces.id IN (9950, 9951, 9952, 9953, 9954, 9955, 9956, 9957, 9958, 9959, 9960, 9961, 9962, 9963, 9964, 9965, 9966, 9967, 9968, 9969, 9970, 9971, 9972, 9973, 9974, 9975, 9976, 9977, 9978, 9979, 9980, 9981, 9982, 9983, 9984, 9985, 9986, 9987, 9988, 9989, 9990, 9991, 9992, 9993, 9994, 9995, 9996, 9997, 9998, 9999, 10000))
          UNION
          (SELECT namespaces.* FROM namespaces, base_and_descendants
            WHERE namespaces.type = 'Group'
            AND namespaces.parent_id = base_and_descendants.id))
          SELECT namespaces.id
          FROM base_and_descendants AS namespaces);
EXPLAIN plan
Hash Join  (cost=779162.55..975583.37 rows=12464 width=16)
  Hash Cond: (issues.project_id = projects.id)
  ->  Gather  (cost=7041.66..62625.04 rows=24929 width=20)
        Workers Planned: 1
        ->  Nested Loop  (cost=6041.66..59132.14 rows=14664 width=20)
              ->  Hash Join  (cost=6041.10..16401.81 rows=14664 width=16)
                    Hash Cond: (epic_issues.epic_id = epics.id)
                    ->  Parallel Seq Scan on epic_issues  (cost=0.00..9821.61 rows=205361 width=16)
                    ->  Hash  (cost=5969.27..5969.27 rows=5746 width=4)
                          ->  Index Scan using index_epics_on_group_id_and_iid on epics  (cost=0.42..5969.27 rows=5746 width=4)
                                Index Cond: (group_id = ANY ('{9950,9951,9952,9953,9954,9955,9956,9957,9958,9959,9960,9961,9962,9963,9964,9965,9966,9967,9968,9969,9970,9971,9972,9973,9974,9975,9976,9977,9978,9979,9980,9981,9982,9983,9984,9985,9986,9987,9988,9989,9990,9991,9992,9993,9994,9995,9996,9997,9998,9999,10000}'::integer[]))
              ->  Index Scan using issues_pkey on issues  (cost=0.56..2.91 rows=1 width=8)
                    Index Cond: (id = epic_issues.issue_id)
  ->  Hash  (cost=520848.34..520848.34 rows=8934124 width=4)
        ->  Index Only Scan using index_projects_on_namespace_id_and_id on projects  (cost=14374.42..520848.34 rows=8934124 width=4)
              Filter: (NOT (hashed SubPlan 2))
              SubPlan 2
                ->  CTE Scan on base_and_descendants namespaces_2  (cost=14335.74..14369.74 rows=1700 width=4)
                      CTE base_and_descendants
                        ->  Recursive Union  (cost=0.43..14335.74 rows=1700 width=344)
                              ->  Index Scan using index_namespaces_on_type_and_id_partial on namespaces  (cost=0.43..106.32 rows=10 width=344)
                                    Index Cond: (((type)::text = 'Group'::text) AND (id = ANY ('{9950,9951,9952,9953,9954,9955,9956,9957,9958,9959,9960,9961,9962,9963,9964,9965,9966,9967,9968,9969,9970,9971,9972,9973,9974,9975,9976,9977,9978,9979,9980,9981,9982,9983,9984,9985,9986,9987,9988,9989,9990,9991,9992,9993,9994,9995,9996,9997,9998,9999,10000}'::integer[])))
                              ->  Nested Loop  (cost=0.56..1419.54 rows=169 width=344)
                                    ->  WorkTable Scan on base_and_descendants  (cost=0.00..2.00 rows=100 width=4)
                                    ->  Index Scan using index_namespaces_on_parent_id_and_id on namespaces namespaces_1  (cost=0.56..14.16 rows=2 width=344)
                                          Index Cond: (parent_id = base_and_descendants.id)
                                          Filter: ((type)::text = 'Group'::text)
Summary
Time: 2.373 min
  - planning: 10.689 ms
  - execution: 2.373 min (estimated* for prod: 0.000 s)
    - I/O read: 2.110 min
    - I/O write: N/A

database-lab link

postgres.ai link

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Eugenia Grieff

Merge request reports