Skip to content

Add foreign keys to lfs_objects_projects table [RUN ALL RSPEC] [RUN AS-IF-FOSS]

Igor Drozdov requested to merge id-add-foreign-keys-to-lfs-objects-projects into master

What does this MR do?

We need it in order to maintain consistency when an lfs_object is deleted: !56959 (merged)

This merge request adds:

  • a restrict foreign key for lfs object, that means that an lfs object can't be deleted if it's linked to a project
  • since we're here, let's also add a foreign key for a project, so when a project is removed, it cleans up all the linked lfs-objects-projects records

The foreign keys are introduced with validate false in order to ignore existing orphaned records but at the same time to provide consistency while we're removing the orphaned records.

This merge request also adds a post-deploy background migration to clean up orphaned records. The migration cleans up lfs_objects_projects orphaned records (records without a project or an lfs object)

The number of lfs_objects_projects on SaSS: 73747276
Batch size: 50000
Interval: 2 minutes
The job is going to execute: ~2 days

Queries:

SELECT "lfs_objects_projects".*
FROM "lfs_objects_projects"
LEFT OUTER JOIN "projects" ON "projects"."id" = "lfs_objects_projects"."project_id"
WHERE "projects"."id" IS NULL
  AND lfs_objects_projects.id > 60000000
  AND lfs_objects_projects.id < 60050000
Gather  (cost=1001.13..54661.24 rows=1 width=30) (actual time=126.656..134.797 rows=0 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Buffers: shared hit=213063 read=196
   I/O Timings: read=73.365
   ->  Nested Loop Anti Join  (cost=1.13..53661.14 rows=1 width=30) (actual time=123.223..123.224 rows=0 loops=2)
         Buffers: shared hit=213063 read=196
         I/O Timings: read=73.365
         ->  Parallel Index Scan using lfs_objects_projects_pkey on public.lfs_objects_projects  (cost=0.57..7111.69 rows=23973 width=30) (actual time=4.131..62.960 rows=21660 loops=2)
               Index Cond: ((lfs_objects_projects.id > 60000000) AND (lfs_objects_projects.id < 60050000))
               Buffers: shared hit=36914 read=133
               I/O Timings: read=52.322
         ->  Index Only Scan using projects_pkey on public.projects  (cost=0.56..1.93 rows=1 width=4) (actual time=0.002..0.002 rows=1 loops=43319)
               Index Cond: (projects.id = lfs_objects_projects.project_id)
               Heap Fetches: 1388
               Buffers: shared hit=176149 read=63
               I/O Timings: read=21.043
Time: 139.488 ms
  - planning: 4.627 ms
  - execution: 134.861 ms
    - I/O read: 73.365 ms
    - I/O write: N/A

Shared buffers:
  - hits: 213063 (~1.60 GiB) from the buffer pool
  - reads: 196 (~1.50 MiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

SELECT "lfs_objects_projects".*
FROM "lfs_objects_projects"
LEFT OUTER JOIN "lfs_objects" ON "lfs_objects"."id" = "lfs_objects_projects"."lfs_object_id"
WHERE "lfs_objects"."id" IS NULL
  AND lfs_objects_projects.id > 50000000
  AND lfs_objects_projects.id < 50050000
 Gather  (cost=1001.13..59769.95 rows=1 width=30) (actual time=236.544..239.132 rows=0 loops=1)
   Workers Planned: 1
   Workers Launched: 1
   Buffers: shared hit=186752 read=779
   I/O Timings: read=298.503 write=0.000
   ->  Nested Loop Anti Join  (cost=1.13..58769.85 rows=1 width=30) (actual time=230.495..230.496 rows=0 loops=2)
         Buffers: shared hit=186752 read=779
         I/O Timings: read=298.503 write=0.000
         ->  Parallel Index Scan using lfs_objects_projects_pkey on public.lfs_objects_projects  (cost=0.57..7720.99 rows=26728 width=30) (actual time=4.813..73.856 rows=23063 loops=2)
               Index Cond: ((lfs_objects_projects.id > 50000000) AND (lfs_objects_projects.id < 50050000))
               Buffers: shared hit=2204 read=600
               I/O Timings: read=122.908 write=0.000
         ->  Index Only Scan using lfs_objects_pkey on public.lfs_objects  (cost=0.56..1.90 rows=1 width=4) (actual time=0.006..0.006 rows=1 loops=46126)
               Index Cond: (lfs_objects.id = lfs_objects_projects.lfs_object_id)
               Heap Fetches: 0
               Buffers: shared hit=184548 read=179
               I/O Timings: read=175.596 write=0.000
Time: 241.389 ms
  - planning: 2.166 ms
  - execution: 239.223 ms
    - I/O read: 298.503 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 186752 (~1.40 GiB) from the buffer pool
  - reads: 779 (~6.10 MiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

Number of the affected rows:

lfs_objects_projects linked to non-existing lfs_object: 133

lfs_objects_projects linked to non-existing project: 42679

Related issue: #199261 (closed)

Edited by Igor Drozdov

Merge request reports