Unlink fork relationships when archiving group

What does this MR do and why?

Related: https://gitlab.com/gitlab-org/gitlab/-/issues/572482

The previous MR removes the fork relationship and closes open MRs when the upstream project is archived. However, it only covers individually archived projects: when projects belong to a group and the group itself is archived, the fork relationships remain intact and the MRs stay open.

This MR fixes the problem by removing fork relationships and closing MRs when a group is archived.

The Rack::Timeout of 60 seconds would terminate the archive API request if the group has many projects, each with numerous open MRs. Therefore, unlinking fork relationships and closing open MRs are processed asynchronously using a background worker.
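
A minimal sketch of that asynchronous shape, assuming hypothetical names throughout: the worker class and the per-project service below are stand-ins, and only the in_fork_network / with_fork_network_associations scopes and the 100-project batch size come from this MR.

  # Hypothetical sketch only; class and service names are illustrative.
  module Groups
    class UnlinkForkNetworksWorker
      include ApplicationWorker

      data_consistency :sticky
      feature_category :groups_and_projects
      idempotent!

      BATCH_SIZE = 100

      def perform(group_id, user_id)
        group = Group.find_by_id(group_id)
        user = User.find_by_id(user_id)
        return unless group && user

        # Walk the group's projects in batches of 100 so a single pass never
        # loads an unbounded number of records.
        group.all_projects.each_batch(of: BATCH_SIZE) do |batch|
          batch.in_fork_network.with_fork_network_associations.each do |project|
            # Stand-in for whatever the project-level archive path (previous MR)
            # uses to unlink fork relationships and close open MRs.
            ::Projects::UnlinkForksAndCloseMergeRequestsService.new(project, user).execute
          end
        end
      end
    end
  end

The archive endpoint would then only enqueue the worker (for example Groups::UnlinkForkNetworksWorker.perform_async(group.id, current_user.id)), keeping the API request itself well under the 60-second Rack::Timeout.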

Query Plans

The postgres.ai link

I tested against https://gitlab.com/gitlab-org. Since the worker processes at most 100 projects per batch, I collected 100 production project IDs to test the query plan at that scale:

(In rails c)
  # Ensure Net::HTTP is loaded in the console session (no-op if it already is)
  require 'net/http'

  # Fetch 100 production project IDs from the gitlab-org group via the public API
  response = Net::HTTP.get(URI("https://gitlab.com/api/v4/groups/9970/projects?per_page=100"))
  projects = JSON.parse(response)
  production_project_ids = projects.map { |p| p['id'] }

  # Optional: check how many IDs came back
  puts "Got #{production_project_ids.size} IDs"

  # Generate the SQL locally using a couple of local project namespace IDs
  local_project_ns_ids = [125, 183]
  sql = Project.where(project_namespace_id: local_project_ns_ids).in_fork_network.with_fork_network_associations.to_sql

  # Swap the local IDs for the production IDs in the IN clause
  sql = sql.gsub("IN (#{local_project_ns_ids.join(', ')})", "IN (#{production_project_ids.join(', ')})")

  puts 'EXPLAIN ' + sql

The resulting plan:

 Nested Loop Left Join  (cost=2.42..713.22 rows=7 width=1769) (actual time=31.489..300.785 rows=4 loops=1)
   Buffers: shared hit=382 read=344
   I/O Timings: read=296.436 write=0.000
   ->  Nested Loop Left Join  (cost=1.85..702.67 rows=7 width=925) (actual time=24.195..286.590 rows=4 loops=1)
         Buffers: shared hit=378 read=333
         I/O Timings: read=282.424 write=0.000
         ->  Nested Loop Left Join  (cost=1.42..699.55 rows=7 width=872) (actual time=21.388..277.849 rows=4 loops=1)
               Buffers: shared hit=372 read=323
               I/O Timings: read=273.808 write=0.000
               ->  Nested Loop  (cost=0.99..675.42 rows=7 width=864) (actual time=21.370..277.781 rows=4 loops=1)
                     Buffers: shared hit=356 read=323
                     I/O Timings: read=273.808 write=0.000
                     ->  Index Scan using index_projects_on_project_namespace_id on public.projects  (cost=0.56..330.67 rows=100 width=848) (actual time=5.546..224.025 rows=68 loops=1)
                           Index Cond: (projects.project_namespace_id = ANY ('{75577641,74166478,74138654,71037312,62674588,62039593,58264541,52469398,51082400,48057007,46649240,45393508,44029359,43186173,40916776,39455850,37966217,36791198,36489224,36454318,35337613,35104827,34936209,34770409,34675721,33180782,32987018,31742899,30677483,30677418,30587963,29969475,29930187,28636453,27686541,27681689,27490309,27244047,26761713,25981420,25861038,25847700,25418438,25416006,25402115,25347179,25205782,25033712,25031023,23617987,23156104,23105702,22874770,22057235,21967100,21967079,21808150,21751817,21751536,21597602,21565866,21479995,21439066,20904766,20510065,20468480,20428564,20059805,19657914,19461377,19080884,18943607,18863350,18860383,18741849,18629149,18594390,18505515,18331927,18307889,18307741,18060348,17930014,17661412,17522813,17345914,17334694,17318793,16842968,16603968,16573099,16110032,15904902,15815706,15687385,15445353,15363819,15297693,15158038,14771920}'::bigint[]))
                           Buffers: shared hit=231 read=240
                           I/O Timings: read=221.462 write=0.000
                     ->  Index Scan using index_fork_network_members_on_project_id on public.fork_network_members  (cost=0.43..3.45 rows=1 width=16) (actual time=0.785..0.785 rows=0 loops=68)
                           Index Cond: (fork_network_members.project_id = projects.id)
                           Buffers: shared hit=125 read=83
                           I/O Timings: read=52.346 write=0.000
               ->  Index Scan using index_fork_network_members_on_project_id on public.fork_network_members fork_network_members_projects_join  (cost=0.43..3.45 rows=1 width=12) (actual time=0.010..0.010 rows=1 loops=4)
                     Index Cond: (fork_network_members_projects_join.project_id = projects.id)
                     Buffers: shared hit=16
                     I/O Timings: read=0.000 write=0.000
         ->  Index Scan using fork_networks_pkey on public.fork_networks  (cost=0.42..0.45 rows=1 width=57) (actual time=2.179..2.179 rows=1 loops=4)
               Index Cond: (fork_networks.id = fork_network_members_projects_join.fork_network_id)
               Buffers: shared hit=6 read=10
               I/O Timings: read=8.616 write=0.000
   ->  Index Scan using projects_pkey on public.projects forked_from_projects_projects  (cost=0.56..1.51 rows=1 width=848) (actual time=3.537..3.537 rows=1 loops=4)
         Index Cond: (forked_from_projects_projects.id = fork_network_members_projects_join.forked_from_project_id)
         Buffers: shared hit=4 read=11
         I/O Timings: read=14.012 write=0.000
Settings: jit = 'off', effective_cache_size = '472585MB', seq_page_cost = '4', work_mem = '100MB', random_page_cost = '1.5'
Time: 309.089 ms
  - planning: 8.032 ms
  - execution: 301.057 ms
    - I/O read: 296.436 ms
    - I/O write: 0.000 ms

How to set up and validate locally

  1. Enable archive_group, destroy_fork_network_on_archive, and destroy_fork_network_on_group_archive (see the console snippet after this list)
  2. Create a group
  3. Create a project under the group
  4. Fork the project
  5. Create an MR from the fork
  6. Go back to the group -> Settings -> General -> Archive
  7. Check that the project has also been archived, the fork relationship has been removed, and the open MR has been closed.
  8. If everything checks out, mission accomplished!!!
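
For step 1, assuming the three names above are feature flags, they can be enabled from the Rails console with Feature.enable. The check below for step 7 is likewise only an illustration; the project path is a placeholder for your local fork.

  (In rails c)
  # Step 1: enable the flags
  Feature.enable(:archive_group)
  Feature.enable(:destroy_fork_network_on_archive)
  Feature.enable(:destroy_fork_network_on_group_archive)

  # Step 7: sanity-check the fork after archiving the group
  forked_project = Project.find_by_full_path('your-user/forked-project')
  forked_project.forked?                               # => false once the relationship is removed
  forked_project.source_of_merge_requests.opened.count # => 0 once its open MRs are closed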

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
