Skip to content

Fix project transfer corrupting shared runners state

Adrien Kohlbecker requested to merge ak/fix-project-transfer into master

What does this MR do?

See #271728 (closed)

Fixes an issue where a project transferred to a group that disables the use of shared runners can still access the shared runners.

The database is in an inconsistent state, the project below should have shared_runners_enabled: false:

[ gprd ] production> project = Project.find_by_full_path('issue-testing/test-11_02')
=> #<Project id:22177621 issue-testing/test-11_02>>
[ gprd ] production> project.shared_runners_enabled
=> true
[ gprd ] production> project.shared_runners.count
=> 15
[ gprd ] production> project.group.shared_runners_setting
=> "disabled_and_unoverridable"

To fix this, a post-deployment migration is required to make sure the inconsistent state is reverted. Could the database folks confirm that this will indeed fix the problem on dot com and self-managed?

This was introduced in 13.5 by !36080 (merged), so I'm setting ~"Pick into 13.5" and ~"Pick into 13.6" based on the database becoming inconsistent.

Migration output

➜ ak@Adriens-MacBook-Pro gitlab git:(ak/fix-project-transfer) bundle exec rake db:rollback 
== 20201110161542 CleanupTransferedProjectsSharedRunners: reverting ===========
== 20201110161542 CleanupTransferedProjectsSharedRunners: reverted (0.0000s) ==

➜ ak@Adriens-MacBook-Pro gitlab git:(ak/fix-project-transfer) ✗ bundle exec rake db:migrate 
== 20201110161542 CleanupTransferedProjectsSharedRunners: migrating ===========
== 20201110161542 CleanupTransferedProjectsSharedRunners: migrated (0.0683s) ==

Plan

explain UPDATE "projects" SET "shared_runners_enabled" = false WHERE "projects"."namespace_id" IN (SELECT "namespaces"."id" FROM "namespaces" WHERE "namespaces"."id" >= 8161*1000 AND "namespaces"."id" < (8161+1)*1000 AND "namespaces"."shared_runners_enabled" = false AND "namespaces"."allow_descendants_override_disabled_shared_runners" = false)

 ModifyTable on public.projects  (cost=0.87..1338.77 rows=2 width=803) (actual time=4658.429..4658.432 rows=0 loops=1)
   Buffers: shared hit=26207 read=3140 dirtied=1918
   I/O Timings: read=4442.447
   ->  Nested Loop  (cost=0.87..1338.77 rows=2 width=803) (actual time=24.947..1063.581 rows=430 loops=1)
         Buffers: shared hit=870 read=646 dirtied=11
         I/O Timings: read=1047.070
         ->  Index Scan using namespaces_pkey on public.namespaces  (cost=0.43..1310.57 rows=1 width=10) (actual time=16.982..308.121 rows=217 loops=1)
               Index Cond: ((namespaces.id >= 8161000) AND (namespaces.id < 8162000))
               Filter: ((NOT namespaces.shared_runners_enabled) AND (NOT namespaces.allow_descendants_override_disabled_shared_runners))
               Rows Removed by Filter: 714
               Buffers: shared hit=212 read=220
               I/O Timings: read=304.286
         ->  Index Scan using index_projects_on_namespace_id_and_id on public.projects  (cost=0.43..28.02 rows=18 width=752) (actual time=1.686..3.470 rows=2 loops=217)
               Index Cond: (projects.namespace_id = namespaces.id)
               Buffers: shared hit=658 read=426 dirtied=11
               I/O Timings: read=742.784

Expected timing

9206978 records to update (size of `namespaces` at 2020-11-21)
Number of jobs enqueued = 9206978 / 25000 = 369 (each job processes 25K namespaces)

Each job processes namespaces in 1k batches
batch size = 1000
Estimated times per 1k batch: 4.5 sec worst case scenario in `#database-lab`
Worst case scenario Total time per job: 4.5 * 25 = 112.5 sec 

Delay between sidekiq jobs = 3 minutes (to cover cases when due to locks or other parameters the job may take even more than 120sec)

Total time for all background jobs to run = 369 * 3min = 1107 min = 18.45 hours

Screenshots (strongly suggested)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Yannis Roussos

Merge request reports