Reduce contention on updating runner_id in ci_builds
From https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/8395#note_243107996:
On the master, UPDATE ci_builds .. queries were taking 3/4 of all total_time today, see K003 for the master:
This is becoming more and more noticeable: every second, 8 out of the 12 seconds spent by the master's CPUs went to this work.
It's a contention problem: queries are blocked by other queries that work on the same records. Note how low shared_blks_hit and shared_blks_read are: we don't have a lot of data to process, we just wait while blocked. This leads to more and more Postgres backends being blocked and sitting in the active state. We should avoid this.
Can we rework it to use SELECT .. FOR UPDATE SKIP LOCKED or SELECT .. FOR UPDATE NOWAIT? Either would completely solve the problem of waiting while blocked: SKIP LOCKED skips rows that another transaction has already locked, and NOWAIT fails immediately instead of waiting. Similar work was done recently for merge_requests by @stanhu, see !18481 (diffs).
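To make the difference between the two locking modes concrete, here is a minimal in-memory sketch in Ruby. It is only an analogy, not the actual GitLab code and not real SQL: each `Build` struct stands in for a ci_builds row, its `Mutex` stands in for the Postgres row lock, and the names `Build`, `new_queue`, `claim_skip_locked`, `claim_nowait` and `LockNotAvailable` are all hypothetical. In SQL terms, `claim_skip_locked` corresponds roughly to picking a candidate row with SELECT .. FOR UPDATE SKIP LOCKED before setting runner_id, and `claim_nowait` to SELECT .. FOR UPDATE NOWAIT on one specific row.

```ruby
# In-memory analogy only (hypothetical names throughout): a Build stands in
# for a ci_builds row, and its Mutex stands in for the row lock.
Build = Struct.new(:id, :runner_id, :lock)

# Raised by the NOWAIT analogy when the row is already locked.
class LockNotAvailable < StandardError; end

def new_queue(size)
  (1..size).map { |id| Build.new(id, nil, Mutex.new) }
end

# SKIP LOCKED analogy: try each candidate without blocking and move on to
# the next one when another worker already holds its lock.
def claim_skip_locked(builds, runner_id)
  builds.each do |build|
    next unless build.lock.try_lock # non-blocking; false if already locked

    begin
      if build.runner_id.nil?
        build.runner_id = runner_id
        return build # the ensure below still releases the lock
      end
    ensure
      build.lock.unlock
    end
  end
  nil # nothing claimable right now; the caller can retry later
end

# NOWAIT analogy: fail immediately instead of waiting for the lock.
def claim_nowait(build, runner_id)
  unless build.lock.try_lock
    raise LockNotAvailable, "build #{build.id} is locked"
  end

  begin
    build.runner_id = runner_id
    build
  ensure
    build.lock.unlock
  end
end
```

In both variants a worker never sits waiting on a lock held by someone else, which is the behaviour asked for above. With the SKIP LOCKED style the worker simply takes the next free build. With the NOWAIT style the caller has to rescue the error and decide whether to retry or give up.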
This SQL query is likely happening in assign_runner!: https://gitlab.com/gitlab-org/gitlab/blob/febfd21ee6733f11b9713ab7ee968bae190c10ef/app%2Fservices%2Fci%2Fregister_job_service.rb#L95-116