Introduce additional DB table (acceleration structure) to optimise job queueing (as an intermediate solution for better queueing)
Currently we run a very expensive query on top of `ci_builds` (see gitlab-com/gl-infra/production#3712 (closed)), where we:
- look for matching projects
- look for pending builds
- match tags
- match other filters
- look at quota
This is a problem:
- `ci_builds` is expensive to access: it is a very wide table, and queries against it often time out
- `ci_builds` cannot be partitioned, as otherwise we would not be able to fetch all jobs
- for accessing `tags` we cross-join another table, `taggings`
- for accessing quota we cross-join `project/namespace`
- we check access level based on `project/namespace`
As a way to accelerate filtering:
- Introduce a `ci_pending_builds` table
- Design the table so that we do not have to load `ci_builds` (a very wide table) for filtering as part of the query in `RegisterJobService`
- We would still load `ci_builds` for the purpose of accepting a build, but the filtering should be significantly faster and provide more capacity
- This would allow us to partition `ci_builds` without breaking queueing
- The table would contain as much data as possible for build matching: at least `tags`, `protected`, `project_id`, and whatever else is needed
- Insert the build into the table on status transition to `pending`, as part of the state machine
- Delete the item from the table on status transition from `pending`, as part of the state machine
- Change `RegisterJobService` to filter using `ci_pending_builds` instead of `ci_builds`
- We assume that queries would have a significantly lower cost, as the data would be much easier and cheaper to access, and Postgres would be able to hold this pending queue in memory for quick filtering
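
The flow proposed in the list above can be sketched roughly as follows. This is a minimal illustration only, using Python and an in-memory SQLite database rather than the actual Rails/Postgres code. The column set, the space-separated `tags` encoding, and the helpers `transition`, `pick_pending_build`, and `register_job` are all assumptions made for the example, not an existing schema or API.

```python
import sqlite3

# Hypothetical, heavily simplified schema. The real ci_builds table is far
# wider; ci_pending_builds holds only what is needed for matching.
SCHEMA = """
CREATE TABLE ci_builds (
    id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL,
    status TEXT NOT NULL,
    protected INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE ci_pending_builds (
    build_id INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL,
    protected INTEGER NOT NULL DEFAULT 0,
    tags TEXT NOT NULL DEFAULT ''
);
"""


def connect():
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def transition(db, build_id, new_status, tags=()):
    """Stand-in for a state-machine hook that keeps the queue table in sync."""
    project_id, protected, old_status = db.execute(
        "SELECT project_id, protected, status FROM ci_builds WHERE id = ?",
        (build_id,),
    ).fetchone()
    db.execute("UPDATE ci_builds SET status = ? WHERE id = ?", (new_status, build_id))
    if new_status == "pending":
        # Transition to pending: add the build to the acceleration structure.
        db.execute(
            "INSERT OR REPLACE INTO ci_pending_builds VALUES (?, ?, ?, ?)",
            (build_id, project_id, protected, " ".join(sorted(tags))),
        )
    elif old_status == "pending":
        # Transition from pending: remove it again.
        db.execute("DELETE FROM ci_pending_builds WHERE build_id = ?", (build_id,))
    db.commit()


def pick_pending_build(db, runner_tags, project_ids, protected_only=False):
    """Filter using ci_pending_builds only; ci_builds is not read here."""
    if not project_ids:
        return None
    placeholders = ",".join("?" * len(project_ids))
    sql = (
        "SELECT build_id, tags FROM ci_pending_builds "
        f"WHERE project_id IN ({placeholders})"
    )
    if protected_only:
        sql += " AND protected = 1"
    sql += " ORDER BY build_id"
    for build_id, tags in db.execute(sql, list(project_ids)).fetchall():
        # A build matches when the runner has every tag the build requires.
        # Tag matching is done in Python here purely to keep the sketch short.
        if set(tags.split()) <= set(runner_tags):
            return build_id
    return None


def register_job(db, runner_tags, project_ids, protected_only=False):
    """Pick from the narrow table, then touch ci_builds only to accept."""
    build_id = pick_pending_build(db, runner_tags, project_ids, protected_only)
    if build_id is not None:
        transition(db, build_id, "running")
    return build_id
```

In this sketch the wide table is read only when a single build is accepted, while all matching happens against the narrow queue table. A real implementation would also need to handle quota, access levels, and concurrent runners picking the same build, none of which are modelled here.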
This acceleration structure is proposed as a follow-up on gitlab-com/gl-infra/production#3712 (closed). If designed properly, it could be used for all future work on queueing as well.
This can be a way to improve performance today without spending a lot of effort on it, potentially as a throw-away solution without a lot of impact on the codebase (hopefully).
Edited by Grzegorz Bizon