Optimise `builds.each` of `RegisterJobService`
Problem
Currently the `RegisterJobService` uses `builds.each` to iterate over candidate builds and pick the next one, but:
- sometimes we fetch 1k objects
- we allocate and populate all objects at once
- we over-allocate memory
- sometimes the queue depth is 1-3, which means that the objects returned are already stale, and this increases the number of `409`, `can_pick?` and `InvalidStateMachine` types of errors
Possible optimisations
We cannot really use `.find_each(batch_size: 1)`, as this imposes a `limit` and requires predictable sorting (it orders by primary key), whereas `builds` are sorted based on different criteria, which in some cases are not a sequential ordering. What we can do:
- fetch each build individually
- it should still have a very big positive effect on the system due to the reduction in picking violations
Change the loop to be:
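    # Load only the IDs up front instead of instantiating every build
    builds_id = builds.pluck(:id)
    histogram_queue_size.observe(builds_id.count)
    builds_id.each do |build_id|
      # Fetch a fresh copy of each build right before trying to pick it
      build = Ci::Build.find(build_id)
      # ... existing per-build picking logic ...
    end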
We could actually test this hypothesis by switching the iteration strategy behind a Feature Flag.
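A minimal sketch of such a switch, runnable in plain Ruby without Rails. Everything here is hypothetical: `FakeBuildStore` stands in for the `Ci::Build` table, and the `lazy:` argument stands in for whatever feature-flag check would be used in the service. It only illustrates the trade-off between the two iteration strategies and is not the actual `RegisterJobService` code.

```ruby
# Hypothetical in-memory stand-in for the Ci::Build table (not GitLab code).
class FakeBuildStore
  Build = Struct.new(:id, :status)

  attr_reader :query_count, :loaded_count

  def initialize(rows)
    @rows = rows          # { id => status }, insertion order is the queue order
    @query_count = 0      # rough analogue of db_count
    @loaded_count = 0     # number of build objects instantiated
  end

  # Eager path, like builds.each: one query that instantiates every row.
  def all
    @query_count += 1
    @rows.map { |id, status| load_row(id, status) }
  end

  # Like builds.pluck(:id): one query that returns plain integers only.
  def pluck_ids
    @query_count += 1
    @rows.keys
  end

  # Like Ci::Build.find(id): one query returning one fresh object.
  def find(id)
    @query_count += 1
    load_row(id, @rows.fetch(id))
  end

  private

  def load_row(id, status)
    @loaded_count += 1
    Build.new(id, status)
  end
end

# `lazy` plays the role of the feature flag (assumption for illustration).
def each_candidate(store, lazy:)
  if lazy
    store.pluck_ids.each { |id| yield store.find(id) }
  else
    store.all.each { |build| yield build }
  end
end

# Returns the first pickable build, or nil if there is none.
def pick_next(store, lazy:)
  each_candidate(store, lazy: lazy) do |build|
    return build if build.status == :pending
  end
  nil
end

# 1000 queued builds, of which only the third can be picked.
rows = (1..1_000).to_h { |id| [id, id == 3 ? :pending : :running] }

eager = FakeBuildStore.new(rows)
pick_next(eager, lazy: false)
puts "eager: #{eager.query_count} queries, #{eager.loaded_count} objects"
# prints "eager: 1 queries, 1000 objects"

lazy = FakeBuildStore.new(rows)
pick_next(lazy, lazy: true)
puts "lazy: #{lazy.query_count} queries, #{lazy.loaded_count} objects"
# prints "lazy: 4 queries, 3 objects"
```

In this toy model the per-request query count goes up slightly while the number of instantiated objects drops sharply. It does not model retries caused by stale objects, which is where the expected system-wide saving would come from.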
Metrics to look at
- Number of SQL queries: `db_count`
- Amount of allocated `mem_bytes`
- The `queue_depth` (aka how many `builds_id` we had to iterate over to pick the next build)
For each of them we should see a reduction in used resources in total (for the whole system).
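As a rough local illustration of the allocation side, the sketch below compares how many Ruby objects are allocated when every row is materialised up front versus one at a time. It uses `GC.stat(:total_allocated_objects)` as a proxy and makes no claim about how `mem_bytes` is actually computed. `Row` and `count_allocations` are hypothetical names introduced only for this example.

```ruby
# Hypothetical lightweight record standing in for a build object.
Row = Struct.new(:id, :status)

# Returns [block result, number of objects allocated while the block ran].
def count_allocations
  before = GC.stat(:total_allocated_objects)
  result = yield
  [result, GC.stat(:total_allocated_objects) - before]
end

ids = (1..1_000).to_a

# Eager: instantiate every row, then use only the first three.
_, eager_allocs = count_allocations do
  ids.map { |id| Row.new(id, :pending) }.first(3)
end

# Lazy: instantiate rows one at a time and stop after three.
_, lazy_allocs = count_allocations do
  ids.lazy.map { |id| Row.new(id, :pending) }.first(3)
end

puts "eager: #{eager_allocs} objects allocated"
puts "lazy:  #{lazy_allocs} objects allocated"
```

The exact numbers depend on the Ruby version, but the eager path allocates at least one object per row, while the lazy path allocates only a handful.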
Edited by Kamil Trzciński