Rebalance pipeline sidekiq nodes so that they are able to handle all pipeline workloads

As discussed in gitlab-com/www-gitlab-com#4951 (comment 200475388), the number of pipeline jobs that we handle on GitLab.com is much higher than the number of pipeline jobs that are produced.

The only reason we manage to keep up with the number of jobs that are produced is that the besteffort nodes are configured (by accident or deliberately, it's hard to know) to pick up the majority of pipeline jobs.

This issue tracks the task of ensuring that the pipeline queue is right-sized to handle all pipeline jobs.

Once this is done, we will be able to stop sending pipeline jobs to besteffort workers.

Steps

  • Analyse the pipeline queue traffic so that we better understand the jobs that are running
  • Categorise the jobs and propose SLOs for each category
  • Optimise the cluster based on the SLOs
  • Work out how many more nodes the pipeline priority would need to pick up the slack currently handled by `besteffort`
  • Scale the fleet up
  • Reconfigure the priority queue rules to prevent the besteffort nodes from picking pipeline jobs up
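
For the sizing step above, a rough back-of-the-envelope estimate could look like the sketch below. It is only an illustration, not how the fleet is actually sized: the function name, its parameters, and the default utilisation target are all hypothetical, and the real inputs would have to come from the queue traffic analysis. It applies Little's law (jobs in flight = arrival rate × mean job duration) to the pipeline jobs currently picked up by besteffort nodes.

```python
import math


def extra_pipeline_nodes(offloaded_jobs_per_sec, mean_job_seconds,
                         threads_per_node, target_utilisation=0.5):
    """Estimate how many extra pipeline nodes would absorb the pipeline
    jobs that besteffort nodes currently pick up.

    All parameters are hypothetical inputs for illustration; measure them
    from real queue traffic before relying on the result.
    """
    if threads_per_node <= 0:
        raise ValueError("threads_per_node must be positive")
    if not 0 < target_utilisation <= 1:
        raise ValueError("target_utilisation must be in (0, 1]")

    # Little's law: average number of jobs in flight equals the arrival
    # rate multiplied by the mean time each job spends running.
    busy_threads = offloaded_jobs_per_sec * mean_job_seconds

    # Leave headroom so nodes are not planned at 100% thread usage.
    usable_threads_per_node = threads_per_node * target_utilisation

    return math.ceil(busy_threads / usable_threads_per_node)


# Example with made-up numbers: 50 jobs/s offloaded to besteffort,
# 2 s mean duration, 25 worker threads per node, 50% target utilisation.
print(extra_pipeline_nodes(50.0, 2.0, 25, 0.5))
```

The estimate uses mean duration only, so it says nothing about queueing latency under bursts; the per-category SLOs from the earlier steps would determine how much headroom is actually needed.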