Improve `PipelineProcessService`

Summary

Currently, `PipelineProcessService` is executed multiple times. This adds considerable overhead to the amount of compute it uses.

We should do the following (these are my rough notes on improving this service; there is a lot we can do, likely enough to bring it down to about 10% of the current time):

  1. Debug all SQL queries being executed,
  2. De-duplicate the jobs being executed; this is being done in https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/31370,
  3. Make pipeline processing more efficient,
  4. Serialise the updates created by processing individual jobs during `process!`,
  5. Target stage updates at the builds being triggered by `process!`,
  6. Ignore the pipeline update if the build was retried,
  7. Be aware of `when:` so that we update only the builds that are actually affected,
  8. Remove any potential N+1 queries; try to pre-calculate the status in SQL only, or be clever about discovering which "created" builds should be processed,
  9. We use `pipeline.builds` to gather the status of prior stages; this seems to be a bug, as we should be using `pipeline.statuses` (to also include bridges),
  10. Ensure that we always use `.find_each`,
  11. Ensure that stage updates are sequential across all concurrent runs,
  12. Remove RuboCop offenses,
  13. Remove the deprecated code for `update_retried`, as we no longer need it.
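
To illustrate points 7 to 9, here is a minimal plain-Ruby sketch of how the "created" jobs to process could be selected from the statuses of prior stages. It is not the service's actual code: the `Job` struct, the `COMPLETED` list, and `jobs_to_enqueue` are hypothetical names, the status and `when:` values are a simplified subset, and details such as `allow_failure` are ignored.

```ruby
# Hypothetical, simplified stand-in for a pipeline status (build or bridge).
Job = Struct.new(:name, :stage_idx, :status, :when_condition, :type, keyword_init: true)

# Simplified assumption: statuses that count as "finished".
COMPLETED = %w[success failed canceled skipped].freeze

# Returns the "created" jobs whose earlier stages have all finished and
# whose `when:` condition matches the outcome of those stages.
def jobs_to_enqueue(statuses)
  created = statuses.select { |job| job.status == 'created' }

  created.select do |job|
    prior = statuses.select { |other| other.stage_idx < job.stage_idx }
    next false unless prior.all? { |other| COMPLETED.include?(other.status) }

    failed = prior.any? { |other| other.status == 'failed' }
    case job.when_condition
    when 'on_success' then !failed
    when 'on_failure' then failed
    when 'always'     then true
    else false # 'manual', 'delayed', etc. are out of scope for this sketch
    end
  end
end

statuses = [
  Job.new(name: 'compile', stage_idx: 0, status: 'success', when_condition: 'on_success', type: :build),
  Job.new(name: 'trigger', stage_idx: 0, status: 'running', when_condition: 'on_success', type: :bridge),
  Job.new(name: 'test',    stage_idx: 1, status: 'created', when_condition: 'on_success', type: :build)
]

# With all statuses, the running bridge still blocks the next stage.
all_result = jobs_to_enqueue(statuses).map(&:name)               # => []

# With builds only, the bridge is ignored and "test" is enqueued too early.
builds_only = statuses.select { |job| job.type == :build }
builds_result = jobs_to_enqueue(builds_only).map(&:name)         # => ["test"]
```

The second call shows why looking only at builds can be a problem: the unfinished bridge in the earlier stage is invisible, so the next stage appears ready when it is not.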
Edited by Kamil Trzciński