Improve `PipelineProcessService`
Summary
Currently `PipelineProcessService` is executed multiple times.
This results in quite an overhead in the amount of compute that it uses.
We should do the following (these are my rough notes on improvements to this service; we can likely do a lot to bring it down to around 10% of the current time):
- Debug all SQL queries being executed,
- De-duplicate the jobs being executed, being done by https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/31370,
- Make pipeline processing more efficient,
- Serialise updates created by the processing of individual jobs during `process!`,
- Make updates of stages targeted to the builds being triggered by `process!`,
- Ignore the update of the pipeline if a build was retried,
- Be aware of `when:` to update only builds that are in fact affected,
- Remove any potential `N+1`, try to pre-calculate `status` in `SQL` only, or be clever about discovering which builds that are "created" should be processed,
- We use `pipeline.builds.` to gather the status of prior stages; this seems to be a bug, as we should be using `pipeline.statuses` (to also include bridges),
- Ensure that we always use `.find_each`,
- Ensure that updates of `stages` are sequential across all concurrent runs,
- Remove `rubocop` offenses,
- Remove deprecated code for `update_retried`, as we no longer need it.
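As a rough illustration of the `pipeline.builds` versus `pipeline.statuses` point above, here is a minimal plain-Ruby sketch. The `Status` struct, the `composite_status` and `prior_stage_status` helpers, and the simplified precedence rules are all hypothetical and are not GitLab's actual status logic; they only show how leaving out bridges can change the computed status of prior stages, and how retried jobs can be ignored.

```ruby
# Hypothetical stand-in for a CI status row (build or bridge); not GitLab's model.
Status = Struct.new(:name, :type, :stage_idx, :status, :retried, keyword_init: true)

# Simplified, assumed precedence: any running job wins, then pending,
# then failed, otherwise success. The real rules are more involved.
def composite_status(statuses)
  values = statuses.map(&:status)
  return "running" if values.include?("running")
  return "pending" if values.include?("pending")
  return "failed"  if values.include?("failed")

  "success"
end

# Status of all stages before `stage_idx`, ignoring retried jobs.
# `types` restricts which kinds of statuses are considered.
def prior_stage_status(statuses, stage_idx, types: %i[build bridge])
  prior = statuses.select do |s|
    s.stage_idx < stage_idx && !s.retried && types.include?(s.type)
  end
  composite_status(prior)
end

statuses = [
  Status.new(name: "compile", type: :build, stage_idx: 0, status: "success", retried: false),
  Status.new(name: "compile", type: :build, stage_idx: 0, status: "failed", retried: true),
  Status.new(name: "downstream", type: :bridge, stage_idx: 0, status: "running", retried: false),
  Status.new(name: "deploy", type: :build, stage_idx: 1, status: "created", retried: false)
]

builds_only = prior_stage_status(statuses, 1, types: %i[build])
with_bridges = prior_stage_status(statuses, 1)
puts "builds only: #{builds_only}, with bridges: #{with_bridges}"
```

In this made-up data the builds-only view treats the first stage as finished, while the view that includes the bridge still sees it as running, which is the kind of mismatch the note about bridges is concerned with.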
Measurements
A comprehensive explanation of all of the improvements can be found here: #197930 (comment 320405586)
Roll up summary
Max Duration percentile for Pipeline Workers
- Spikes were reduced for PipelineProcessWorker, and the maximum duration for both the 95th and 90th percentiles is reduced by 50%.
DB Duration for Pipeline Workers
- Reduced maximum peaks for PipelineProcessWorker, 25% decrease in maximum duration value.
CPUs for Pipeline Workers
- There is a reduction in spikes; the maximum for PipelineProcessWorker is reduced by 18%.
Average Duration percentile for pipelines.json and pipelines/:id.json
- ~13% faster for 95th percentile for pipelines.json
- ~13% slower for 95th percentile, but 16% faster in 90th percentile for pipelines/:id.json
Max Duration percentile for pipelines.json and pipelines/:id.json
- ~77% faster with ci_composite_status for 95th percentile for pipelines.json
- ~86% faster with ci_composite_status for 90th percentile for pipelines.json
CPUs for pipelines.json and pipelines/:id.json
- CPU usage 11% better for pipelines.json, 35% better at the maximum value for pipelines.json
Accumulative duration per pipeline_id for PipelineProcessWorker
- Average 68% faster when ci_atomic_processing was enabled
- Maximum 75% faster when ci_atomic_processing was enabled
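One way to read the per-pipeline accumulated duration is as the sum of the durations of every `PipelineProcessWorker` run for the same pipeline. The sketch below assumes that reading; the log entries, field names, helper name, and numbers are made up for illustration and are not the measured data.

```ruby
# Hypothetical worker log entries: one per PipelineProcessWorker run.
# Field names and values are illustrative, not real measurements.
runs = [
  { pipeline_id: 1, duration_s: 0.5 },
  { pipeline_id: 1, duration_s: 1.5 },
  { pipeline_id: 2, duration_s: 0.25 },
  { pipeline_id: 1, duration_s: 1.0 }
]

# Sum the duration of all runs that belong to the same pipeline.
def cumulative_duration_per_pipeline(runs)
  runs.each_with_object(Hash.new(0.0)) do |run, totals|
    totals[run[:pipeline_id]] += run[:duration_s]
  end
end

totals = cumulative_duration_per_pipeline(runs)
totals.each { |id, seconds| puts "pipeline #{id}: #{seconds}s" }
```

Under this reading, a pipeline whose service is executed many times accumulates cost with every run, so fewer or cheaper runs per pipeline lower the total directly.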
Max Duration percentile for Pipeline Workers
- ~20% faster average time for PipelineProcessWorker.
- Spikes were reduced for PipelineProcessWorker, and the maximum duration for both the 95th and 90th percentiles is reduced by 21%.
- There are no significant changes in PipelineUpdateWorker and StageUpdateWorker
DB Duration for Pipeline Workers
- 15% faster average DB time for PipelineProcessWorker
- No significant changes in average/maximum database times for rest of Pipeline Workers.
- Increased maximum peak for PipelineProcessWorker, 12% increase in maximum duration value.
CPUs for Pipeline Workers
- No significant changes in average/maximum CPU usage for Pipeline Workers.
Average duration and DB duration for pipelines.json and pipelines/:id.json
- Total duration is ~27% faster with ci_atomic_processing for the pipelines.json API
- DB duration is 20% faster with ci_atomic_processing for the pipelines.json API
- Total duration is ~32% faster with ci_atomic_processing for the pipelines/:id.json API
- DB duration is ~28% faster with ci_atomic_processing for the pipelines/:id.json API
Maximum duration and DB duration for pipelines.json and pipelines/:id.json
- Max duration is 35% faster with ci_atomic_processing for the pipelines.json API
- Max DB duration is 12% faster with ci_atomic_processing for the pipelines.json API
- Max duration is 15% faster with ci_atomic_processing for the pipelines/:id.json API
- Max DB duration is 7% faster with ci_atomic_processing for the pipelines/:id.json API
Average Duration percentile for pipelines.json and pipelines/:id.json
- ~27% faster for 95th percentile for pipelines.json
- ~19% faster for 90th percentile for pipelines.json
- ~25% faster for 50th percentile for pipelines.json
- ~20% faster for 95th percentile for pipelines/:id.json
- ~16% faster for 90th percentile for pipelines/:id.json
- ~41% faster for 50th percentile for pipelines/:id.json
Max Duration percentile for pipelines.json and pipelines/:id.json
- ~98% faster for 95th percentile for pipelines.json
- ~93% faster for 90th percentile for pipelines.json
- ~2% slower for 50th percentile for pipelines.json
- ~13% faster for 95th percentile for pipelines/:id.json
- ~1% slower for 90th percentile for pipelines/:id.json
- ~18% slower for 50th percentile for pipelines/:id.json
CPUs for pipelines.json and pipelines/:id.json
- CPU usage ~26% better for pipelines.json, ~50% better at the maximum value for pipelines.json
- CPU usage ~31% better for pipelines/:id.json, ~17% better at the maximum value for pipelines/:id.json
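The percentages in this roll-up read most naturally as relative changes against the baseline. A minimal sketch of that calculation follows, assuming "X% faster" means the duration dropped by X% of its baseline value; the `percent_faster` helper and the sample numbers are hypothetical and not taken from the measurements.

```ruby
# Relative reduction of `after` versus `before`, in percent.
# Positive means faster (less time); negative means slower.
def percent_faster(before, after)
  raise ArgumentError, "baseline must be positive" unless before.positive?

  ((before - after) / before.to_f * 100).round(1)
end

puts percent_faster(200, 150)  # hypothetical: 200 ms down to 150 ms
puts percent_faster(100, 118)  # hypothetical: 100 ms up to 118 ms
```

With this convention a negative result corresponds to the "slower" entries in the lists above.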