
Add instrumentation to address pipeline creation performance

Follow-up from @grzesiek's comment:

Take a look at this pipeline creation duration histogram.

At the 99th percentile, creating a pipeline sometimes takes 20 seconds. The query uses the rate function, which computes a per-second average over a 1-minute range vector, and the 99th percentile excludes the worst 1% of cases, so individual pipelines can take much longer than this.

PromQL used here:

histogram_quantile(0.99, sum(rate(gitlab_ci_pipeline_creation_duration_seconds_bucket[1m])) by (le))

pipeline_creation_p99_6h

The chart above covers the last 6 hours. It looks much worse over the last 12 hours:

pipeline_creation_p99_12h
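
When reading these p99 values, it may help to remember that histogram_quantile does not see individual durations. It estimates the quantile from cumulative bucket counts and interpolates linearly inside the bucket that contains the target rank. Below is a simplified Python sketch of that estimation for classic histograms. The function name mirrors the PromQL one, but this is only an illustration: the bucket boundaries and counts are invented and are not the real buckets of gitlab_ci_pipeline_creation_duration_seconds.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) pairs.

    `buckets` must end with a (math.inf, total_count) entry, which plays
    the role of the le="+Inf" bucket of a Prometheus histogram.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total  # position of the requested observation
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls into the +Inf bucket, so the best
                # available answer is the highest finite upper bound.
                return buckets[-2][0] if len(buckets) > 1 else math.nan
            if count == prev_count:
                return bound
            # Linear interpolation between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Invented example: 1000 pipelines, 15 of them slower than 10 seconds.
example_buckets = [(1, 900), (5, 980), (10, 985), (30, 1000), (math.inf, 1000)]
p99 = histogram_quantile(0.99, example_buckets)
```

In this made-up example the estimate lands somewhere between 10 and 30 seconds only because those are the edges of the bucket holding the 990th observation. The real values inside that bucket are unknown, which is one more reason the reported 20 seconds should be treated as an estimate.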

Next steps suggested

The next step I would suggest is instrumenting the pipeline creation chain (including fetching and merging includes) with additional histograms that could help identify the causes of the slowness. The slowness might be related to calculating variables, to fetching external includes, or to something else. Additional instrumentation will help uncover the mystery 😸

Proposal

  • Add instrumentation for each step of the chain
  • Add charts to the Verify:PE Grafana Dashboard
  • Add more granular instrumentation (e.g. includes processing or Seed::Build) if necessary
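
The first item of the proposal could be sketched roughly as follows. This is a language-neutral illustration written in Python using only the standard library, not GitLab's actual code or metrics API: StepHistogram, run_chain, the bucket boundaries, and the step names are all hypothetical. The idea is to wrap every step of the chain in a timer and record the duration in a histogram keyed by step name, so slow steps show up individually.

```python
import time
from bisect import bisect_left
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical bucket upper bounds in seconds (not GitLab's real configuration).
BUCKETS = (0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 10.0, 20.0, float("inf"))


class StepHistogram:
    """Minimal per-step duration histogram (a stand-in for a real metrics client)."""

    def __init__(self, buckets=BUCKETS):
        self.buckets = buckets
        self.counts = defaultdict(lambda: [0] * len(buckets))
        self.sums = defaultdict(float)

    def observe(self, step, seconds):
        # Find the first bucket whose upper bound is >= the observed duration.
        index = bisect_left(self.buckets, seconds)
        self.counts[step][index] += 1
        self.sums[step] += seconds

    @contextmanager
    def time(self, step):
        start = time.monotonic()
        try:
            yield
        finally:
            # Record the duration even if the step raises.
            self.observe(step, time.monotonic() - start)


def run_chain(steps, histogram):
    """Run each (name, callable) step in order, timing every step separately."""
    for name, step in steps:
        with histogram.time(name):
            step()


# Hypothetical step names; the real chain has its own steps.
histogram = StepHistogram()
run_chain(
    [
        ("process_includes", lambda: None),
        ("calculate_variables", lambda: None),
        ("seed_builds", lambda: None),
    ],
    histogram,
)
```

Keying the histogram by step name keeps a single metric with one label, which should make it straightforward to chart a per-step p99 next to the overall creation duration. More granular timers (for example around includes processing) could be nested inside a step in the same way if the per-step numbers are not enough.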
Edited by Fabio Pitino