Monitor scheduled pipeline creation latency

The Problem

As GitLab has grown over the years, we've periodically struggled with creating lots of scheduled pipelines at once. Most recently, a customer opened an issue about long delays between the cron-scheduled time and the actual creation time:

Pipeline schedules experiencing delays >1 hour ... (#564744 - closed)

As we continue to scale, it would be good for us to pay attention to the ongoing quality of this feature.

(A) Proposal(s)

As we grow and scale our CI platform, we should pay attention to the amount of time it takes us to create a pipeline once its scheduled time passes. This isn't to say that we expect every scheduled pipeline to be created instantly; there are known and designed limitations in place.

However, it would be good to know what sort of timeframes are typical: p50, p95, p99, etc. This is something we could throw together in a relatively simple Kibana chart, or add as a custom metric to our Grafana dashboard.
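
To make the percentile idea concrete, here is a minimal sketch of how p50/p95/p99 creation latency could be computed from pairs of timestamps. It assumes we can export, for each scheduled run, the cron-scheduled time and the actual pipeline creation time; the function names and the (scheduled_at, created_at) record shape are hypothetical and do not reflect an existing GitLab API, log schema, or metric.

```python
from datetime import datetime, timedelta


def creation_latencies(records):
    """Return latency in seconds for each (scheduled_at, created_at) pair.

    The pair layout is an assumption made for illustration only.
    """
    return [(created - scheduled).total_seconds() for scheduled, created in records]


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it.

    pct is expected to be an integer between 1 and 100.
    """
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(records):
    """Summarize creation latency (seconds) at p50, p95, and p99."""
    latencies = creation_latencies(records)
    return {f"p{p}": percentile(latencies, p) for p in (50, 95, 99)}


if __name__ == "__main__":
    # Synthetic sample data, not real measurements.
    scheduled = datetime(2025, 1, 1, 12, 0, 0)
    sample = [(scheduled, scheduled + timedelta(seconds=s)) for s in (5, 20, 90, 600)]
    print(summarize(sample))
```

Nearest-rank is used here only because it is easy to reason about; a Kibana or Grafana chart would normally rely on the built-in percentile or histogram aggregation of the backing datastore, which may interpolate differently.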

Caveat: If we add this to Grafana, we should be extra careful in the feature design and understand what SLOs we can truly commit to before it gets baked into our stage apdex.

Edited by drew stachon