Average cost per MR pipeline has increased over the last 5 days
Problem
Why has average cost per pipeline increased from 2020-05-11 to 2020-05-15?
Contributing factors
- On 2020-05-13 and 2020-05-14, 188 builds ran for more than double the maximum timeout.
- Runner System and Timeout errors increased 2.5-fold during the week of 2020-05-11, so more builds ran for 90 minutes than in previous weeks.
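A rough sketch of how the two figures above could be re-derived from an export of build records. The `Build` fields, the helper names, the 5400-second timeout, and all sample values are assumptions for illustration, not the real schema or data:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Build:
    # Hypothetical record shape; the real export may use different fields.
    id: int
    duration_s: float              # observed duration in seconds
    timeout_s: float               # timeout configured for the build, in seconds
    failure_reason: Optional[str]  # e.g. "runner_system_failure", or None


def overrunning_builds(builds: List[Build], factor: float = 2.0) -> List[Build]:
    """Return builds whose duration is more than `factor` times their timeout."""
    return [b for b in builds if b.duration_s > factor * b.timeout_s]


def fold_increase(current_count: int, previous_count: int) -> float:
    """Ratio of this period's error count to the previous period's count."""
    if previous_count == 0:
        raise ValueError("previous_count must be non-zero")
    return current_count / previous_count


# Made-up sample data, assuming a 90-minute (5400 s) timeout.
sample = [
    Build(id=1, duration_s=3000, timeout_s=5400, failure_reason=None),
    Build(id=2, duration_s=5400, timeout_s=5400, failure_reason="job_execution_timeout"),
    Build(id=3, duration_s=11000, timeout_s=5400, failure_reason=None),
]

overrun_ids = [b.id for b in overrunning_builds(sample)]
ratio = fold_increase(current_count=50, previous_count=20)  # counts are invented
```

Running the same filter per day would show whether the overrunning builds cluster on specific dates or runners.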
Hypothesis
- A change has caused pipelines to enter stuck or unique states more frequently than previously seen
- An example of a "unique state" is builds that were cancelled after drastically exceeding the timeout: https://gitlab.com/gitlab-org/gitlab/-/jobs/550823657
Open Questions
- Why do some build durations exceed double the maximum_timeout?
- What could have contributed to the increase in stuck/timeout or Runner System errors during the week of 2020-05-11?
Edited by Kyle Wiebers