Understand why scheduled master pipeline success rates are very low
Since we didn't have a pipeline stability dashboard on Snowflake that we could play around with, I decided to create one: https://app.snowflake.com/ys68254/gitlab/#/pipeline-successful-rates-d4GqRCjeZ
Here are the issues where I talked about why I think tracking retries might reflect productivity better:
- https://gitlab.com/gitlab-org/quality/engineering-productivity/team/-/issues/364#note_1757757072
- https://gitlab.com/gitlab-org/quality/engineering-productivity/team/-/issues/364#note_1762555830
- https://gitlab.com/gitlab-org/quality/engineering-productivity/team/-/issues/364#note_1762804622
- https://gitlab.com/gitlab-org/quality/engineering-productivity/snowflake-dashboard-sql/-/issues/34
A few conclusions I draw from this dashboard:
- Looking at all master pipelines, the success rates are good (95% to 98%), and the NOT-retried rates are a bit lower but close enough (94% to 95%).
- It's a very different story for scheduled master pipelines, though:
  - For nightly, the success rate varies a lot, from 15% to 50%, and has been mostly around 25% recently. We can safely say that it's extremely unstable.
  - For 2-hourly, it's very surprising: the success rate is also very bad, at roughly 50%. However, if we look at retries, starting from June we retried basically ALL of the pipelines, and even then we could only keep the success rate at 50%.
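To make the two metrics above concrete, here is a minimal sketch of how a success rate and a NOT-retried rate could be computed per pipeline type. This is not the dashboard's actual SQL: the `Pipeline` record, its field names, and the sample data are all hypothetical, and I'm assuming "retried" means at least one job in the pipeline was retried.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Pipeline:
    # Hypothetical record shape; the real Snowflake tables likely differ.
    kind: str        # e.g. "nightly" or "2-hourly"
    succeeded: bool  # final pipeline status was a success
    retried: bool    # assumption: at least one job in the pipeline was retried


def rates_by_kind(pipelines):
    """Return {kind: (success_rate, not_retried_rate)} as fractions in [0, 1]."""
    totals = defaultdict(lambda: [0, 0, 0])  # [count, succeeded, not retried]
    for p in pipelines:
        t = totals[p.kind]
        t[0] += 1
        t[1] += p.succeeded
        t[2] += not p.retried
    return {kind: (s / n, nr / n) for kind, (n, s, nr) in totals.items()}


# Made-up sample data, only to show the shape of the calculation.
sample = [
    Pipeline("2-hourly", succeeded=True, retried=True),
    Pipeline("2-hourly", succeeded=False, retried=True),
    Pipeline("nightly", succeeded=False, retried=False),
    Pipeline("nightly", succeeded=True, retried=False),
    Pipeline("nightly", succeeded=False, retried=True),
    Pipeline("nightly", succeeded=False, retried=True),
]
rates = rates_by_kind(sample)
```

Note that the two rates are independent of each other: in the made-up 2-hourly sample every pipeline is retried, so the NOT-retried rate is 0, yet only half of them succeed.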
Why is this happening? Let's find answers to the following questions:
- Why are scheduled pipeline success rates so low? Nightly is at about 25% and 2-hourly at about 50%.
- What's the impact of this disparity?
- Did we really retry ALL scheduled 2-hourly pipelines in June and July?
- Why are we retrying so hard while the rates are still quite low? Is retrying helpful, or is it giving us an illusion?
- How can we measure how pipeline stability affects productivity in people's daily merge requests?
  - There are a lot of reasons a merge request pipeline can fail (for example, it can be a legitimate bug), but here are the charts for merge requests as a reference:

To reiterate, I think looking at how often merge request pipelines are retried can, to some extent, reflect the productivity loss caused by pipeline instability and people's lack of confidence in the pipelines.
Edited by Lin Jen-Shin