Update to measure Mean Time to First Failure
From our recent Key Review https://docs.google.com/document/d/1xbXfSwHrJZvtJIpj0t45IkWy_sEzEXbDj_56dSSJTQc/edit#bookmark=id.7emgak9n7qlv
- Mek: Covered in the video, but maybe not in the written agenda. We have average time to failure; it improved in April, with an uptick in May. Given the data lag, we believe it will smooth out.
- Kyle: This is full pipeline duration, not time to first failure. https://about.gitlab.com/handbook/engineering/quality/performance-indicators/#gitlab-project-merge-request-pipeline-average-time-to-failure
- Sid: I don’t understand, it’s not called pipeline duration right now?
- Kyle: It measures how long a pipeline takes to finish, given that it fails.
- Sid: As a developer, I’d like to see a shorter time to first failure, so I know where to focus.
- Kyle: That’s something we can look at more closely. When we experimented, we got feedback that people want the full set of feedback rather than short-circuiting everything else that runs.
- Sid: Not proposing stopping the rest either. The idea of this is I want to know if there is a problem and that it’s not green. The way we will improve this is by scheduling tests most likely to fail early. Run the ones that failed last time first. It will help bring time down and help developers. After that keep running and give all the failures. It’s really average time to first failure that I’m after to improve the feedback cycle for the developer.
- Mek: We will rename to average time to first failure.
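Sid's suggestion in the discussion above (run the tests that failed last time first, then keep running everything) can be sketched as a simple reordering step. This is only an illustration under stated assumptions: `SpecHistory`, its fields, and `order_for_fast_feedback` are hypothetical names, and the failure-rate tiebreak is an added assumption, not something decided in the Key Review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecHistory:
    """Hypothetical per-test history record (not an existing GitLab structure)."""
    name: str
    failed_last_run: bool
    recent_failure_rate: float  # assumed: fraction of recent runs that failed, 0.0 to 1.0


def order_for_fast_feedback(specs):
    """Reorder tests so likely failures run first. No test is dropped.

    Sort key, in priority order:
      1. tests that failed on the previous run come first,
      2. then tests with the higher recent failure rate (assumed tiebreak),
      3. then by name, to keep the order deterministic.
    """
    return sorted(
        specs,
        key=lambda s: (not s.failed_last_run, -s.recent_failure_rate, s.name),
    )


if __name__ == "__main__":
    history = [
        SpecHistory("a_spec", failed_last_run=False, recent_failure_rate=0.0),
        SpecHistory("b_spec", failed_last_run=True, recent_failure_rate=0.1),
        SpecHistory("c_spec", failed_last_run=False, recent_failure_rate=0.5),
    ]
    for spec in order_for_fast_feedback(history):
        print(spec.name)
```

Because the function only reorders, the full set of failures is still reported; only the moment of the first red signal moves earlier.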
Task
- Measure the time to first failure
- Rename KPI to Average time to first failure
- Update the KPI description to remove link to this issue
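To make the difference between the two measurements concrete, here is a minimal sketch that computes both for a failed pipeline. The `Job` and `Pipeline` records and their field names are hypothetical and do not mirror the GitLab API or the data warehouse schema; measuring from pipeline creation is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple


@dataclass(frozen=True)
class Job:
    """Hypothetical job record."""
    name: str
    status: str  # assumed values: "success" or "failed"
    finished_at: datetime


@dataclass(frozen=True)
class Pipeline:
    """Hypothetical pipeline record."""
    created_at: datetime
    finished_at: datetime
    jobs: Tuple[Job, ...]


def duration_given_failure(p: Pipeline) -> Optional[timedelta]:
    """What the KPI measured so far: full duration of a pipeline that failed."""
    if not any(j.status == "failed" for j in p.jobs):
        return None
    return p.finished_at - p.created_at


def time_to_first_failure(p: Pipeline) -> Optional[timedelta]:
    """Proposed measure: time from pipeline creation to the earliest failed job."""
    failed = [j.finished_at for j in p.jobs if j.status == "failed"]
    if not failed:
        return None
    return min(failed) - p.created_at


def average(deltas) -> Optional[timedelta]:
    """Mean of the non-None durations, or None if there are none."""
    values = [d for d in deltas if d is not None]
    if not values:
        return None
    return sum(values, timedelta()) / len(values)


if __name__ == "__main__":
    t0 = datetime(2020, 6, 1, 12, 0)
    pipeline = Pipeline(
        created_at=t0,
        finished_at=t0 + timedelta(minutes=45),
        jobs=(
            Job("lint", "success", t0 + timedelta(minutes=5)),
            Job("rspec", "failed", t0 + timedelta(minutes=12)),
            Job("qa", "failed", t0 + timedelta(minutes=40)),
        ),
    )
    print("duration given failure:", duration_given_failure(pipeline))
    print("time to first failure:", time_to_first_failure(pipeline))
```

Passing pipelines return `None` from both functions, so `average` only aggregates pipelines that actually failed.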
Edited by Mek Stittri