Show runner failure rates in Fleet Dashboard - Groups


Insight

Runner failures were consistently voted the most important feature on the dashboard when validating the solution in https://gitlab.com/gitlab-org/ux-research/-/issues/2403. Seeing runner failure trends over time would help identify larger problems that cause repeated outages in runner performance. It would show when action is needed to fix a problem and how often failures happen (for example, whether they occur only on Friday nights or all the time).

Supporting evidence

So it would be very interesting to see the error rates, like because of runners dying in the middle of something, for example the Docker engine thing I mentioned earlier. That's very valid because, as mentioned, we run on spot instances. So if that is becoming a bigger problem, that runners are dying all the time and it starts to make people annoyed, that would be very interesting to see: is there a trend that this is happening more and more? Do we need to do something about it, like do something about the automatic retry, for example? Currently it's not that big of a problem, but when is it big enough of a problem that we would need to spend a little time looking into how we could make the jobs a bit more robust against these runner failures that are happening?

Action

  • Add a failure-rate stat covering the runners owned by the group
  • Include a "View details" link that takes the user to a full page with a chart of failure trends and a table of failure rates per runner
  • Add the failure rate to the runner details page
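
To make the three numbers above concrete, here is a minimal sketch of how a group-wide failure rate, a per-runner table, and a weekday breakdown (the "Friday night" question) could be derived from the same job records. It is only an illustration: the JobRecord shape, its field names, and the sample data are hypothetical and are not taken from GitLab's data model or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class JobRecord:
    # Hypothetical record shape for illustration; not a GitLab API type.
    runner_id: str
    finished_at: datetime
    runner_failure: bool  # True when the job failed because of the runner


def failure_rate(jobs):
    """Fraction of jobs that ended in a runner failure (0.0 when there are no jobs)."""
    jobs = list(jobs)
    if not jobs:
        return 0.0
    return sum(job.runner_failure for job in jobs) / len(jobs)


def rate_by(jobs, key):
    """Group jobs by key(job) and return {group: failure rate}."""
    groups = defaultdict(list)
    for job in jobs:
        groups[key(job)].append(job)
    return {group: failure_rate(members) for group, members in groups.items()}


# Made-up sample data: two Friday-night failures, three weekday successes.
jobs = [
    JobRecord("runner-a", datetime(2024, 3, 1, 22, 0), True),   # Friday
    JobRecord("runner-a", datetime(2024, 3, 4, 10, 0), False),  # Monday
    JobRecord("runner-b", datetime(2024, 3, 1, 23, 0), True),   # Friday
    JobRecord("runner-b", datetime(2024, 3, 5, 9, 0), False),   # Tuesday
    JobRecord("runner-b", datetime(2024, 3, 6, 9, 0), False),   # Wednesday
]

# Single stat across all runners owned by the group.
group_rate = failure_rate(jobs)

# Table of failure rates per runner.
per_runner = rate_by(jobs, lambda job: job.runner_id)

# Trend by weekday, keyed by datetime.weekday() (0 = Monday, 4 = Friday).
per_weekday = rate_by(jobs, lambda job: job.finished_at.weekday())
```

Keeping the grouping key as a parameter means the dashboard stat, the per-runner table, and the trend chart can share one definition of "failure rate", which would keep the three views consistent with each other.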

Resources

Tasks

  • Assign this issue to the appropriate Product Manager, Product Designer, or UX Researcher.
  • Add the appropriate Group (such as ~"group::source code") label to the issue. This helps identify and track actionable insights at the group level.
  • Link this issue back to the original research issue in the GitLab UX Research project and the Dovetail project.
  • Adjust the confidentiality of this issue if applicable.

