Visualize percentiles of estimated job wait time for instance runners over 3 hours
Insight
Wait time was voted the second most important feature on the dashboard. All users wanted to configure the chart further so they could identify trends for specific groups of runners (filtered by various attributes) over a specific period of time.
Supporting evidence
Runners. Yeah. That could be particularly useful for us and, and a really good indicator of a problem, like if we misconfigured something and our runners quit working or, or if GitLab was having some kind of issue and the runners stopped receiving jobs or whatever, that would be a great indicator depending on how fast this is updated, it would be very useful for us at least. And then what's the group here and project?
It's mostly reactive for us at this point anyway when we're, as you know, we get users reporting problems, we might want to go look at wait times here. Okay. And be able to dig down on those to see, you know, who's waiting
I envision that people are submitting or jobs are getting submitted on behalf of people and as the runners potentially scale up to their limit, if there are jobs that are waiting for a particular set of tags to become available or runner that will run those particular tags to become available, they might wait. And I, I wanna know if if like their jobs are suddenly a lot of jobs starting to wait for runners.
I think this label is not helpful at all. I think look like, look, looking at it as like line over time is a lot more, like the visual thing is a lot more easier, at least for me to understand. Okay. Than a like growth percentage. Yeah. Or like it's, I I think like that's why I looked at the graph because that's, that's where I would like to see it. Okay. It's a lot easier for the brain.
I, I think, well if there's problems, I guess the shorter timeline is fine, but I think like for a longer, like I think the, the need of more runners, for example, it will probably like slowly grow over time depending on the, like what you're working on or getting more, more things to work on or being more active in certain things that have more jobs or things like that. So, so like a longer period of time. But I would say is is like, it it like looking at the trend of how long do we need to wait at least like thinking of our like setup if we would like know this kind of, we don't really know how long the wait time is. But that would be very interestingly, like is there any, it's more of a gut feeling that now, now, like a lot of jobs have been waiting this day, that's probably like we would maybe need more, more runners. Yeah. So, so like the trend is, is more interesting, like do we need to prioritize looking into maybe complicating our setup or just throwing more money at it or, or something like it's, or is are we fine with this wait time? When are we not fine anymore? Like then we could look at the trend and say that it's probably like after summer we need to do something about that. Like looking at how it has been growing.
Well, for, for our setup, these numbers would be like pretty close to zero. Yeah. So this, this thing, this overview of, of everything wouldn't be, I guess that, that helpful. Yeah, maybe. Yeah. I think like if it would be possible to get metrics on the runners, that would be super helpful. Like the wait time, like that this, this graph would be super valuable or for the, the group runners that we have.
Action - Proposal
Add a card for "Wait time to pick up a job" to the Fleet Overview Dashboard with the following:
- Line chart of percentiles of estimated job wait time for instance runners over the last 3 hours.
- Time should be on the X-axis and, ideally, use the timezone that the user has set in their preferences. If no timezone is set, use UTC.
- Wait time should be on the Y-axis.
- The user should be able to hover over the data point for each hour (or each half hour, depending on the interval we choose for the X-axis) to see the different percentile values at that time.
- Mean wait time for instance runners using the single stat component.
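To make the data behind this card concrete, here is a minimal sketch of how the chart series and the single stat could be derived. It is an illustration only, not GitLab's implementation. The sample shape (`queued_at`, `wait_seconds`), the 30-minute bucket, the percentile set, and the nearest-rank method are all assumptions; the proposal leaves these open.

```python
import math
from datetime import datetime, timedelta, timezone
from statistics import mean
from zoneinfo import ZoneInfo

# Assumed values: the proposal does not fix the bucket size or the percentiles.
WINDOW = timedelta(hours=3)
BUCKET = timedelta(minutes=30)
PERCENTILES = (50, 75, 90, 99)


def resolve_timezone(preferred):
    """Return the user's preferred timezone, falling back to UTC when unset."""
    return ZoneInfo(preferred) if preferred else timezone.utc


def percentile(sorted_values, p):
    """Nearest-rank percentile of a non-empty, ascending list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def wait_time_series(samples, now, preferred_tz=None):
    """Bucket (queued_at, wait_seconds) samples from the last 3 hours.

    Returns (series, mean_wait): one dict per non-empty bucket for the line
    chart, and the overall mean for the single stat (None without samples).
    `now` and every `queued_at` must be timezone-aware datetimes.
    """
    tz = resolve_timezone(preferred_tz)
    start = now - WINDOW
    buckets = {}
    recent = []
    for queued_at, wait in samples:
        if not (start <= queued_at < now):
            continue  # outside the 3-hour window
        index = (queued_at - start) // BUCKET
        buckets.setdefault(index, []).append(wait)
        recent.append(wait)

    series = []
    for index in sorted(buckets):
        values = sorted(buckets[index])
        point = {"time": (start + index * BUCKET).astimezone(tz)}
        for p in PERCENTILES:
            point[f"p{p}"] = percentile(values, p)
        series.append(point)

    return series, (mean(recent) if recent else None)
```

Each entry in `series` corresponds to one hoverable data point on the X-axis, and `mean_wait` feeds the single stat component. Empty buckets are skipped here; whether the chart should show gaps or zeros for them is a design decision this sketch does not settle.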
Resources
- 🕊 Dovetail project
- 🔍 Research issue
- 👣 Follow-up issue or epic
Tasks
- Assign this issue to the appropriate Product Manager, Product Designer, or UX Researcher.
- Add the appropriate Group (such as ~"group::source code") label to the issue. This helps identify and track actionable insights at the group level.
- Link this issue back to the original research issue in the GitLab UX Research project and the Dovetail project.
- Adjust confidentiality of this issue if applicable.