Merge requests often take a long time to complete due to long review times. With this graph, we would like to understand whether there are particular assignees / reviewers (implied by the people who comment, merge, and author) who usually drive longer times
Intended users
Further details
Proposal
As a next chart on the productivity analytics page, let's add a Sankey diagram where we see:
authors
commenters
approvers
mergers
We also want to see how many comments and follow-on commits exist, so we can make the lines different colors based on the back-and-forth, but we will leave this for another iteration.
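To make the data shape concrete, here is a minimal Python sketch of how per-MR participant records could be aggregated into the source/target/value links a Sankey chart typically consumes. The record fields and role columns are assumptions for illustration, not our actual schema:

```python
from collections import Counter

# Hypothetical input: one record per merge request, naming who filled each role.
merge_requests = [
    {"author": "alice", "commenters": ["bob"], "approver": "carol", "merger": "carol"},
    {"author": "alice", "commenters": ["bob", "dave"], "approver": "dave", "merger": "carol"},
]

def sankey_links(mrs):
    """Aggregate MRs into source -> target flows between adjacent role columns."""
    links = Counter()
    for mr in mrs:
        for commenter in mr["commenters"]:
            links[(f"author: {mr['author']}", f"commenter: {commenter}")] += 1
            links[(f"commenter: {commenter}", f"approver: {mr['approver']}")] += 1
        links[(f"approver: {mr['approver']}", f"merger: {mr['merger']}")] += 1
    # Most Sankey charting libraries consume {source, target, value} triples.
    return [{"source": s, "target": t, "value": v} for (s, t), v in links.items()]

for link in sankey_links(merge_requests):
    print(link)
```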
Permissions and Security
Documentation
Testing
What does success look like, and how can we measure that?
@matejlatin - I think this is actually not a small issue since our review process is not very explicit. To give you an example, https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/14951#note_202057199. Mayra removes herself as an assignee when she is done reviewing, and when Pavel is ready for her review, he adds her back (we cannot rely on clients using the same process, though). But maybe we can treat the author as the owner of the MR and the assignees as the reviewers (not ideal, but better).
What I think clearly showcases why an MR takes more time is also:
the number of comments (size, files touched, etc. - which we already have)
number of iterations (I don't think we can consistently represent this with the process Pavel uses above, though - any ideas?)
average pipeline time to get feedback on the MR
Would love to get some ideas from you on how we can surface the people involved - maybe the thickness of the line represents the number of MRs, while the color, ranging from green to red, represents the average number of comments between the author and assignees?
What if we visualize the top N MRs that took a long time to get merged? Each MR could be broken down into the following categories (see the sketch after this list):
Pipeline was running
Review iteration 1
Review iteration 2
Review iteration N
Idle time (not sure what goes here or whether we even need this; maybe the time between the pipeline finishing and the first assignee action? For example, I create an MR and go grab a pizza...)
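As a rough illustration of the data behind such a breakdown (the field names and the idle-time definition here are assumptions, not settled decisions), each MR could be reduced to an ordered list of labeled time segments:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    label: str      # e.g. "Pipeline", "Review iteration 1", "Idle"
    seconds: float  # duration of this phase

# Hypothetical breakdown of one MR into phases; idle time is assumed to be
# the gap between the pipeline finishing and the first assignee action.
mr_breakdown = [
    Segment("Pipeline", 1_800),
    Segment("Idle", 7_200),
    Segment("Review iteration 1", 86_400),
    Segment("Review iteration 2", 43_200),
]

total = sum(s.seconds for s in mr_breakdown)
for s in mr_breakdown:
    print(f"{s.label}: {s.seconds / total:.0%} of time to merge")
```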
(Interesting side idea: the pipeline can take a long time to run; we could have something like delayed actions (a checkbox) - if the pipeline succeeds, automatically assign the MR for review.)
How can we detect a review iteration?
Possible start events: New assignee added OR new code is pushed
Possible end events: approved OR merged OR previously added assignee removed OR new code is pushed
Additional condition:
If an assignee is removed and there is at least 1 comment, then we can consider it an iteration (so we exclude accidental assignee changes). A rough sketch of this detection logic follows.
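Here is a minimal sketch of that detection, assuming a chronological event stream per MR (the event names and stream shape are illustrative, not our actual event model):

```python
# Hypothetical event stream, ordered by time. Each event is (type, actor).
events = [
    ("assignee_added", "mayra"),
    ("comment", "mayra"),
    ("assignee_removed", "mayra"),   # ends iteration 1 (has >= 1 comment)
    ("code_pushed", "pavel"),        # starts iteration 2
    ("approved", "mayra"),           # ends iteration 2
]

START_EVENTS = {"assignee_added", "code_pushed"}
END_EVENTS = {"approved", "merged", "assignee_removed", "code_pushed"}

def count_iterations(events):
    iterations = 0
    open_iteration = False
    comments_in_iteration = 0
    for kind, _actor in events:
        if open_iteration and kind in END_EVENTS:
            # Additional condition: an assignee removal only counts if the
            # iteration saw at least one comment (excludes accidental changes).
            if kind != "assignee_removed" or comments_in_iteration >= 1:
                iterations += 1
            open_iteration = False
            comments_in_iteration = 0
        if kind == "comment":
            comments_in_iteration += 1
        if kind in START_EVENTS:
            open_iteration = True
    return iterations

print(count_iterations(events))  # -> 2
```

Note that "new code is pushed" is both a start and an end event, so a push while an iteration is open closes it and immediately opens the next one.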
Here is an example chart (super simplified). Styling and more labels could be added to provide more context (number of comments, commit size, etc.).
Breakdown of the top 3 MRs, answering "Where was the time spent?"
@matejlatin - is there any chance you would still have time this milestone to create the UX for this Pajamas component? I don't think it exists in our library.
A good point @djensen had was that we are probably not that interested in the quantity of MRs, which is what the graph mostly shows, but more in the time aspect. An example from one of our competitors is here, but it does seem like the most important aspect is just in the tooltip, i.e. how long we are waiting when different people are involved.
Another competitor approaches this more in this way, but I think it focuses too much on the person, while the focus should be on the process. Thoughts / ideas?
This Sankey chart shows you the relative contributions of your team, which can be valuable, so I'd propose we simply revise the "Problem to solve" to something like this:
Merge requests can involve any number of people as authors, commenters, reviewers and mergers. The larger the merge request, the harder it is to understand the relative contributions of the people involved. With this graph, we want to visualize who was most involved in the merge request process, and in what roles.
By focusing on the "contribution" aspect here, we can focus on the "time" aspect in a separate Issue.
I took some time to explore this and started off from this sentence in the issue description:
With this graph, we would like to understand whether there are particular assignees / reviewers (implied by the people who comment, merge, and author) who usually drive longer times
We could certainly show this information with a sankey diagram, something like the following:
But I'm not sure that's actionable enough. This mostly allows users to review the merge request process from the top level. How can they dig in, identify the problematic MRs, and review the process to find out if something's wrong? How can we show the time spent in the diagram? How do we know how long an MR was waiting to get reviewed, how long it took to review, etc.?
So I also explored something that could either be an alternative or something additional to the chart.
A table with MRs and info on how long each took to merge, what the idle time was, and what the review time was.
The sorting options could be time to merge, idle time, review time...
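As a sketch of how those three columns could be derived, assuming we can get the relevant timestamps per MR (the field names and the idle-time definition below are placeholders, not settled decisions):

```python
from datetime import datetime

# Hypothetical timestamps for one MR.
mr = {
    "created_at": datetime(2019, 8, 1, 9, 0),
    "pipeline_finished_at": datetime(2019, 8, 1, 9, 30),
    "first_review_action_at": datetime(2019, 8, 1, 14, 0),
    "merged_at": datetime(2019, 8, 2, 11, 0),
}

time_to_merge = mr["merged_at"] - mr["created_at"]
# Idle time assumed: gap between the pipeline finishing and the first review action.
idle_time = mr["first_review_action_at"] - mr["pipeline_finished_at"]
review_time = mr["merged_at"] - mr["first_review_action_at"]

print(time_to_merge, idle_time, review_time)
# The table could then be sorted on any of these, e.g.:
# sorted(mrs, key=lambda m: m["merged_at"] - m["created_at"], reverse=True)
```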
We could also show all the details of the process, something like:
Btw, it's really hard to draw these Sankey diagrams. Once we establish what information we want to show with them, can we start FE work based on lo- or mid-fi mockups? I can then coordinate with the FE dev to make it look right. Otherwise, making these pixel-perfect will take time.
I actually like the Sankey diagram, @djensen, because it can show whether the review process is sound. I think ultimately, if everything is right, there shouldn't be anything further that a director wants to look at. But if you see that one person always sends all their requests to one person even though there should be a reviewer roulette, that might highlight that something is broken in the process. If one person always takes a very long time to review everyone's requests, we want to highlight that too - it might be the case that only that one person knows that part of the code, which is bad. I think we can actually make the color of each relationship be the time taken to merge - so if there are a lot of requests from someone but all of their relationships are green, that's fine?
@matejlatin, I like the above, but it actually doesn't aggregate the data for us, which is what we are after with the Sankey.
I think we can actually make the color of each relationship be the time taken to merge
I like this idea because it speaks to things "that usually drive longer times".
One proposal: colorize the bands based on their subset of the time-to-merge rather than the total time-to-merge. Otherwise I worry the chart will say things it doesn't mean to say. For example, consider a team with a reviewer who always moves fast but an approver who always moves slowly, so the MRs are slow. The band between the author and reviewer should obviously be green, but it would be red if we measured everything by time-to-merge. To avoid false impressions, we can measure each band by "time from X to Y" for that particular relationship. The downside is greater technical complexity, but I think it's necessary.
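To illustrate, a small sketch that colors each band by the average duration of its own handoff rather than by total time-to-merge; the thresholds, sample data, and band labels are made up:

```python
from statistics import mean

# Hypothetical per-band observations: hours spent in each handoff, per MR.
band_durations = {
    ("author: alice", "reviewer: bob"): [2, 3, 1],       # reviewer is fast
    ("reviewer: bob", "approver: carol"): [40, 55, 60],  # approver is slow
}

def band_color(hours, green_max=8, yellow_max=24):
    # Assumed thresholds; a real version might use percentiles per project.
    if hours <= green_max:
        return "green"
    if hours <= yellow_max:
        return "yellow"
    return "red"

for (src, dst), samples in band_durations.items():
    avg = mean(samples)
    print(f"{src} -> {dst}: avg {avg:.1f}h -> {band_color(avg)}")
```

With per-band measurement, the fast reviewer's band stays green even though every MR's total time-to-merge is dominated by the slow approver.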
I think colorization could be a second iteration on this feature that really unlocks its potential.
Good point. On the frontend, one of the Sankey implementations already supports this, so we could embed that one.
One problem is that reviewers can only be assignees, as we don't have the concept of a reviewer. So I think we should skip this one and just have Author, Approver, and Merger (most of the time the merger is actually the approver). This, however, would not solve what you are asking for. If a reviewer is slow, we should expect to see it as a pattern across other folks, though, no?
@valexieva, can't we assume that the assignee who's added to the MR besides the author is the reviewer? Btw, how do we know who the approver is? Once someone approves the MR?