Combined mean and median in some graphs
Some of the graphs we tried to fix in #545 (closed) are meant to show statistical values that we have not been able to obtain. The affected graphs are:
- Time to close (Reviews closed), in the Performance section.
- Mean and median duration (days) of all closed reviews, in the CHAOSS section.
These graphs should show the mean and median time to close of the closed and merged reviews. For GitLab repositories it has not been possible to obtain them, because GrimoireLab distinguishes between closed and merged reviews (for GitLab): closed (rejected) reviews have the `closed_at` field populated and the `merged_at` field empty, whereas merged reviews have an empty `closed_at` field and a populated `merged_at` field. Because of this:
- We cannot perform a single date histogram aggregation over these two date fields, so the statistics cannot be calculated on a single data set.
- If we generate two different histograms, one for the closed reviews and another for the merged ones, we can calculate the combined mean, as explained here. The combined median, however, seems impossible to obtain from the two datasets.
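To illustrate the second point, here is a minimal Python sketch. The durations are made-up numbers and the function name `combined_mean` is only illustrative. It shows that the combined mean can be recovered from each group's mean and count, while two pairs of groups with identical medians and counts can still have different pooled medians.

```python
from statistics import mean, median


def combined_mean(mean_a, n_a, mean_b, n_b):
    """Mean of the pooled data, computed only from group means and sizes."""
    return (mean_a * n_a + mean_b * n_b) / (n_a + n_b)


# Made-up "time to close" values in days, for illustration only.
closed_days = [1.0, 5.0, 9.0]
merged_days = [2.0, 6.0, 10.0]

# The combined mean matches the mean of the pooled data.
from_summaries = combined_mean(
    mean(closed_days), len(closed_days), mean(merged_days), len(merged_days)
)
from_pooled = mean(closed_days + merged_days)

# A second merged group with the same size and the same median (6.0)...
merged_days_alt = [6.0, 6.0, 6.0]

# ...yields a different pooled median, so the group medians and counts
# alone do not determine the combined median.
pooled_median = median(closed_days + merged_days)
pooled_median_alt = median(closed_days + merged_days_alt)

print(from_summaries, from_pooled)
print(pooled_median, pooled_median_alt)
```

The pooled median always lies between the two group medians, but its exact value depends on the individual values, which a per-group summary does not retain.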
For this problem we propose the following solutions:
- Modify GrimoireLab for GitLab: create a new field that unifies the `closed_at` and `merged_at` fields. This new field would be common to closed and merged reviews.
- Modify GrimoireLab for GitLab: make the `closed_at` field take the same value as the `merged_at` field when a merge is done.
- Remove the median from these two graphs. This would be our last option if the previous ones fail.
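As a rough sketch of the idea behind the first proposal, and not GrimoireLab's actual enrichment code, a unified date could be derived by taking `merged_at` when it is set and `closed_at` otherwise. The `created_at` field, the sample reviews and the helper names below are assumptions made for illustration.

```python
from datetime import datetime
from statistics import mean, median


def unified_close_date(review):
    """Return merged_at if set, otherwise closed_at (None if still open)."""
    return review.get("merged_at") or review.get("closed_at")


def days_to_close(review):
    """Days between creation and the unified close date, or None if open."""
    end = unified_close_date(review)
    if end is None:
        return None
    start = datetime.fromisoformat(review["created_at"])
    return (datetime.fromisoformat(end) - start).total_seconds() / 86400


# Hypothetical reviews: one rejected, two merged, one still open.
reviews = [
    {"created_at": "2021-01-01T00:00:00", "closed_at": "2021-01-03T00:00:00", "merged_at": None},
    {"created_at": "2021-01-01T00:00:00", "closed_at": None, "merged_at": "2021-01-05T00:00:00"},
    {"created_at": "2021-01-01T00:00:00", "closed_at": None, "merged_at": "2021-01-11T00:00:00"},
    {"created_at": "2021-01-02T00:00:00", "closed_at": None, "merged_at": None},
]

# With a single date per review, mean and median come from one data set.
durations = [d for d in map(days_to_close, reviews) if d is not None]
print(mean(durations), median(durations))
```

With one date field shared by closed and merged reviews, a single aggregation would cover both groups, which is what the two graphs need.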
Edited by Sergio Merino Hernández