This is the latest ask from Sid in the Engineering/PM meeting: the next iteration of the previous "MRs per Author per Month" metric.
This issue is to prep for the metric and all the switching costs associated with it. It will likely replace the legacy metric once it's ready.
Requirements
The metric should focus on the MRs that are part of the product that we release to customers.
Include: Development, quality, UX, PM (excluding VP and directors)
Exclude: Support, Security, Infra (seen as external contributors in this case).
Include community contribution MRs; there's no need to capture the community contribution authors, since getting that in is still work on our side.
They count toward MRs, but not toward the headcount.
Remove the unique-authors criterion: we count whoever we are paying, so even if a person is on vacation or sick, they still count in the denominator.
I think we should do pre-planning here and create issues in the data team's project directly. I think we are almost ready to go except for ironing out the data from BambooHR?
I think we still need to do an audit of the list of repositories. Projects that are not considered customer-facing should be removed.
@meks The chart is not finalized, my previous message was more meant as a sneak peek because I managed to use the data from BambooHR instead of the unique authors. I'll update the description with the current status of the chart to make that clear.
The chart looks a bit funky; I'm not sure why we're seeing 0.0x MRs prior to February 2018.
Note that it's not 0.0, since there are very few MRs for these dates. For instance, for February 2018 there are 2 MRs merged (same for January 2018)! It seems we have very little data prior to February 2018. We'll have to ask the data team.
Consider non-customer-facing work in triage-ops, release, QA, etc.
@meks This isn't clear: the description states "The metric should be cost based and focus on the MRs that are customer facing," but here we're saying that we should also consider some non-customer-facing projects? Is that correct?
Clarified: we are speaking of gitlab-triage and gitlab-qa, which are projects used by our customers. We should only consider customer-facing/customer-used projects.
@meks @clefelhocz1 I've added a few columns to help identify projects that are user-facing more easily: Name, Visibility, Archived?, Created at, Last activity at, Forks count, Stars count. I've also added sorting and changed the checkbox to a YES/NO select with no value by default, so that we know whether an explicit decision was made for a project, compared to checkboxes, which default to "NO" when they're unchecked.
Lastly, I've marked all private projects as non-customer facing, I think that makes sense? I've also marked some projects as "YES" and "NO" based on the criteria defined in the requirements:
gitlab-org/charts/auto-deploy-app is customer facing - it's the chart all Auto DevOps pipelines use by default to deploy. I have marked it so on the spreadsheet
@rymai I removed gitlab-org/gitlab-triage. This is a bit of a special case since it's not officially in the product, even though it's free and some customers are using it in some form. We would want to incentivize building it into the product. We should position ourselves such that the Dogfooding/Rebuild-in-GitLab version is accounted for instead.
@meks It appears the project IDs don't correspond to the projects' paths in the spreadsheet anymore (since August 22nd). I probably messed up the sorting. I'll try to fix that without losing the filtering that was already done.
@mendeni I will take care of adding the project, thanks!
@meks in Secure we are maintaining a lot of tests/QA projects to validate the integration of the 3rd party tools we leverage.
I personally think these projects should be included in the metrics (today they are part of our throughput) as the time spent on maintaining them is strongly related to the associated features.
Do we agree on including them even if they are not directly contributing to customer-facing services?
I think this is a reasonable ask since the tests for CE/EE are in the same repo as the code. Would it be ok if I schedule a coffee chat to go through the list from Secure a bit more?
@meks - we generated a list of projects that were customer facing a little under a month ago, however I'm not seeing a large chunk of those we identified within that spreadsheet. Is there a reason for their omission?
I might be missing something here, but I am seeing most of those projects in the "Please use this sheet!" tab. I pulled the projects out of your document and built a "Security Projects" tab, which lists all of those projects and their row on the first sheet in column D.
Thanks @kwiebers. I'm guilty of making an assumption, namely that the sort order on that spreadsheet was by project path, which is obviously not the case. I'll review more thoroughly later today.
I think that was a valid assumption to make. I was guilty of that myself initially. I'm not completely sure what the sort order is.
Let me know if there's anything I can do to help work through the security projects. If it would be helpful, we can try to identify the owning group/subgroup for all projects so you can filter down to the security-products repos.
The sort was "Customer facing / Part of the product ?", but you can sort by any other column!
@twoodham I've updated the list in the spreadsheet based on your list, and I also marked gitlab-org/security-products/container-scanning as customer facing since it seems to be similar to other *-scanning projects?
Thanks @gonzoyumo - I was doing a final review of the projects and wanted to confirm the inclusion decision on a few specific projects:
gitlab-org/security-products/analyzers/gemnasium-fork - marked as No and seems to be an older version of gitlab-org/security-products/analyzers/gemnasium
gitlab-org/security-products/tests/damn-vulnerable-spring-boot-app - marked as No and seems to be a fork of https://gitlab.com/kiview/damn-vulnerable-spring-boot-app for testing purposes.
@meks It's not obvious what question this new chart is supposed to answer.
Could we first define what that question is, why we need such a chart, who will be using it, etc.? The data team has a great issue template for requesting a new visualization/dashboard (see below); we could try to answer these questions first?
Measure release post items as well, as a proxy for measuring value delivered to customers.
For reference, following is the Data Team issue template for requesting a new visualization/dashboard:
<!-- This issue is for visualization related issues within our BI tool. -->

#### What is the business question you are trying to answer?
Example: Is there a relationship between the day of the week that deals close and the ability of the account manager to upsell them in the first month?

##### What is the impact of this question and how will it help the company?

#### Please link to where this (or these) performance indicator/s are defined in the handbook.
Everything needs to be defined in the handbook.

#### Who will be using this data?
Example: SDRs need this to better understand X, or VP of Product needs this to do Y.

#### What time frames are crucial here?
Example: I would like to look at performance by month, but at the trends over the last year.

#### Will this deliver business value within 90 days?
If not, consider why you want this data.

#### What is the visualization you are trying to create?
Include any links or screenshots, if appropriate. As a rule of thumb, the analytics team uses 12 visualization types. They are:
1. Simple Text
2. Table
3. Heat map (Table with Conditional Highlighting)
4. Scatterplot
5. Line graph
6. Slope graph
7. Vertical bar chart
8. Stacked vertical bar chart
9. Horizontal bar chart
10. Stacked horizontal bar chart
11. Waterfall chart
12. Square area chart

#### What is the source of data behind the visualization?
SFDC? ZenDesk? There may be more than one.

#### What interactions/drill downs are required?
Example: I'd like to be able to dig into the specific opportunity details (owner, account owner, IACV). I'd also like to be able to filter by region.

#### Any caveats or details that may be helpful?
We discussed this during the team meeting and the 1:1, and I wanted to close the loop here.
> It's not obvious what question this new chart is supposed to answer. Could we first define what that question is, why we need such a chart, who will be using it, etc.?
We emphasize the MRs in the projects that we ship to customers, hence using the terms "ship" and "part of the product".
We should be incentivizing dogfooding and building into the product first, hence being more restrictive about the number of projects included.
We want to remove the calculation's fluctuation with staffing. The current method looks at unique authors. The new method counts all the MRs contributed by everyone, but with a stable denominator: the UX / PM / Development / Quality headcount.
All in all, this is an improvement to the existing accounting method, with more focus and a more stable denominator.
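To make the change in accounting concrete, here's a minimal sketch with hypothetical numbers (none of these figures are real):

```python
# Hypothetical month of data -- illustrative numbers only.
merged_mrs = 650        # all merged MRs in product projects, incl. community MRs
unique_authors = 85     # authors with >= 1 merged MR that month (legacy method)
paid_headcount = 100    # Development + Quality + UX + PM headcount (new method)

# Legacy: the denominator shrinks when people are on vacation or sick.
legacy_avg = merged_mrs / unique_authors

# New: stable denominator based on who we are paying.
new_avg = merged_mrs / paid_headcount

print(round(legacy_avg, 2), round(new_avg, 2))  # 7.65 6.5
```

The point is only that the new denominator doesn't move with vacations or the number of active authors, so month-to-month fluctuation comes from the MR count alone.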
@meks @clefelhocz1 A small suggestion regarding the description of this chart: I suggest "Average merged MRs in user-facing projects per person building the product, per month".
"user-facing" because "customer-facing" could alienate the Core users.
"person building the product" because that describes well the set of people we're selecting.
Maybe we can even find a better/shorter name, like "Product-builders productivity index" or something like that. The benefit would be that you would need to look at the explanation attached to the chart to understand what data are used in it, instead of assuming things like "this chart includes all the people from the Engineering function" or "this chart includes all the gitlab-org projects", etc.
For the 2nd chart ... is this for the gitlab-org group as a whole, not filtered by the project list?
@meks Checking the queries behind them, the only difference is the condition for merged_mrs (affecting "Average MRs Per Person"). For the first one it's `is_included_in_engineering_metrics = True`, and for the second one it's `namespace_id IN (9970)`; 9970 is indeed gitlab-org.
On the other hand, people_expected_to_contribute is the same in both charts. This makes me wonder if that makes sense? They're both based on Department, but I would expect if we're counting people from the whole group, we should also change the expectation to all the members within the group, not just a subset of the group.
In the current queries, the second chart will always be lower than the first one. It seems to me that comparing the two charts only tells us the ratio of the two MR counts (i.e. MRs from the included projects vs. MRs in gitlab-org).
First chart: a / c
Second chart: b / c
(where a and b are the two merged-MR counts and c is people_expected_to_contribute)
So comparing them is telling us the difference between a and b.
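In other words, because the denominator c is shared, any gap between the two charts comes entirely from the numerators. A quick sketch (hypothetical counts, not real data):

```python
# Hypothetical counts -- not real data.
a = 400  # merged MRs with is_included_in_engineering_metrics = True (chart 1)
b = 300  # merged MRs in the gitlab-org namespace (chart 2)
c = 100  # people_expected_to_contribute, identical in both queries

chart_1 = a / c  # 4.0
chart_2 = b / c  # 3.0

# The gap between the charts is exactly (a - b) / c:
assert chart_1 - chart_2 == (a - b) / c
```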
@rymai What do you think? Please correct me where I am wrong.
> On the other hand, people_expected_to_contribute is the same in both charts. This makes me wonder if that makes sense? They're both based on Department, but I would expect if we're counting people from the whole group, we should also change the expectation to all the members within the group, not just a subset of the group.
@godfat The second chart is only for exploration purposes; it doesn't comply with the requirements from this issue. It's just an experiment because I was curious whether filtering specific projects, compared to taking all gitlab-org projects, would show different numbers (and it does, but not in the way I expected). The argument about the "people expected to contribute" not reflecting the members of the group can also be used for the first chart, where we only select a subset of projects (I would even argue the argument is stronger in that case, because the number of filtered projects will always be smaller than the number of projects in the gitlab-org group).
> In the current queries, the second chart will always be lower than the first one.
Yeah, and that's unexpected to me: the number of filtered projects should be smaller than the number of projects in the gitlab-org group, so the number of MRs should also be smaller, and given that the number of "people expected to contribute" is the same in both charts, the first one should show lower data, but it doesn't. So I think the second chart somehow doesn't return all of gitlab-org's projects. Indeed, I've now fixed the query to use ultimate_parent_id instead of namespace_id so that nested projects are also included! The numbers are much more in line now.
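To illustrate the namespace_id vs. ultimate_parent_id distinction: a project nested in a subgroup has the subgroup as its direct namespace, so filtering on namespace_id alone misses it. A sketch in Python (the subgroup ID and rows here are made up; the real filter lives in the chart's SQL):

```python
# Illustrative rows; 9970 is gitlab-org per the thread, 12345 is a made-up subgroup ID.
projects = [
    {"path": "gitlab-org/gitlab",
     "namespace_id": 9970, "ultimate_parent_id": 9970},
    {"path": "gitlab-org/security-products/container-scanning",
     "namespace_id": 12345, "ultimate_parent_id": 9970},  # nested in a subgroup
]

# Filtering on the direct namespace misses nested projects...
by_namespace = [p["path"] for p in projects if p["namespace_id"] == 9970]
# ...while the ultimate parent captures the whole group hierarchy.
by_ultimate_parent = [p["path"] for p in projects
                      if p["ultimate_parent_id"] == 9970]

assert by_namespace == ["gitlab-org/gitlab"]
assert len(by_ultimate_parent) == 2
```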
I am going through the sheet again and triple checking the list. Also ran this by Stan today on our 1:1
Cool, once we're ready, I think we should add a new column to the DBT gitlab_dotcom_merge_requests_xf model, similar to the current IS_INCLUDED_IN_ENGINEERING_METRICS column (which we should keep for back-compat purposes, obviously). I think IS_CUSTOMER_FACING could be a good name for this column.
Can we clarify what "Average MRs per Person" vs. "Legacy Average MRs per Person" means?
There are notes at the right of the chart that I hoped would explain that. :)
Does "legacy" in this sense mean unique authors, for comparison? If so, let's call it as such: "Legacy per author".
Yes, I only included this data in the chart for comparison (remember we are still working in the Sandbox dashboard here), but we shouldn't include it in the final chart. I'll just remove it from the chart because this is confusing otherwise.
The new population should be called "per Product team member", not "per person", so it's crisp from the get-go as well.
Define and publish guidance for requesting changes to project inclusion.
This covers both inclusion and exclusion, including cases where projects have been merged or refactored so that the project list is simplified. We want to encourage Dogfooding and building as part of the product, so the first option should be to add to the existing projects which are already part of the product.
Add an explicit CUSTOMER_FACING indicator.
Do we feel that CUSTOMER_FACING is clear? How about PART_OF_PRODUCT, since the name of this chart is Product MRs...
The rest sounds good to me. Thanks for driving the effort!
@sethgitlab - Thanks for the feedback! We do want to be careful about having multiple sources of similar information and need to look towards consolidating where we can.
The intent with this spreadsheet was not to pull together a collection of all projects, but to identify which projects are included in the GitLab product, for use in https://app.periscopedata.com/app/gitlab/496118/Engineering-Productivity-Sandbox. As @meks mentioned above, we want to try to influence behavior toward building in the product and Dogfooding over creating a new project.
> We want to encourage Dogfooding and building as part of the product so the first option should be to add to the existing projects which are already part of the product.
I meant that there are some projects listed which are set to "No" in Column K, for example gitlab-triage and projects in the gitlab-com namespace. Because of this, I'm not sure how we'll be able to populate the projects.yml file from this spreadsheet (or the future file and process for managing the projects that are part of the product).
Marked the checklist item "Factorize the projects list in a macro and expose it as an is_part_of_product column in the gitlab_dotcom_issues_xf and gitlab_dotcom_merge_requests_xf models" (https://gitlab.com/gitlab-data/analytics/merge_requests/1739) as completed.
> A target that everyone agrees the team is capable of delivering on
We should probably clarify who's "everyone", and who's "the team" here? :)
> An explanation why the target makes sense.
I would say 8 for now since the September number was 6.52, and the August number was 6.43. The last time the number was above 8 was in April 2019 with 8.89.
@kwiebers Could we please close this issue and open a new one (in https://gitlab.com/gitlab-org/quality/team-tasks/issues)? Finding the correct target could take a bit of time, and I think the implementation of the "New accounting method per customer facing work team member per month", which was the original goal of this issue, has been done now. Let's not keep adding action items to this issue. :)