Problem Validation: Report Source Lines of Code (SLoC) per Developer or per Repo/Group
Recommendation as a result of the research
- Do NOT implement Lines of Code Contributed per Developer. This has a potential negative impact.
  - For details refer to the opportunity canvas or details section below.
- Do follow-up research on analytics for Lines of Code per directory, repo, or group.
  - While there are strong hints in this research that this could be of value to users, it seems to be mostly a nice-to-have feature.
  - For details refer to the opportunity canvas or details section below.
  - The tentative effort estimate for implementing SLoC count on the repo level is roughly 1.5 dev months. Offering this on the group level afterwards is probably also not expensive. The effort to implement this on the directory level needs to be investigated.
Links
The full Opportunity Canvas "Contribution Metrics based on Lines of Code (SLoC) Contributed per Developer". Internal only.
The full Opportunity Canvas "Expose Lines of Code (SLoC) per Language in Repository Analytics to improve understanding of repo". Internal only.
The full research on Source Lines of Code Q3F23. Internal only.
Details: What we learned researching this space
Assumption |
What we thought |
What we learned |
What we think now |
{What we assumed} |
{What we thought this meant for the solution to this problem} |
{What we learned when we tested this assumption} |
{How these learnings relate to our approach/solution} |
We assumed that the following three issues are about related or partly identical needs and are highly important due to their popularity:
- Report number of lines per language in repository charts: 568 upvotes; 18 individual commenters + 15 customer asks; also third highest priority score for user requested issues in SCM.
- New contributors graph (lines of code): 161 upvotes; 25 individual commenters; no customer asks.
- Visualize Language Trends over Time: 11 upvotes; 5 customer asks. |
Due to the high popularity, we thought the community agrees that GitLab should implement ‘LoC’ and that it would be clear what to implement. The popularity (both among users and among paying customers) was the reason to pick this topic up. |
Reviewing all comments and customer asks on all issues I learned:
- There are many dimensions for which SLoC are requested: per instance, per group, per project, per user.
- Paying customers care more about the group level, i.e. counting SLoC across all their repos.
- In some cases, it is unclear whether there is a need for reporting SLoC per language or just SLoC.
- There is little detail on why this is helpful or how often such metrics would be consulted.
Additional insights from other sources:
- Some customers do NOT want to see data about individuals, as their works council would object to that, and they would turn off such a feature. |
It seems that there are two fundamental use cases:
- a) report LoC per user (i.e. per contributor to a project)
- b) report LoC per ‘location’ (i.e. per repo, per group, or per instance)
It is not very clear how important the topic is to the users/customers. Conclusion: we should run a survey to assess importance.
It is not clear what they are trying to achieve. Conclusion: run interviews to understand the why. |
The ask is highly popular AND GitHub has offered this for a long time. It should therefore be a good thing to implement ‘LoC’. |
I thought GH addresses all asks vocalized in our issues. |
Assessing GitHub’s offering I found:
- GH does NOT show LoC per repo. (GL also doesn’t.)
- GH shows languages per repo: in percent in the UI (just like GL) and in bytes in the API (unlike GL, whose API also provides percent).
- GH also offers the above on the org level, i.e. across all repos. (GL doesn’t.)
- GH shows LoC (added/removed) per contributor (but NOT per language).
- A popular tool lets users post their personal stats on their profile page. |
GitHub also does not offer everything that users seem to be asking for in our issues. A reason may be that when they implemented this there was no handy library to report SLoC per language per repo/org. GH seems to be considering extensive visualizations of source code. The feedback on Twitter is positive. If we were to offer LoC per user, this would be a me-too feature. The comments on the issue suggest that the need is largely driven by the fact that GitHub has had this for a long time and developers consider it something a source code management platform would normally have. |
We should survey a broad spectrum of users, as the issues suggested that commenters came from different backgrounds but still all seemed to care about the topic ‘LoC’. We asked a total of 118 users: 64 software developers; 15 dev team leads; 10 DevOps; etc. |
We thought ‘LoC’ would be highly popular, given that the issues are so popular. |
- Compared to users of other tools, users of GitLab seem to care less about seeing the number of lines contributed.
- GitLab users also care less about the breakdown of languages within a project compared to their peers.
- In most cases, those newer to their role thought the features were more important. |
The survey suggests that ‘LoC’ is of medium importance (or medium unimportance) for source code management. We will need to run interviews to understand more. |
The survey suggested that there do not seem to be significant differences in the views of managers vs. individual contributors. Therefore, we interviewed software developers and also asked them about their manager's perspective. We ran 8 interviews: 7 software developers (1 of whom is also a DevOps engineer / evangelist) and 1 project manager (a former software developer). |
We would learn in the interviews what the value of ‘LoC’ would be. |
In the interviews we learned:
- SLoC per user is NOT seen as a good measure for contribution. Some have had negative experiences with metrics to assess performance (incl. SLoC).
- SLoC _per directory_ would be valuable. Interviewees see this mostly as nice to have though.
- Comparing growth of SLoC in different directories that serve different purposes (application vs. test) is more interesting than comparing growth in SLoC per language.
- Potentially interesting sub-feature: using SLoC to filter commit lists or repo lists by language. |
One use case seems to be around SLoC _per user_:
_I want an easy way to know who has contributed how many LoC to a project so that I can compare their contributions_
- _a) to feel proud of my own work or_
- _b) to track my team members' work_
We should NOT implement this use case: SLoC _per user_ is NOT a good metric. It might be a nice graph to look at for some, but it will have negative consequences for a few, as their performance would be evaluated based on their contributed SLoC.
- Do literature research to find further evidence of this outcome.
- Consider a blog post that reflects our decision. A good time would be when we potentially release something else related to SLoC (see below).
-> Take this result and create an **opportunity canvas on SLoC _per user_** as a contribution metric with the recommendation NOT to do it.
---
SLoC per directory or _per language per repo_ seems to be valuable to help users get a quick understanding of a repo or the health of different modules. The use case seems to be:
_For a given repo or module (i.e. directory) that I see for the first time, I want to know lines of code per language,_
- _a) so that I can get a quick sense if it will be easy for me to contribute to it given my skill set_
- _b) so that I can understand if the module is bloated_
This addresses the needs of individual contributors, so it would be part of the free tier.
-> Understand the effort to implement this.
---
SLoC _per group_ was not assessed in this series of interviews with developers, as this is likely more relevant for administrators, CIOs, directors, etc.
-> Reach out to 2 or 3 customers that requested this in one of the original issues to understand what they are trying to achieve and to understand how this could be tiered. |
The literature research showed: Literature supports the interviewees’ perspective that SLoC is NOT a good metric for measuring contribution. |
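To make the "SLoC per directory, per language" use case from the table above more concrete, here is a minimal sketch of what such a count could look like. It is an illustration only, not GitLab's implementation: the extension-to-language map, the function names, and the definition of SLoC as "non-blank lines" are all assumptions made for this example.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical extension-to-language map (assumption for this sketch);
# a real tool would rely on a much fuller language detection library.
LANGUAGE_BY_EXTENSION = {
    ".py": "Python",
    ".rb": "Ruby",
    ".go": "Go",
    ".js": "JavaScript",
}


def count_sloc(text: str) -> int:
    """Count non-blank lines. Simplification: comment lines count as code."""
    return sum(1 for line in text.splitlines() if line.strip())


def sloc_per_directory(root: Path) -> dict[str, dict[str, int]]:
    """Return {top-level directory: {language: SLoC}} for files under root."""
    totals: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for path in sorted(root.rglob("*")):
        language = LANGUAGE_BY_EXTENSION.get(path.suffix)
        if language is None or not path.is_file():
            continue  # skip directories and files with unknown extensions
        relative = path.relative_to(root)
        # Files that sit directly in the root are grouped under "."
        directory = relative.parts[0] if len(relative.parts) > 1 else "."
        content = path.read_text(encoding="utf-8", errors="ignore")
        totals[directory][language] += count_sloc(content)
    return {directory: dict(langs) for directory, langs in totals.items()}
```

Called as `sloc_per_directory(Path("path/to/checkout"))`, this would let someone compare, for example, an application directory against a test directory, which is the comparison interviewees found more interesting than a per-language breakdown alone. Summing the per-directory results would give a repo-level figure.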
Definition of Done
- The problem is well understood by the PM to have an understanding summarized in a RICE score (see Opportunity Canvas "Contribution Metrics based on Lines of Code (SLoC) Contributed per Developer").
- The problem is well understood by the PM to decide if they want to move forward with this idea or drop it.
  - Recommendation is not to go ahead with Lines of Code per developer
  - Recommendation is to do follow-up research on Lines of Code per directory, per repo, and per group
- N/A: The problem is well described and detailed with necessary requirements for product design to understand the problem
- N/A: The problem is well described and detailed with necessary requirements for engineering to understand the problem
Research Issue
This page may contain information related to upcoming products, features and functionality. It is important to note that the information presented is for informational purposes only, so please do not rely on the information for purchasing or planning purposes. Just like with all projects, the items mentioned on the page are subject to change or delay, and the development, release, and timing of any products, features, or functionality remain at the sole discretion of GitLab Inc.