Determine how and where to count lines of code per language efficiently
Background
This spike supports a business decision regarding Report number of lines per language in reposito... (&8589). We are still gathering data to understand whether we should do this at all. This spike is one input: we need to understand how complex it would be to enable the following use cases and whether there are critical operational costs that we should be aware of. Please note that there is also a lot of context in https://gitlab.com/gitlab-org/gitlab/-/issues/371465+, which is likely helpful when working on this spike.
We will have the following engineering efforts:
- UX design
- Frontend work
- Backend work
@tlinz: I dare to assume that we can simply estimate the effort for UX design and frontend work with a SWAG. The backend work is likely more complex, which is why we are doing this spike. For instance, we need to understand whether to do this in Gitaly or in Rails and what each choice implies.
Use cases
Note: the following use cases might still change somewhat as we mature the Problem Validation: Report Source Lines of Code... (#371038 - closed)
Sasha (Software Developer), Priyanka (Platform Engineer), and Simone (Software Engineer in Test) have the following problems:
- I want to know how much I and others have contributed to a project so that I can compare my contribution to that of others and feel good about it.
- I want to see how my and others' contributions to a project have changed over time so that I can compare my contribution to that of others and feel good about my persistence.
  - lines added and removed
  - chart over time
- I want to see to which projects I have contributed and how much in which language so that I can easily prove to potential employers that I have experience in writing code in certain languages.
- I want to know the total lines of code and change of lines of code over time (and language) in a group, project, directory (or file), so that I can get a sense of the health of a module or I can get a sense of a repo as such, e.g. by
  - comparing the size of the directory containing the tests with the directory containing the application code; or
  - seeing if a module (i.e. typically a directory) becomes too large so that it would make sense to break it up; or
  - seeing the language distribution of a project or module, which gives me a quick understanding of whether I can contribute given my language skills.
Sam (Security Analyst), Delaney (Development Team Lead), Cameron (Compliance Manager), and Sidney (Systems Administrator) have these problems:
- I need to know which languages are used to what extent in a project or group so that I can assess risks, make tooling decisions, make hiring decisions, and confirm compliance with company rules.
Non-functional requirements: these metrics are not needed in real time. A daily update is probably sufficient.
Scope
This is a research spike with backend-weight2
Goal
Determine if we can accurately and performantly count lines of code in either:
- Gitaly
- Rails
Please create follow-up issues to develop the line-counting function and, if possible, apply a weight to each.
Please consider the implications for operational cost when designing the solution. At least some of the functions are likely going to be free.
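To make the counting question more concrete, here is a minimal sketch of what a per-language line count could look like. It is only an illustration in Python, not a proposal for the Gitaly or Rails implementation. The names `EXTENSION_TO_LANGUAGE`, `count_lines`, and `lines_per_language` are hypothetical, the tiny extension map stands in for real language detection, and the input is assumed to be an in-memory mapping of path to file content rather than a real repository.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Mapping

# Hypothetical, deliberately tiny extension map (an assumption for this sketch);
# real language detection needs far more than file extensions.
EXTENSION_TO_LANGUAGE = {
    ".rb": "Ruby",
    ".go": "Go",
    ".py": "Python",
    ".js": "JavaScript",
}


def count_lines(blob: bytes) -> int:
    """Count physical lines, including a last line without a trailing newline."""
    if not blob:
        return 0
    return blob.count(b"\n") + (0 if blob.endswith(b"\n") else 1)


def lines_per_language(files: Mapping[str, bytes], prefix: str = "") -> Counter:
    """Sum line counts per language for all paths that start with `prefix`."""
    totals = Counter()
    for path, blob in files.items():
        if not path.startswith(prefix):
            continue  # outside the requested directory
        language = EXTENSION_TO_LANGUAGE.get(PurePosixPath(path).suffix.lower())
        if language is None or b"\0" in blob:
            continue  # skip unknown extensions and content that looks binary
        totals[language] += count_lines(blob)
    return totals


# Made-up example data, not taken from any real project.
example = {
    "app/models/user.rb": b"class User\nend\n",
    "spec/models/user_spec.rb": b"describe User do\nend",
    "cmd/main.go": b"package main\n",
    "logo.png": b"\x89PNG\0",
}

whole_repo = lines_per_language(example)
tests_only = lines_per_language(example, prefix="spec/")
print(dict(whole_repo))
print(dict(tests_only))
```

The `prefix` argument hints at the directory-level use case (for example, comparing a test directory with application code). This sketch counts physical lines only; whether blank lines and comments should count, and how to avoid rereading every blob on each daily update, are open questions for the spike.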