Reduce code complexity in the Project model
In https://gitlab.com/gitlab-org/release/framework/issues/28 I dug up quite a bit of data about code quality, changes per files, etc. Unfortunately, while we ended up with quite a bit of data it was not clear how to proceed. Part of this is due to the big scope of the issue, instead of focusing on specific files.
With this issue I want to take a slightly different approach: reducing the code complexity of just the Project model. This model is one of our most important models, and in almost every statistic I found it scored the worst. In https://gitlab.com/gitlab-org/release/framework/issues/28#note_120498827 I also gathered a rough estimate of the number of bugs per file, in which the Project model scored the highest (= most bugs).
One statistic we could use is cognitive complexity. Cognitive complexity is a way of measuring how difficult it is to understand code. The basic idea is that the more breaks there are in the natural flow (from top to bottom), the more difficult the code is. An example:
def after_create_default_branch
return unless default_branch
# Ensure HEAD points to the default branch in case it is not master
change_head(default_branch)
if Gitlab::CurrentSettings.default_branch_protection != Gitlab::Access::PROTECTION_NONE && !ProtectedBranch.protected?(self, default_branch)
params = {
name: default_branch,
push_access_levels_attributes: [{
access_level: Gitlab::CurrentSettings.default_branch_protection == Gitlab::Access::PROTECTION_DEV_CAN_PUSH ? Gitlab::Access::DEVELOPER : Gitlab::Access::MAINTAINER
}],
merge_access_levels_attributes: [{
access_level: Gitlab::CurrentSettings.default_branch_protection == Gitlab::Access::PROTECTION_DEV_CAN_MERGE ? Gitlab::Access::DEVELOPER : Gitlab::Access::MAINTAINER
}]
}
ProtectedBranches::CreateService.new(self, creator, params).execute(skip_authorization: true)
end
end
This code is a prime example of a method that breaks the natural flow all over the place. Unsurprisingly, it scores poorly on Code Climate with a cognitive complexity score of 10, with 5 being that maximum allowed.
The goal for this issue is to start reducing the cognitive/cyclomatic complexity to 5. I'm picking this number for two reasons:
-
It's the score enforced by Code Climate, making it a bit harder to measure/compare things.
-
Per https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two, it appears the human brain can hold 7 ±2 objects in working memory. If we interpret branches as objects in this sense, it means humans can handle between 5 and 9 branches in the control flow. Because of this range, 5 is the safest assumption
My current plan is the following:
-
For all methods in
Project, gather the cognitive/cyclomatic complexity score -
For all methods in
Project, gather the names of current GitLab employees who changed the method in the last year -
Randomly pick a bunch of these employees, and have them share their findings on a handful of randomly picked methods. This would be used to verify if the cognitive complexity scores make sense, and what thresholds we may need to enforce
-
Refactor 1-3 methods to have a score of <= 5, then have them re-reviewed by the same people. This should be done using a two step process:
- Reduce the cognitive/cyclomatic complexity
- Move the logic out of the Project model if possible
-
Write a RuboCop cop that enforces this. As far as I can tell, its PerceivedComplexity cop uses its own algorithm, and it's not clear what research this is based on (if it's based on anything in the first place).
The end goal is to achieve two things:
-
Obtain a better understanding of how our developers perceive complexity
-
Reduce cognitive/cyclomatic complexity, thereby making it easier to work with the code. This in turn should make it easier to ship changes.
Goal per January 11th, 2019: increase the Code Climate score from an F (lowest) to an A or B.