CodeQuality report should not use merged results pipeline report for comparison
Problem to solve
Currently, the Code Quality report diff can change based on the state of the branch and whether or not it has been rebased, due to the fact that the target branch report used for the comparison is based on the merged result and not the latest commit on the target branch. We should change this so that the target branch report uses the latest commit instead of the merged result.
Intended users
Further details
Detached Merge Request, i.e. commit-on-branch, pipelines
graph TD
A["Commit A (master)"] --> B["Commit B (master, A+B, BASE_PIPELINE)"]
B --> C["Commit C (master, A+B+C)"]
B --> F["Commit F (feature, A+B+F, HEAD_PIPELINE)"]
C --> D["Commit D (master, A+B+C+D)"]
D --> E["Commit E (master, A+B+C+D+E)"]
Here, the comparison is clear between the before and after state of the commits in the merge request. The reports are out of date against current master (not including C
/D
/E
), but consistent and stable for the merge request's context.
Merge Request Pipelines (project state is a combination of both branches)
graph TD
A["Commit A (master)"] --> B["Commit B (master, A+B, BASE_PIPELINE)"]
B --> C["Commit C (master, A+B+C)"]
B --> F["Commit F (feature, A+B+F)"]
C --> D["Commit D (master, A+B+C+D)"]
F --> G["Commit Z (Ephemeral merge commit, A+B+C+D+F, HEAD_PIPELINE)"]
D --> E["Commit E (master, A+B+C+D+E)"]
D--> G
Here, the reports include newer information from master, which unintentionally makes things worse because our references to "before" and "after" the merge request are not in line with the actual states of the project as base_pipeline
and head_pipeline
.
Options
To produce accurate reports:
-
base_pipeline
should be pointing to commitD
OR -
head_pipeline
needs to point to commitF
Option 1 has the benefit of including more recent information, and will be less out of date when they're first opened, as every MR pipeline creates a fresh merge commit with the current state of the target branch. However I did include commit E
in this diagram to illustrate that in a large and/or fast-moving project (GitLab), it's likely that these widgets will become out-of-date relatively quickly. They'll also be very likely to change with every single new MR pipeline. Widgets can be refreshed by a simple click to run a new merge request pipeline. Because changing base_pipeline
to point to a new location would have effects beyond just the code quality widget, we would likely want to introduce a new reference to commit D
and refer to that in the code quality comparison specifically.
Option 2 has the benefit of working in the exact same way detached merge request pipelines do. The meaning of a merge request diff widget will be constant throughout every configuration of GitLab CI. To update stale widget data, the merge request will need to be rebased on the target branch, triggering a new merge request pipeline. Because the typical implementation of Merged Result Pipelines doesn't run the branch pipeline when the merged result pipeline is ran, we would have to change the code quality job to checkout commit F
and generate the report based on that.
Proposal
Because the branch pipeline likely doesn't exist, option 2 is more difficult than previously thought. We should take Option 1 above in order to fix this. It has the lowest impact and highest upside potential. We should bear in mind that we likely also want to make this change for all grouptesting widgets, including:
- Accessibility
- Browser Performance
- Load testing
- Code coverage
So while making this change for CodeQuality we should look for reusability as well. It is likely that other widgets will also use this solution if we take Option 1
Further Implementation Details
CodeQuality
, BrowserPerformance
, and LoadPerformance
are all implemented the same way in MergeRequestWidgetEntity
, so I can confidently say that those are affected.
The security scanning widgets seem to suffer from this differently. They use MergeRequest#compare_reports
:
def compare_reports(service_class, current_user = nil, report_type = nil )
with_reactive_cache(service_class.name, current_user&.id, report_type) do |data|
unless service_class.new(project, current_user, id: id, report_type: report_type)
.latest?(base_pipeline, actual_head_pipeline, data)
raise InvalidateReactiveCache
end
data
end || { status: :parsing }
end
which uses MergeRequest#actual_head_pipeline
:
# Use this method whenever you need to make sure the head_pipeline is synced with the
# branch head commit, for example checking if a merge request can be merged.
# For more information check: https://gitlab.com/gitlab-org/gitlab-foss/issues/40004
def actual_head_pipeline
head_pipeline&.matches_sha_or_source_sha?(diff_head_sha) ? head_pipeline : nil
end
which appears to return either head_pipeline
(which we now know to contain changes from target_branch
) or nil. So I think once actual_head_pipeline
returns nil, we render the same report out of ReactiveCache forever. I'd have to investigate this more to confirm this behavior.
Permissions and Security
N/A
Documentation
We should explain the comparison in the documentation, likely including the diagrams above. We should find the appropriate place to explain the comparison, it may be that it belongs in the contributor documentation.
Availability & Testing
What does success look like, and how can we measure that?
What is the type of buyer?
Is this a cross-stage feature?
If this works to improve relevancy of the widget content, this approach may be taken for other widgets. You can see a larger discussion about the relevancy of widget data here: &4006 (closed)