
[WIP] Replace ReactiveCache inline coverage loading with persistent file

What does this MR do?

This MR is a PoC to optimise merge-request inline code coverage performance as a follow-up to !21791 (merged). Instead of parsing coverage files on the fly, this MR moves the coverage parsing logic into a dedicated worker, which is triggered after pipeline completion. The parsed report is then stored as a file in object storage, attached to a new PipelineProcessedReport model. Merge requests fetch the file via the ReactiveCache mechanism, frame the results (i.e. filter them down to the files shown in the merge request) and cache them in Redis as part of that mechanism.
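
For concreteness, the worker half of the flow could look roughly like the sketch below. Apart from PipelineProcessedReport itself, every name here (the worker class, coverage_reports, create_processed_report!) is a placeholder for illustration rather than a final API:

```ruby
require 'tempfile'

module Ci
  # Hypothetical worker name; enqueued from the pipeline "completed" transition
  # instead of parsing coverage artefacts on the fly inside ReactiveCache.
  class PipelineProcessedReportWorker
    include ApplicationWorker

    def perform(pipeline_id)
      pipeline = Ci::Pipeline.find_by(id: pipeline_id)
      return unless pipeline

      # Reuse the parse logic introduced in !21791; `coverage_reports` is assumed
      # to return the parsed structure previously built on every merge-request visit.
      coverage = pipeline.coverage_reports

      # Persist the parsed result as a JSON file in object storage, attached to the
      # new PipelineProcessedReport record (`create_processed_report!` stands in for
      # the yet-to-be-defined association).
      Tempfile.create(['coverage-report', '.json']) do |file|
        file.write(coverage.files.to_json) # `files` assumed to be the per-file hash built by the parser
        file.rewind
        pipeline.create_processed_report!(file: file)
      end
    end
  end
end
```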

This change reduced the load time of inline coverage information to under 5 seconds on the initial visit and made it effectively immediate on subsequent visits, for 200 MB of raw coverage data (GitLab itself produces around 20 MB), which is in stark contrast to the original performance measured here.

While the PoC is already working, there are still a couple of open topics. This MR is mainly intended to synchronise on a general direction.

/cc @rickywiens @ayufan

Closes #211410 (closed)

Open Topics

  • Expiration/Pruning: How and when should processed reports be pruned? I was thinking about referencing the model against the artefacts it is based on and pruning orphaned reports as part of the artefact expiration process, so that a report would inherit the expiration life-cycle of the longest-surviving coverage artefact it was built from.
  • Migration/Recreation: The worker is only triggered for newly completed pipelines. How can a pre-processed report be created for existing pipelines? Either a background migration worker has to go through all pipelines with reports and pre-process them (eager; might take a long while and could fail for some reports), or pre-processed reports are created on demand whenever a pipeline has no pre-processed report but does have coverage data available (lazy; light on resources, resilient, and a nice fallback for other problems; see the read-path sketch after this list).
  • Error Handling: Since the parsing and delivery parts are now separated, error messages are no longer sent directly to the UI. How should parse errors and other problems be displayed to the user? I would be in favour of adding an error field to the PipelineProcessedReport model, which the API could then deliver when required (see the migration sketch after this list).
  • Re-use for other Reports: While this MR mainly targets inline code coverage, the approach would also be suitable for other reports (JUnit?). How generic should the worker, model, etc. be?
  • Schema Version: Since pre-processed files are stored persistently, they also survive version upgrades. How should changes in the schema be handled? In theory, pre-processed reports can always be re-created; the question is how schema changes are tracked and when re-creation is triggered (see the migration sketch after this list).
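
To make the lazy variant of the Migration/Recreation topic concrete, here is a rough sketch of the merge-request read path with an on-demand fallback. ReactiveCaching is the existing concern already used today; the associations and helpers (processed_report, has_coverage_reports?, filter_report_for_changed_files) and the worker name are placeholders, and this is not meant to mirror the current MergeRequest code:

```ruby
class MergeRequest < ApplicationRecord
  include ReactiveCaching # existing GitLab concern backing the current flow

  def find_coverage_reports
    with_reactive_cache(:coverage_reports) { |data| data }
  end

  def calculate_reactive_cache(_identifier, *)
    report = head_pipeline&.processed_report # hypothetical association

    if report
      # Frame the stored report down to the files touched by this merge request
      # before ReactiveCaching writes the result to Redis.
      filter_report_for_changed_files(report) # hypothetical helper
    elsif head_pipeline&.has_coverage_reports? # guard assumed to exist
      # Lazy fallback: the pipeline finished before this feature shipped, so
      # enqueue pre-processing now; callers keep polling the reactive cache
      # until the report has been created.
      Ci::PipelineProcessedReportWorker.perform_async(head_pipeline.id)
      nil
    end
  end
end
```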

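The Error Handling and Schema Version topics could both be covered by fields on the new model. A hypothetical migration to illustrate the shape; none of this exists yet and the column names are suggestions only:

```ruby
# Hypothetical table layout for the new PipelineProcessedReport model.
class CreateCiPipelineProcessedReports < ActiveRecord::Migration[6.0]
  def change
    create_table :ci_pipeline_processed_reports do |t|
      t.references :pipeline, null: false,
        foreign_key: { to_table: :ci_pipelines, on_delete: :cascade }
      t.integer :file_store                  # local vs. object storage, as on other report tables
      t.text :file                           # path of the stored, pre-parsed report file
      t.integer :schema_version, null: false # bumped whenever the stored format changes;
                                             # stale reports are simply re-created by the worker
      t.text :error_message                  # parse errors, delivered to the UI via the API
      t.timestamps null: false
    end
  end
end
```
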
Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
