# Ensure consistent behavior across CI reports

## Problem
Today the experience of dealing with CI reports is inconsistent:

- Some reports (like Code Coverage) combine the data from multiple artifacts and store the result in a pipeline artifact. This is the correct way!
- Some reports are not persisted as pipeline artifacts. The report is regenerated on every request (using `ReactiveCache` to reduce the performance impact).
- Some reports have the built-in assumption that the report must come from only one job within the pipeline. This assumption is wrong because each artifact is a data point that contributes to the report, and multiple artifacts (from multiple jobs) can contribute to it. If a pipeline has 2 SAST jobs running, this wrong assumption will show the report with data from 1 job only instead of combining the data into a single report.
  - https://gitlab.com/gitlab-org/gitlab/-/blob/a13e153b7aef7e655948457f8a599e5e6658c6cc/ee/app/services/security/report_fetch_service.rb#L21-23
  - https://gitlab.com/gitlab-org/gitlab/-/blob/67bae0efde93c103916989355fbfc8e7ecf5efcc/app/models/ci/pipeline.rb#L693
  - Because of that, `browser_performance` and `load_performance` only pick the first artifact of the given type in a pipeline. Any other artifacts of the same type are completely ignored.
- For some report types we do server-side comparison between reports.
- For other report types we do client-side comparison between reports.

There are also other problems related to the lack of abstractions: other domains leveraging the report feature deal directly with implementation details, making it difficult to refactor code or make changes.
- Artifacts are a storage feature for data. They have a `type` associated with them as "metadata" so that we can have dedicated strategies for interpreting the data.
- A build can generate multiple artifacts, but only one artifact of a given type per build.
- A pipeline, by consequence, can have multiple artifacts of a given type, as it can have multiple builds.
- A report is parsed data combined from multiple artifacts of a given type.
  - Because it's data, it could be a persisted pipeline artifact, automatically generated when the pipeline completes.
  - The parser must consider that data may come from multiple artifacts across multiple builds.
- The report is produced for a given pipeline. Trying to generate reports from a single build or a collection of builds may produce incomplete data.
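As a minimal illustration of the model above, the sketch below merges all artifacts of one type across every build in a pipeline into a single report. This is plain Ruby with made-up names (`Artifact`, `Build`, `report_for`), not GitLab code:

```ruby
# Illustrative only: one artifact of a given type per build, many per
# pipeline, and a report that merges the data from all of them.
Artifact = Struct.new(:type, :data)
Build = Struct.new(:name, :artifacts)

def report_for(builds, type)
  builds
    .flat_map(&:artifacts)
    .select { |artifact| artifact.type == type }
    .reduce({}) { |report, artifact| report.merge(artifact.data) }
end

builds = [
  Build.new('job1', [Artifact.new(:code_quality, { 'file_a' => 2 })]),
  Build.new('job2', [Artifact.new(:code_quality, { 'file_b' => 1 })]),
  Build.new('job3', [Artifact.new(:junit, { 'tests' => 10 })])
]

report_for(builds, :code_quality) # => { "file_a" => 2, "file_b" => 1 }
```

Reading only the first matching artifact (the wrong assumption described in the Problem section) would drop `job2`'s contribution entirely.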
Below is a scenario of a pipeline with multiple artifacts:

```mermaid
flowchart TD
  Pipeline --> Job1 --> cqart1[CodeQuality 1/2]
  Pipeline --> Job2 --> cqart2[CodeQuality 2/2]
  Pipeline --> Job3 --> tr1[TestReport]
  Pipeline --> child[Child pipeline]
  child --> Job4
  child --> Job5 --> ds[CodeCoverage]
```
- The Code Quality report should contain data from `Job1` and `Job2`.
- The test report contains data from `Job3`.
- The Code Coverage report, generated in `Job5`, should be available as an MR report through the parent pipeline (this is currently being done).

Among these 3 cases there should be a common abstraction.
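The child-pipeline part of the scenario can be sketched in a few lines of plain Ruby (illustrative names only, not GitLab code): artifacts from descendant pipelines bubble up into the parent pipeline's report.

```ruby
# Illustrative only: a pipeline's reports should include artifacts from
# its descendant pipelines, as in the diagram above.
Pipeline = Struct.new(:artifacts_by_type, :children) do
  def all_artifacts(type)
    artifacts_by_type.fetch(type, []) +
      children.flat_map { |child| child.all_artifacts(type) }
  end
end

child = Pipeline.new({ code_coverage: ['Job5 data'] }, [])
parent = Pipeline.new(
  { code_quality: ['Job1 data', 'Job2 data'], test_report: ['Job3 data'] },
  [child]
)

parent.all_artifacts(:code_quality)  # => ["Job1 data", "Job2 data"]
parent.all_artifacts(:code_coverage) # => ["Job5 data"]
```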
## Proposal

Introduce abstractions that ensure consistent behavior and expectations across all reports. The abstractions should hide implementation details, providing a consistent framework for reports:
```ruby
class Ci::Reports::Factory
  REPORTS = {
    code_quality: ::Ci::Reports::CodeQuality,
    junit: ::Ci::Reports::Junit,
    # ...
  }.freeze

  COMPARERS = {
    code_quality: ::Gitlab::Ci::Reports::CodequalityReportsComparer,
    # ...
  }.freeze

  def self.report(type:, pipeline:)
    return unless pipeline
    return unless pipeline.pipeline_artifacts&.report_exists?(type)

    REPORTS.fetch(type).new(pipeline)
  end

  def self.compare(type:, base_pipeline:, head_pipeline:)
    base_report = report(type: type, pipeline: base_pipeline)
    head_report = report(type: type, pipeline: head_pipeline)

    COMPARERS.fetch(type).new.compare(base_report, head_report)
  end
end
```
```ruby
class Ci::Reports::BaseReport
  def initialize(pipeline)
    @pipeline = pipeline
  end

  def report_type
    raise NotImplementedError
  end

  def parser
    raise NotImplementedError
  end

  def empty?
    data.empty?
  end

  private

  attr_reader :pipeline

  def data
    @data ||= collect_data || {}
  end

  def collect_data
    return unless pipeline
    return unless pipeline.pipeline_artifacts&.report_exists?(report_type)

    # Read the blob from the pipeline artifact and parse it with `parser`.
  end
end
```
```ruby
class Ci::Reports::CodeQuality < Ci::Reports::BaseReport
  def report_type
    :code_quality # or use a method/constant from Ci::JobArtifact
  end

  def parser
    ::Gitlab::Ci::Parsers::Codequality::CodeClimate
  end

  def degradations_count
    data.count
  end

  def all_degradations
    data.values
  end
end
```
Use as:

```ruby
report = Ci::Reports::Factory.report(type: :code_quality, pipeline: pipeline)

# instead of using `pipeline.has_code_quality_reports?`
if report
  report.all_degradations
else
  # none of the builds had any code quality artifacts
end
```
Reports comparison for merge requests could be extracted out of the `MergeRequest` model and into a generic comparison engine:

```ruby
comparison_report = Ci::Reports::Factory.compare(
  type: :code_quality,
  base_pipeline: base_pipeline,
  head_pipeline: head_pipeline
)
```
Always ensure that when a pipeline completes we generate a report by merging all the artifacts:

```ruby
after_transition any => ::Ci::Pipeline.completed_statuses do |pipeline|
  pipeline.run_after_commit do
    # For every artifact report type in the pipeline, generate a pipeline
    # artifact with the merged data.
    ::Ci::Reports::GenerateReportsWorker.perform_async(pipeline.id)
  end
end
```
```ruby
class Ci::Reports::GenerateReportsService
  def initialize(pipeline)
    @pipeline = pipeline
  end

  def execute
    all_types.each do |type|
      report = Ci::Reports::Factory.report(type: type, pipeline: pipeline)
      create_pipeline_artifact(report)
    end
  end

  private

  attr_reader :pipeline

  def all_types
    # select all unique artifact report types in self and descendants
  end
end
```
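What `all_types` could compute can be sketched framework-free (the hash shapes below are hypothetical; the real implementation would query job artifact metadata for the pipeline and its descendants):

```ruby
# Illustrative only: collect the unique artifact report types across a
# pipeline's own builds and all descendant pipelines.
def all_types(pipeline)
  own = pipeline[:builds].flat_map { |build| build[:artifact_types] }
  nested = pipeline.fetch(:descendants, []).flat_map { |child| all_types(child) }
  (own + nested).uniq
end

pipeline = {
  builds: [
    { artifact_types: [:code_quality] },
    { artifact_types: [:code_quality] },
    { artifact_types: [:junit] }
  ],
  descendants: [
    { builds: [{ artifact_types: [:code_coverage] }] }
  ]
}

all_types(pipeline) # => [:code_quality, :junit, :code_coverage]
```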