Skip to content

Add GraphQL endpoint for code coverage

Max Orefice requested to merge mo-code-coverage-graphql into master

Ref: #231386 (closed)

What does this MR do?

Adds a GraphQL type & resolver to aggregate CodeCoverage data for a given project.

We are trying to build this table which contains the aggregation of our Ci::DailyBuildGroupReportResult.

image

This is the first time I'm building a GraphQL endpoint, so please let me know if I'm doing something wrong 🙏

Example

GraphQL request Response
{
  projects(ids: ["gid://gitlab/Project/22", "gid://gitlab/Project/21", "gid://gitlab/Project/20"]) {
    nodes {
      id
      name
      codeCoverageSummary {
        averageCoverage
        coverageCount
        lastUpdatedAt
      }
    }
  }
}
{
  "data": {
    "projects": {
      "nodes": [
        {
          "id": "gid://gitlab/Project/22",
          "name": "test-codequality",
          "codeCoverageSummary": {
            "averageCoverage": 82.35,
            "coverageCount": 2,
            "lastUpdatedAt": "2020-10-07"
          }
        },
        {
          "id": "gid://gitlab/Project/21",
          "name": "test",
          "codeCoverageSummary": {
            "averageCoverage": 77.7,
            "coverageCount": 1,
            "lastUpdatedAt": "2020-10-05"
          }
        },
        {
          "id": "gid://gitlab/Project/20",
          "name": "coverage-report",
          "codeCoverageSummary": null
        }
      ]
    }
  }
}

Database review

This MR introduced a new SQL query used by our new GraphQL endpoint:

WITH latest_by_project AS (
    SELECT
        ci_daily_build_group_report_results.project_id,
        MAX(date) AS date
    FROM
        ci_daily_build_group_report_results
    WHERE
        ci_daily_build_group_report_results.project_id IN (10463077, 278964, 3310437)
        AND ((data -> 'coverage') IS NOT NULL)
    GROUP BY
        ci_daily_build_group_report_results.project_id
)
SELECT
    ci_daily_build_group_report_results.project_id,
    AVG(cast(data ->> 'coverage' AS float)),
    COUNT(*),
    latest_by_project.date
FROM
    ci_daily_build_group_report_results
    JOIN latest_by_project ON ci_daily_build_group_report_results.date = latest_by_project.date
        AND ci_daily_build_group_report_results.project_id = latest_by_project.project_id
WHERE
    ci_daily_build_group_report_results.project_id IN (10463077, 278964, 3310437)
    AND ((data -> 'coverage') IS NOT NULL)
GROUP BY
    ci_daily_build_group_report_results.project_id,
    latest_by_project.date;

Here the new query ran on #database-lab for the 3 projects with the most records with its query plan.

🤖 Joe is suggesting the following:

Query processes too much data to return a relatively small number of rows. – Reduce data cardinality as early as possible during the execution, using one or several of the following techniques: new indexes, partitioning, query rewriting, denormalization. See the visualization of the plan to understand which plan nodes are the main bottlenecks.

Time: 33.144 s
  - planning: 0.857 ms
  - execution: 33.143 s
    - I/O read: 32.072 s
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 111389 (~870.20 MiB) from the buffer pool
  - reads: 27185 (~212.40 MiB) from the OS file cache, including disk I/O
  - dirtied: 384 (~3.00 MiB)
  - writes: 0

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Max Orefice

Merge request reports