Skip to content

Group-level DORA metrics API

Shinya Maeda requested to merge group-level-dora-metrics-rest-api into master

What does this MR do?

In the previous MR, we added project-level DORA metrics API support. This MR is to extend the capability to group-level. The API interface is same with project-level API. It simply extracts the subject projects from the given group and execute the same query for daily metrics.

Given that we re-designed this feature for resolving previous performance concerns, the new API improved the timing drastically. You can see the actual query performance below.

A few notes:

Manual QA

Metric: Deployment Frequency

Sample payload (Local)

> # Daily
> curl -H 'Private-token: XXX' "http://local.gitlab.test:8181/api/v4/groups/134/dora/metrics?metric=deployment_frequency&start_date=2021-03-01"
[{"2021-03-01":0},{"2021-03-02":1},{"2021-03-03":0},{"2021-03-04":0},{"2021-03-05":0},{"2021-03-06":0},{"2021-03-07":0},{"2021-03-08":0},{"2021-03-09":2}]

> # Monthly
> curl -H 'Private-token: XXX' "http://local.gitlab.test:8181/api/v4/groups/134/dora/metrics?metric=deployment_frequency&start_date=2021-03-01&interval=monthly"
[{"2021-03-01":3}]

> # All
> curl -H 'Private-token: XXX' "http://local.gitlab.test:8181/api/v4/groups/134/dora/metrics?metric=deployment_frequency&start_date=2021-03-01&interval=all"
3

Sample query

Conditions:

  • Date range: 3 months
  • Target group: https://gitlab.com/gitlab-org
  • Interval: daily
  • Executor: database-lab
  • Using warm cache for Group.all_projects in order to reduce the noise of slow I/O on database-lab. See more details in the slack discussion.

Dummy data insertion (because the data collection has just began and production doesn't have enough date)

DO
$do$
DECLARE
  d date;
BEGIN 
  FOR d in (select generate_series( '2020-12-01'::date, '2021-03-01', '1 day' )::date) LOOP
    INSERT INTO dora_daily_metrics (environment_id, date, deployment_frequency, lead_time_for_changes_in_seconds)
    VALUES (1178942, d, 10, 5);
  END LOOP;
END
$do$;

Query

SELECT date, SUM(deployment_frequency) AS data
FROM "dora_daily_metrics"
WHERE "dora_daily_metrics"."environment_id" IN (
  SELECT "environments"."id" FROM "environments" WHERE "environments"."project_id" IN (
    SELECT "projects"."id" FROM "projects" WHERE "projects"."namespace_id" IN (
      WITH RECURSIVE "base_and_descendants" AS (
        (SELECT "namespaces".* FROM "namespaces" WHERE "namespaces"."type" = 'Group' AND "namespaces"."id" = 9970)
        UNION
        (SELECT "namespaces".* FROM "namespaces", "base_and_descendants" WHERE "namespaces"."type" = 'Group' AND "namespaces"."parent_id" = "base_and_descendants"."id")
      )
      SELECT id FROM "base_and_descendants" AS "namespaces"
    )
  ) AND "environments"."tier" = 0 AND (tier IS NOT NULL)
) AND "dora_daily_metrics"."date" BETWEEN '2020-12-01' AND '2021-03-01'
GROUP BY "dora_daily_metrics"."date"
ORDER BY date ASC

Plan

Timing (w/ cold cache)

Time: 238.196 ms
  - planning: 4.799 ms
  - execution: 233.397 ms (estimated* for prod: 0.012...0.239 s)
    - I/O read: 227.539 ms
    - I/O write: N/A

Shared buffers:
  - hits: 927 (~7.20 MiB) from the buffer pool
  - reads: 240 (~1.90 MiB) from the OS file cache, including disk I/O
  - dirtied: 1 (~8.00 KiB)
  - writes: 0

A few notes:

  • While it takes 222.721 msec on the index scan with index_dora_daily_metrics_on_environment_id_and_date, it mostly spends time on I/O Timings: read=218.642. So the timing really depends on the I/O speed on the PostgreSQL instance. database-lab is typically having a slow I/O.

Timing (w/ warm cache)

Time: 6.843 ms
  - planning: 2.735 ms
  - execution: 4.108 ms
    - I/O read: N/A
    - I/O write: N/A

Shared buffers:
  - hits: 1292 (~10.10 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

Bonus: with recursive_namespace_lookup_as_inner_join

plan

Timing (w/ warm cache)

Time: 6.880 ms
  - planning: 3.019 ms
  - execution: 3.861 ms
    - I/O read: N/A
    - I/O write: N/A

Shared buffers:
  - hits: 245 (~1.90 MiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

Bonus: Query plan in production replica (not db-lab)

https://explain.depesz.com/s/LF1n

 Planning Time: 2.067 ms
 Execution Time: 23.167 ms

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by Mayra Cabrera

Merge request reports