Skip to content

New Cycle Analytics Backend

Adam Hegyi requested to merge new-cycle-analytics-stage-backend into master

New Cycle Analytics Backend

Related issue: https://gitlab.com/gitlab-org/gitlab-ee/issues/12196

What is Cycle Analytics

  • TLDR: Find all the Issue or MergeRequest records matching with a date range query (start_event and end_event) = Stage.
  • Calculate the duration (end_event_time - start_event_time)
  • Extract the median duration
  • Extract the list of records relevant to the date range

This change lays the foundation for customizable Cycle Analytics stages. The main reason for the change is to extract the event definitions to separate objects (start_event, end_event) so that it could be easily customized later on. (In EE we'd have about 20 new events)

This MR ensures that we have the right building blocks in place before implementing the customizable stages feature in EE and addresses some of the performance issues regarding to the feature.

To avoid having different implementation in CE and EE, the default cycle analytics stages that we provide (implemented as community contribution) are also implemented with the new structure.

High Level Overview

  • DataCollector is responsible to build the base query, join the absolutely necessary tables and do high level filtering.
  • An Event (start, end) could alter the query (join additional tables when needed). It defines a timestamp expression that will be used for the duration calculation.
  • Median and Records are using the base query provided by the DataCollector and do additional query manipulation.
  Defined for a Group (EE) or for a Project (CE)

  +---------------------+
  | Stages              |
  |   +-------------+   |
  |   |   Stage A   |   |                                 +--------+
  |   |             |   |      +---------------+   +----> | Median |
  |   +-------------+   |      |               |   |      +--------+
  |   | Start Event | +------> | DataCollector | +-+
  |   +-------------+   |      |               |   |      +---------+
  |   | End Event   |   |      +---------------+   +----> | Records |
  |   +-------------+   |                                 +---------+
  |                     |
  |   +-------------+   |
  |   |   Stage B   |   |
  |   |             |   |
  |   +-------------+   |
  |   | Start Event |   |
  |   +-------------+   |
  |   | End Event   |   |
  |   +-------------+   |
  |                     |
  |   ...               |
  +---------------------+

In the Code

Review stage definition:

  • Start: merge request created (merge_requests.created_at column)
  • End: merge request merged (merge_request_metrics.merged_at column)
stage = CycleAnalytics::ProjectStage.new(start_event_identifier: :merge_request_created, end_event_identifier: :merge_request_merged, project_id: 19)

# each supported event is represented as a Class
# stage.start_event.class => Gitlab::CycleAnalytics::StageEvents::MergeRequestCreated
# stage.end_event.class => stage.end_event.class => Gitlab::CycleAnalytics::StageEvents::MergeRequestMerged

dc = Gitlab::CycleAnalytics::DataCollector.new(stage, from: 30.days.ago, user: User.first)
puts dc.median.seconds
puts dc.record_fetcher.serialized_records

Database Related Changes

The underlying query has been changed a bit compared to the previous Cycle Analytics implementation:

  • Using percentile_disc function for the median calculation
  • Not using CTE tables
  • Avoid unnecessary table joins (join tables when the Stage actually requires it)
  • Introducing cycle_analytics_project_stages and cycle_analytics_group_stages tables to store the customized stages. Please note that in CE we won't persist anything (only in-memory models will be provided).
  • Loading records with one query (IssuableFinder query is joined), which helps implementing pagination later

Issue Stage

Old

WITH "cte_table_for_issue" AS
  (SELECT "issues"."project_id" AS project_id,
          "projects"."path" AS project_path,
          "routes"."path" AS namespace_path,
          COALESCE("issue_metrics"."first_associated_with_milestone_at", "issue_metrics"."first_added_to_board_at")-COALESCE("issues"."created_at") AS issue
   FROM "issues"
   INNER JOIN "issue_metrics" ON "issues"."id" = "issue_metrics"."issue_id"
   INNER JOIN "projects" ON "issues"."project_id" = "projects"."id"
   INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
   WHERE "issues"."project_id" IN ($PROJECT_ID)
     AND "routes"."source_type" = 'Namespace'
     AND "issues"."created_at" >= '2019-07-08 10:37:29.419650'
     AND ("issue_metrics"."first_added_to_board_at" IS NOT NULL
          OR "issue_metrics"."first_associated_with_milestone_at" IS NOT NULL)),
     "ordered_records" AS
  (SELECT "cte_table_for_issue"."issue" AS issue,
          row_number() OVER (
                             ORDER BY "cte_table_for_issue"."issue") AS row_id,

     (SELECT COUNT(1)
      FROM "cte_table_for_issue"
      WHERE "cte_table_for_issue"."issue" >= CAST('0' AS INTERVAL)) ct
   FROM "cte_table_for_issue"
   WHERE "cte_table_for_issue"."issue" >= CAST('0' AS INTERVAL))
SELECT AVG(EXTRACT(EPOCH
                   FROM "ordered_records"."issue")) AS median
FROM "ordered_records"
WHERE "ordered_records"."row_id" BETWEEN "ordered_records"."ct" / 2.0 AND ("ordered_records"."ct" / 2.0 + 1)

Plan:

 Aggregate  (cost=2813.66..2813.67 rows=1 width=8) (actual time=1911.900..1911.900 rows=1 loops=1)
   Buffers: shared dirtied=209 hit=2191 read=1443
   CTE cte_table_for_issue
     ->  Nested Loop  (cost=1.99..2792.62 rows=274 width=54) (actual time=61.701..1908.090 rows=630 loops=1)
           Buffers: shared dirtied=209 hit=2186 read=1443
           ->  Nested Loop  (cost=1.56..101.05 rows=608 width=50) (actual time=36.755..448.237 rows=713 loops=1)
                 Buffers: shared dirtied=97 hit=466 read=305
                 ->  Nested Loop  (cost=1.00..9.04 rows=1 width=38) (actual time=21.772..22.550 rows=1 loops=1)
                       Buffers: shared dirtied=2 hit=1 read=10
                       ->  Index Scan using projects_pkey on public.projects  (cost=0.43..4.45 rows=1 width=21) (actual time=0.411..1.186 rows=1 loops=1)
                             Index Cond: (projects.id = $PROJECT_ID)
                             Buffers: shared dirtied=2 hit=1 read=5
                       ->  Index Scan using index_routes_on_source_type_and_source_id on public.routes  (cost=0.56..4.58 rows=1 width=25) (actual time=21.356..21.357 rows=1 loops=1)
                             Index Cond: (((routes.source_type)::text = 'Namespace'::text) AND (routes.source_id = projects.namespace_id))
                             Buffers: shared read=5
                 ->  Index Only Scan using index_issues_on_project_id_and_created_at_and_id_and_state on public.issues  (cost=0.56..79.85 rows=1216 width=16) (actual time=14.977..425.319 rows=713 loops=1)
                       Index Cond: ((issues.project_id = $PROJECT_ID) AND (issues.created_at >= '2019-07-08 10:37:29.41965+00'::timestamp with time zone))
                       Buffers: shared dirtied=95 hit=465 read=295
           ->  Index Scan using index_issue_metrics on public.issue_metrics  (cost=0.43..4.41 rows=1 width=20) (actual time=2.040..2.043 rows=1 loops=713)
                 Index Cond: (issue_metrics.issue_id = issues.id)
                 Filter: ((issue_metrics.first_added_to_board_at IS NOT NULL) OR (issue_metrics.first_associated_with_milestone_at IS NOT NULL))
                 Rows Removed by Filter: 0
                 Buffers: shared dirtied=112 hit=1720 read=1138
   CTE ordered_records
     ->  WindowAgg  (cost=15.53..17.12 rows=91 width=32) (actual time=1911.056..1911.269 rows=630 loops=1)
           Buffers: shared dirtied=209 hit=2191 read=1443
           InitPlan 2 (returns $3)
             ->  Aggregate  (cost=6.39..6.40 rows=1 width=8) (actual time=0.166..0.166 rows=1 loops=1)
                   ->  CTE Scan on cte_table_for_issue  (cost=0.00..6.17 rows=91 width=0) (actual time=0.006..0.107 rows=630 loops=1)
                         Filter: (cte_table_for_issue.issue >= '00:00:00'::interval)
                         Rows Removed by Filter: 0
           ->  Sort  (cost=9.13..9.35 rows=91 width=16) (actual time=1910.833..1910.875 rows=630 loops=1)
                 Sort Key: cte_table_for_issue_1.issue
                 Sort Method: quicksort  Memory: 54kB
                 Buffers: shared dirtied=209 hit=2191 read=1443
                 ->  CTE Scan on cte_table_for_issue cte_table_for_issue_1  (cost=0.00..6.17 rows=91 width=16) (actual time=61.728..1910.034 rows=630 loops=1)
                       Filter: (cte_table_for_issue_1.issue >= '00:00:00'::interval)
                       Rows Removed by Filter: 0
                       Buffers: shared dirtied=209 hit=2186 read=1443
   ->  CTE Scan on ordered_records  (cost=0.00..3.87 rows=10 width=16) (actual time=1911.384..1911.878 rows=2 loops=1)
         Filter: (((ordered_records.row_id)::numeric >= ((ordered_records.ct)::numeric / 2.0)) AND ((ordered_records.row_id)::numeric <= (((ordered_records.ct)::numeric / 2.0) + '1'::numeric)))
         Rows Removed by Filter: 628
         Buffers: shared dirtied=209 hit=2191 read=1443
 Planning time: 9.206 ms
 Execution time: 1.912 s
 Total Cost: 2829.88
 Buffers Hit: 2191
 Buffers Written: 0
 Buffers Read: 1443

New

SELECT EXTRACT(EPOCH
               FROM percentile_cont(0.5) WITHIN GROUP(
                                                      ORDER BY COALESCE("issue_metrics"."first_associated_with_milestone_at", "issue_metrics"."first_added_to_board_at")-"issues"."created_at")) AS median
FROM "issues"
INNER JOIN "projects" ON "issues"."project_id" = "projects"."id"
INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
INNER
JOIN "issue_metrics" ON "issues"."id" = "issue_metrics"."issue_id"
WHERE "issues"."project_id" = $PROJECT_ID
  AND "issues"."created_at" >= '2019-07-08 10:49:57.023743'
  AND COALESCE("issue_metrics"."first_associated_with_milestone_at", "issue_metrics"."first_added_to_board_at")-"issues"."created_at" > CAST('0' AS INTERVAL)
  AND "routes"."source_type" = 'Namespace'

Plan:

 Aggregate  (cost=2792.58..2792.60 rows=1 width=8) (actual time=10.268..10.268 rows=1 loops=1)
   Buffers: shared hit=3624 read=1
   ->  Nested Loop  (cost=1.99..2791.33 rows=167 width=24) (actual time=4.304..9.836 rows=629 loops=1)
         Buffers: shared hit=3614 read=1
         ->  Nested Loop  (cost=1.56..101.02 rows=607 width=12) (actual time=4.269..5.286 rows=712 loops=1)
               Buffers: shared hit=760 read=1
               ->  Nested Loop  (cost=1.00..9.04 rows=1 width=4) (actual time=4.208..4.210 rows=1 loops=1)
                     Buffers: shared hit=9 read=1
                     ->  Index Scan using projects_pkey on public.projects  (cost=0.43..4.45 rows=1 width=8) (actual time=0.017..0.018 rows=1 loops=1)
                           Index Cond: (projects.id = $PROJECT_ID)
                           Buffers: shared hit=4
                     ->  Index Only Scan using index_routes_on_source_type_and_source_id on public.routes  (cost=0.56..4.58 rows=1 width=4) (actual time=4.186..4.187 rows=1 loops=1)
                           Index Cond: ((routes.source_type = 'Namespace'::text) AND (routes.source_id = projects.namespace_id))
                           Buffers: shared hit=5 read=1
               ->  Index Only Scan using index_issues_on_project_id_and_created_at_and_id_and_state on public.issues  (cost=0.56..79.83 rows=1215 width=16) (actual time=0.060..0.986 rows=712 loops=1)
                     Index Cond: ((issues.project_id = $PROJECT_ID) AND (issues.created_at >= '2019-07-08 10:49:57.023743+00'::timestamp with time zone))
                     Buffers: shared hit=751
         ->  Index Scan using index_issue_metrics on public.issue_metrics  (cost=0.43..4.42 rows=1 width=20) (actual time=0.006..0.006 rows=1 loops=712)
               Index Cond: (issue_metrics.issue_id = issues.id)
               Filter: (((COALESCE(issue_metrics.first_associated_with_milestone_at, issue_metrics.first_added_to_board_at))::timestamp with time zone - issues.created_at) > '00:00:00'::interval)
               Rows Removed by Filter: 0
               Buffers: shared hit=2854
 Planning time: 8.842 ms
 Execution time: 10.406 ms
 Total Cost: 2792.60
 Buffers Hit: 3624
 Buffers Written: 0
 Buffers Read: 1

Plan Stage

Old

WITH "cte_table_for_plan" AS
  (SELECT "issues"."project_id" AS project_id,
          "projects"."path" AS project_path,
          "routes"."path" AS namespace_path,
          COALESCE("issue_metrics"."first_mentioned_in_commit_at") - COALESCE("issue_metrics"."first_associated_with_milestone_at", "issue_metrics"."first_added_to_board_at") AS PLAN
   FROM "issues"
   INNER JOIN "issue_metrics" ON "issues"."id" = "issue_metrics"."issue_id"
   INNER JOIN "projects" ON "issues"."project_id" = "projects"."id"
   INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
   WHERE "issues"."project_id" IN ($PROJECT_ID)
     AND "routes"."source_type" = 'Namespace'
     AND "issues"."created_at" >= '2019-07-08 10:37:29.425603'
     AND ("issue_metrics"."first_added_to_board_at" IS NOT NULL
          OR "issue_metrics"."first_associated_with_milestone_at" IS NOT NULL)
     AND "issue_metrics"."first_mentioned_in_commit_at" IS NOT NULL),
     "ordered_records" AS
  (SELECT "cte_table_for_plan"."plan" AS PLAN,
          row_number() OVER (
                             ORDER BY "cte_table_for_plan"."plan") AS row_id,

     (SELECT COUNT(1)
      FROM "cte_table_for_plan"
      WHERE "cte_table_for_plan"."plan" >= CAST('0' AS INTERVAL)) ct
   FROM "cte_table_for_plan"
   WHERE "cte_table_for_plan"."plan" >= CAST('0' AS INTERVAL))
SELECT AVG(EXTRACT(EPOCH
                   FROM "ordered_records"."plan")) AS median
FROM "ordered_records"
WHERE "ordered_records"."row_id" BETWEEN "ordered_records"."ct" / 2.0 AND ("ordered_records"."ct" / 2.0 + 1)

New

SELECT EXTRACT(EPOCH
               FROM percentile_cont(0.5) WITHIN GROUP(
                                                      ORDER BY "issue_metrics"."first_mentioned_in_commit_at" - COALESCE("issue_metrics"."first_associated_with_milestone_at", "issue_metrics"."first_added_to_board_at"))) AS median
FROM "issues"
INNER JOIN "projects" ON "issues"."project_id" = "projects"."id"
INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
INNER JOIN "issue_metrics" ON "issues"."id" = "issue_metrics"."issue_id"
WHERE "issues"."project_id" = $PROJECT_ID
  AND "issues"."created_at" >= '2019-07-08 10:49:57.025673'
  AND ("issue_metrics"."first_added_to_board_at" IS NOT NULL
       OR "issue_metrics"."first_associated_with_milestone_at" IS NOT NULL)
  AND "issue_metrics"."first_mentioned_in_commit_at" IS NOT NULL
  AND "issue_metrics"."first_mentioned_in_commit_at" - COALESCE("issue_metrics"."first_associated_with_milestone_at", "issue_metrics"."first_added_to_board_at") > CAST('0' AS INTERVAL)
  AND "routes"."source_type" = 'Namespace'

Code Stage

Old Query

WITH "cte_table_for_code" AS
  (SELECT "issues"."project_id" AS project_id,
          "projects"."path" AS project_path,
          "routes"."path" AS namespace_path,
          COALESCE("merge_requests"."created_at") - COALESCE("issue_metrics"."first_mentioned_in_commit_at") AS code
   FROM "merge_requests_closing_issues"
   INNER JOIN "issues" ON "issues"."id" = "merge_requests_closing_issues"."issue_id"
   INNER JOIN "issue_metrics" ON "issues"."id" = "issue_metrics"."issue_id"
   INNER JOIN "projects" ON "issues"."project_id" = "projects"."id"
   INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
   LEFT OUTER JOIN "merge_requests" ON "merge_requests"."id" = "merge_requests_closing_issues"."merge_request_id"
   INNER JOIN "merge_request_metrics" ON "merge_requests"."id" = "merge_request_metrics"."merge_request_id"
   WHERE "issues"."project_id" IN ($PROJECT_ID)
     AND "routes"."source_type" = 'Namespace'
     AND "issues"."created_at" >= '2019-07-08 10:37:29.427677'
     AND "merge_requests"."created_at" >= "issue_metrics"."first_mentioned_in_commit_at"),
     "ordered_records" AS
  (SELECT "cte_table_for_code"."code" AS code,
          row_number() OVER (
                             ORDER BY "cte_table_for_code"."code") AS row_id,

     (SELECT COUNT(1)
      FROM "cte_table_for_code"
      WHERE "cte_table_for_code"."code" >= CAST('0' AS INTERVAL)) ct
   FROM "cte_table_for_code"
   WHERE "cte_table_for_code"."code" >= CAST('0' AS INTERVAL))
SELECT AVG(EXTRACT(EPOCH
                   FROM "ordered_records"."code")) AS median
FROM "ordered_records"
WHERE "ordered_records"."row_id" BETWEEN "ordered_records"."ct" / 2.0 AND ("ordered_records"."ct" / 2.0 + 1)

New Query

SELECT EXTRACT(EPOCH
               FROM percentile_cont(0.5) WITHIN GROUP(
                                                      ORDER BY "merge_requests"."created_at" - "issue_metrics"."first_mentioned_in_commit_at")) AS median
FROM "merge_requests"
INNER JOIN "projects" ON "merge_requests"."target_project_id" = "projects"."id"
INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
INNER JOIN "merge_requests_closing_issues" ON "merge_requests"."id" = "merge_requests_closing_issues"."merge_request_id"
INNER JOIN "issue_metrics" ON "merge_requests_closing_issues"."issue_id" = "issue_metrics"."issue_id"
WHERE "merge_requests"."target_project_id" = $PROJECT_ID
  AND "merge_requests"."created_at" >= '2019-07-08 10:49:57.027348'
  AND "merge_requests"."created_at" - "issue_metrics"."first_mentioned_in_commit_at" > CAST('0' AS INTERVAL)
  AND "routes"."source_type" = 'Namespace'

Test Stage

Old Query

WITH "cte_table_for_test" AS
  (SELECT "issues"."project_id" AS project_id,
          "projects"."path" AS project_path,
          "routes"."path" AS namespace_path,
          COALESCE("merge_request_metrics"."latest_build_finished_at") - COALESCE("merge_request_metrics"."latest_build_started_at") AS test
   FROM "merge_requests_closing_issues"
   INNER JOIN "issues" ON "issues"."id" = "merge_requests_closing_issues"."issue_id"
   INNER JOIN "issue_metrics" ON "issues"."id" = "issue_metrics"."issue_id"
   INNER JOIN "projects" ON "issues"."project_id" = "projects"."id"
   INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
   LEFT OUTER JOIN "merge_requests" ON "merge_requests"."id" = "merge_requests_closing_issues"."merge_request_id"
   INNER JOIN "merge_request_metrics" ON "merge_requests"."id" = "merge_request_metrics"."merge_request_id"
   WHERE "issues"."project_id" IN ($PROJECT_ID)
     AND "routes"."source_type" = 'Namespace'
     AND "issues"."created_at" >= '2019-07-08 10:37:29.431506'),
     "ordered_records" AS
  (SELECT "cte_table_for_test"."test" AS test,
          row_number() OVER (
                             ORDER BY "cte_table_for_test"."test") AS row_id,

     (SELECT COUNT(1)
      FROM "cte_table_for_test"
      WHERE "cte_table_for_test"."test" >= CAST('0' AS INTERVAL)) ct
   FROM "cte_table_for_test"
   WHERE "cte_table_for_test"."test" >= CAST('0' AS INTERVAL))
SELECT AVG(EXTRACT(EPOCH
                   FROM "ordered_records"."test")) AS median
FROM "ordered_records"
WHERE "ordered_records"."row_id" BETWEEN "ordered_records"."ct" / 2.0 AND ("ordered_records"."ct" / 2.0 + 1)

New Query

SELECT EXTRACT(EPOCH
               FROM percentile_cont(0.5) WITHIN GROUP(
                                                      ORDER BY "merge_request_metrics"."latest_build_finished_at" - "merge_request_metrics"."latest_build_started_at")) AS median
FROM "merge_requests"
INNER JOIN "projects" ON "merge_requests"."target_project_id" = "projects"."id"
INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
INNER JOIN "merge_request_metrics" ON "merge_requests"."id" = "merge_request_metrics"."merge_request_id"
WHERE "merge_requests"."target_project_id" = $PROJECT_ID
  AND "merge_requests"."created_at" >= '2019-07-08 10:49:57.030658'
  AND "merge_request_metrics"."latest_build_finished_at" - "merge_request_metrics"."latest_build_started_at" > CAST('0' AS INTERVAL)
  AND "routes"."source_type" = 'Namespace'

Review Stage

Old Query

WITH "cte_table_for_review" AS
  (SELECT "issues"."project_id" AS project_id,
          "projects"."path" AS project_path,
          "routes"."path" AS namespace_path,
          COALESCE("merge_request_metrics"."merged_at") - COALESCE("merge_requests"."created_at") AS review
   FROM "merge_requests_closing_issues"
   INNER JOIN "issues" ON "issues"."id" = "merge_requests_closing_issues"."issue_id"
   INNER JOIN "issue_metrics" ON "issues"."id" = "issue_metrics"."issue_id"
   INNER JOIN "projects" ON "issues"."project_id" = "projects"."id"
   INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
   LEFT OUTER JOIN "merge_requests" ON "merge_requests"."id" = "merge_requests_closing_issues"."merge_request_id"
   INNER JOIN "merge_request_metrics" ON "merge_requests"."id" = "merge_request_metrics"."merge_request_id"
   WHERE "issues"."project_id" IN ($PROJECT_ID)
     AND "routes"."source_type" = 'Namespace'
     AND "issues"."created_at" >= '2019-07-08 10:37:29.433867'
     AND "merge_request_metrics"."merged_at" IS NOT NULL),
     "ordered_records" AS
  (SELECT "cte_table_for_review"."review" AS review,
          row_number() OVER (
                             ORDER BY "cte_table_for_review"."review") AS row_id,

     (SELECT COUNT(1)
      FROM "cte_table_for_review"
      WHERE "cte_table_for_review"."review" >= CAST('0' AS INTERVAL)) ct
   FROM "cte_table_for_review"
   WHERE "cte_table_for_review"."review" >= CAST('0' AS INTERVAL))
SELECT AVG(EXTRACT(EPOCH
                   FROM "ordered_records"."review")) AS median
FROM "ordered_records"
WHERE "ordered_records"."row_id" BETWEEN "ordered_records"."ct" / 2.0 AND ("ordered_records"."ct" / 2.0 + 1)

New Query

SELECT EXTRACT(EPOCH
               FROM percentile_cont(0.5) WITHIN GROUP(
                                                      ORDER BY "merge_request_metrics"."merged_at" - "merge_requests"."created_at")) AS median
FROM "merge_requests"
INNER JOIN "projects" ON "merge_requests"."target_project_id" = "projects"."id"
INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
INNER JOIN "merge_request_metrics" ON "merge_requests"."id" = "merge_request_metrics"."merge_request_id"
WHERE "merge_requests"."target_project_id" = $PROJECT_ID
  AND "merge_requests"."created_at" >= '2019-07-08 10:49:57.032044'
  AND "merge_request_metrics"."merged_at" - "merge_requests"."created_at" > CAST('0' AS INTERVAL)
  AND "routes"."source_type" = 'Namespace'

Staging Stage

Old Query

WITH "cte_table_for_staging" AS
  (SELECT "issues"."project_id" AS project_id,
          "projects"."path" AS project_path,
          "routes"."path" AS namespace_path,
          COALESCE("merge_request_metrics"."first_deployed_to_production_at") - COALESCE("merge_request_metrics"."merged_at") AS staging
   FROM "merge_requests_closing_issues"
   INNER JOIN "issues" ON "issues"."id" = "merge_requests_closing_issues"."issue_id"
   INNER JOIN "issue_metrics" ON "issues"."id" = "issue_metrics"."issue_id"
   INNER JOIN "projects" ON "issues"."project_id" = "projects"."id"
   INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
   LEFT OUTER JOIN "merge_requests" ON "merge_requests"."id" = "merge_requests_closing_issues"."merge_request_id"
   INNER JOIN "merge_request_metrics" ON "merge_requests"."id" = "merge_request_metrics"."merge_request_id"
   WHERE "issues"."project_id" IN ($PROJECT_ID)
     AND "routes"."source_type" = 'Namespace'
     AND "issues"."created_at" >= '2019-07-08 10:37:29.436250'
     AND "merge_request_metrics"."first_deployed_to_production_at" >= '2019-07-08 10:37:29.436250'),
     "ordered_records" AS
  (SELECT "cte_table_for_staging"."staging" AS staging,
          row_number() OVER (
                             ORDER BY "cte_table_for_staging"."staging") AS row_id,

     (SELECT COUNT(1)
      FROM "cte_table_for_staging"
      WHERE "cte_table_for_staging"."staging" >= CAST('0' AS INTERVAL)) ct
   FROM "cte_table_for_staging"
   WHERE "cte_table_for_staging"."staging" >= CAST('0' AS INTERVAL))
SELECT AVG(EXTRACT(EPOCH
                   FROM "ordered_records"."staging")) AS median
FROM "ordered_records"
WHERE "ordered_records"."row_id" BETWEEN "ordered_records"."ct" / 2.0 AND ("ordered_records"."ct" / 2.0 + 1)

New Query

SELECT EXTRACT(EPOCH
               FROM percentile_cont(0.5) WITHIN GROUP(
                                                      ORDER BY "merge_request_metrics"."first_deployed_to_production_at" - "merge_request_metrics"."merged_at")) AS median
FROM "merge_requests"
INNER JOIN "projects" ON "merge_requests"."target_project_id" = "projects"."id"
INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
INNER JOIN "merge_request_metrics" ON "merge_requests"."id" = "merge_request_metrics"."merge_request_id"
WHERE "merge_requests"."target_project_id" = $PROJECT_ID
  AND "merge_requests"."created_at" >= '2019-07-08 10:49:57.033508'
  AND "merge_request_metrics"."first_deployed_to_production_at" - "merge_request_metrics"."merged_at" > CAST('0' AS INTERVAL)
  AND "routes"."source_type" = 'Namespace'

Production Stage

Old Query

WITH "cte_table_for_production" AS
  (SELECT "issues"."project_id" AS project_id,
          "projects"."path" AS project_path,
          "routes"."path" AS namespace_path,
          COALESCE("merge_request_metrics"."first_deployed_to_production_at") - COALESCE("issues"."created_at") AS production
   FROM "merge_requests_closing_issues"
   INNER JOIN "issues" ON "issues"."id" = "merge_requests_closing_issues"."issue_id"
   INNER JOIN "issue_metrics" ON "issues"."id" = "issue_metrics"."issue_id"
   INNER JOIN "projects" ON "issues"."project_id" = "projects"."id"
   INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
   LEFT OUTER JOIN "merge_requests" ON "merge_requests"."id" = "merge_requests_closing_issues"."merge_request_id"
   INNER JOIN "merge_request_metrics" ON "merge_requests"."id" = "merge_request_metrics"."merge_request_id"
   WHERE "issues"."project_id" IN ($PROJECT_ID)
     AND "routes"."source_type" = 'Namespace'
     AND "issues"."created_at" >= '2019-07-08 10:37:29.439859'
     AND "merge_request_metrics"."first_deployed_to_production_at" >= '2019-07-08 10:37:29.439859'),
     "ordered_records" AS
  (SELECT "cte_table_for_production"."production" AS production,
          row_number() OVER (
                             ORDER BY "cte_table_for_production"."production") AS row_id,

     (SELECT COUNT(1)
      FROM "cte_table_for_production"
      WHERE "cte_table_for_production"."production" >= CAST('0' AS INTERVAL)) ct
   FROM "cte_table_for_production"
   WHERE "cte_table_for_production"."production" >= CAST('0' AS INTERVAL))
SELECT AVG(EXTRACT(EPOCH
                   FROM "ordered_records"."production")) AS median
FROM "ordered_records"
WHERE "ordered_records"."row_id" BETWEEN "ordered_records"."ct" / 2.0 AND ("ordered_records"."ct" / 2.0 + 1)

New Query

SELECT EXTRACT(EPOCH
               FROM percentile_cont(0.5) WITHIN GROUP(
                                                      ORDER BY "merge_request_metrics"."first_deployed_to_production_at" - "merge_request_metrics"."merged_at")) AS median
FROM "merge_requests"
INNER JOIN "projects" ON "merge_requests"."target_project_id" = "projects"."id"
INNER JOIN "routes" ON "projects"."namespace_id" = "routes"."source_id"
INNER JOIN "merge_request_metrics" ON "merge_requests"."id" = "merge_request_metrics"."merge_request_id"
WHERE "merge_requests"."target_project_id" = $PROJECT_ID
  AND "merge_requests"."created_at" >= '2019-07-08 10:49:57.035062'
  AND "merge_request_metrics"."first_deployed_to_production_at" - "merge_request_metrics"."merged_at" > CAST('0' AS INTERVAL)
  AND "routes"."source_type" = 'Namespace'

Old Implementation

This MR won't break old functionality. For the current Cycle Analytics page, the old code will be used and later on when the new frontend is ready, the endpoints will be switched to the new implementation.

What does this MR do?

Does this MR meet the acceptance criteria?

Conformity

Performance and testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Adam Hegyi

Merge request reports