Skip to content

Improve query performance of merged_at filter

Adam Hegyi requested to merge use-target-project-id-in-mr-metrics into master

What does this MR do?

This MR significantly improve the query performance of productivity analytics and throughput analytics features. Previously we added target_project_id to merge_request_metrics (denormalization) and the BG migration (!37713 (merged)) that backfills the data is close to finish.

The change is happening in the MergeRequestsFinder: If we join metrics, add an extra condition: merge_requests.target_project_id = merge_request_metrics.target_project_id

Migration (index)

exec create index i1 on merge_request_metrics (target_project_id, merged_at)
Session: webui-i769
The query has been executed. Duration: 2.697 min

Up:

== 20200812112204 AddIndexToMrMetricsTargetProjectId: migrating ===============
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:merge_request_metrics, [:target_project_id, :merged_at], {:name=>"index_merge_request_metrics_on_target_project_id_merged_at", :algorithm=>:concurrently})
   -> 0.0054s
-- add_index(:merge_request_metrics, [:target_project_id, :merged_at], {:name=>"index_merge_request_metrics_on_target_project_id_merged_at", :algorithm=>:concurrently})
   -> 0.0178s
== 20200812112204 AddIndexToMrMetricsTargetProjectId: migrated (0.0238s) ======

Down:

== 20200812112204 AddIndexToMrMetricsTargetProjectId: reverting ===============
-- transaction_open?()
   -> 0.0000s
-- indexes(:merge_request_metrics)
   -> 0.0081s
-- remove_index(:merge_request_metrics, {:algorithm=>:concurrently, :name=>"index_merge_request_metrics_on_target_project_id_merged_at"})
   -> 0.0021s
== 20200812112204 AddIndexToMrMetricsTargetProjectId: reverted (0.0107s) ======

Queries

MR (throughput) analytics

Before:

SELECT COUNT(*) 
FROM "merge_requests" 
INNER JOIN "merge_request_metrics" ON "merge_request_metrics"."merge_request_id" = "merge_requests"."id" 
WHERE "merge_requests"."target_project_id" = 278964 
AND "merge_request_metrics"."merged_at" >= '2019-06-30 23:00:00' 
AND "merge_request_metrics"."merged_at" <= '2019-07-30 23:00:00'

Plan

After:

SELECT COUNT(*) 
FROM "merge_requests" 
INNER JOIN "merge_request_metrics" ON "merge_request_metrics"."merge_request_id" = "merge_requests"."id" 
WHERE "merge_requests"."target_project_id" = 278964 
AND merge_request_metrics.target_project_id = merge_requests.target_project_id
AND "merge_request_metrics"."merged_at" >= '2019-06-30 23:00:00' 
AND "merge_request_metrics"."merged_at" <= '2019-07-30 23:00:00'

Plan

Productivity Analytics (group level)

Before:


SELECT DATE_PART('day', merge_request_metrics.first_comment_at - merge_requests.created_at)*24+DATE_PART('hour', merge_request_metrics.first_comment_at - merge_requests.created_at) AS metric,
       count(*) AS mr_count
FROM "merge_request_metrics"
INNER JOIN "merge_requests" ON "merge_requests"."id" = "merge_request_metrics"."merge_request_id"
WHERE "merge_request_metrics"."merge_request_id" IN
    (SELECT "merge_requests"."id"
     FROM "merge_requests"
     INNER JOIN "projects" ON "projects"."id" = "merge_requests"."target_project_id"
     INNER JOIN "merge_request_metrics" ON "merge_request_metrics"."merge_request_id" = "merge_requests"."id"
     LEFT JOIN project_features ON projects.id = project_features.project_id
     WHERE "projects"."namespace_id" IN
         (WITH RECURSIVE "base_and_descendants" AS (
                                                      (SELECT "namespaces".*
                                                       FROM "namespaces"
                                                       WHERE "namespaces"."type" = 'Group'
                                                         AND "namespaces"."id" = 9970)
                                                    UNION
                                                      (SELECT "namespaces".*
                                                       FROM "namespaces",
                                                            "base_and_descendants"
                                                       WHERE "namespaces"."type" = 'Group'
                                                         AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "namespaces"."id"
          FROM "base_and_descendants" AS "namespaces")
       AND (EXISTS
              (SELECT 1
               FROM "project_authorizations"
               WHERE "project_authorizations"."user_id" = 2
                 AND (project_authorizations.project_id = projects.id))
            OR projects.visibility_level IN (0,
                                             10,
                                             20))
       AND ("project_features"."merge_requests_access_level" > 0
            OR "project_features"."merge_requests_access_level" IS NULL)
       AND ("merge_requests"."state_id" IN (3))
       AND "merge_request_metrics"."merged_at" >= '2020-07-17 14:40:20.017161'
       AND "merge_request_metrics"."merged_at" <= '2020-08-13 23:59:59'
     ORDER BY "merge_requests"."updated_at" DESC, "merge_requests"."id" DESC)
GROUP BY DATE_PART('day', merge_request_metrics.first_comment_at - merge_requests.created_at)*24+DATE_PART('hour', merge_request_metrics.first_comment_at - merge_requests.created_at)

Plan

Note: the uncached query times out on PRD.

After:


SELECT DATE_PART('day', merge_request_metrics.first_comment_at - merge_requests.created_at)*24+DATE_PART('hour', merge_request_metrics.first_comment_at - merge_requests.created_at) AS metric,
       count(*) AS mr_count
FROM "merge_request_metrics"
INNER JOIN "merge_requests" ON "merge_requests"."id" = "merge_request_metrics"."merge_request_id"
WHERE "merge_request_metrics"."merge_request_id" IN
    (SELECT "merge_requests"."id"
     FROM "merge_requests"
     INNER JOIN "projects" ON "projects"."id" = "merge_requests"."target_project_id"
     INNER JOIN "merge_request_metrics" ON "merge_request_metrics"."merge_request_id" = "merge_requests"."id"
     LEFT JOIN project_features ON projects.id = project_features.project_id
     WHERE "projects"."namespace_id" IN
         (WITH RECURSIVE "base_and_descendants" AS (
                                                      (SELECT "namespaces".*
                                                       FROM "namespaces"
                                                       WHERE "namespaces"."type" = 'Group'
                                                         AND "namespaces"."id" = 9970)
                                                    UNION
                                                      (SELECT "namespaces".*
                                                       FROM "namespaces",
                                                            "base_and_descendants"
                                                       WHERE "namespaces"."type" = 'Group'
                                                         AND "namespaces"."parent_id" = "base_and_descendants"."id")) SELECT "namespaces"."id"
          FROM "base_and_descendants" AS "namespaces")
       AND (EXISTS
              (SELECT 1
               FROM "project_authorizations"
               WHERE "project_authorizations"."user_id" = 2
                 AND (project_authorizations.project_id = projects.id))
            OR projects.visibility_level IN (0,
                                             10,
                                             20))
       AND ("project_features"."merge_requests_access_level" > 0
            OR "project_features"."merge_requests_access_level" IS NULL)
       AND ("merge_requests"."state_id" IN (3))
       AND "merge_request_metrics"."merged_at" >= '2020-07-17 14:40:20.017161'
       AND "merge_request_metrics"."merged_at" <= '2020-08-13 23:59:59'
       AND "merge_requests"."target_project_id" = "merge_request_metrics"."target_project_id"
     ORDER BY "merge_requests"."updated_at" DESC, "merge_requests"."id" DESC)
GROUP BY DATE_PART('day', merge_request_metrics.first_comment_at - merge_requests.created_at)*24+DATE_PART('hour', merge_request_metrics.first_comment_at - merge_requests.created_at)

Plan

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Adam Hegyi

Merge request reports