Skip to content

Optimize query to improve performance of contribution analytics

What does this MR do and why?

This MR addresses #362724 (closed), in particular it tweaks the existing query and adds index following @ahegyi proposal, as we believe it's the best we can do to better the query performance.

For Database review

db:migrate
main: == 20220630091409 AddIndexOnEventsForContributionAnalyticsOptimization: migrating
main: -- transaction_open?()
main:    -> 0.0000s
main: -- index_exists?(:events, [:project_id, :target_type, :action, :created_at, :author_id, :id], {:name=>"index_on_events_to_improve_contribution_analytics_performance", :algorithm=>:concurrently})
main:    -> 0.0116s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0003s
main: -- add_index(:events, [:project_id, :target_type, :action, :created_at, :author_id, :id], {:name=>"index_on_events_to_improve_contribution_analytics_performance", :algorithm=>:concurrently})
main:    -> 0.0094s
main: -- execute("RESET statement_timeout")
main:    -> 0.0003s
main: == 20220630091409 AddIndexOnEventsForContributionAnalyticsOptimization: migrated (0.0321s)

ci: == 20220630091409 AddIndexOnEventsForContributionAnalyticsOptimization: migrating
ci: -- transaction_open?()
ci:    -> 0.0000s
ci: -- index_exists?(:events, [:project_id, :target_type, :action, :created_at, :author_id, :id], {:name=>"index_on_events_to_improve_contribution_analytics_performance", :algorithm=>:concurrently})
ci:    -> 0.0095s
ci: -- execute("SET statement_timeout TO 0")
ci:    -> 0.0004s
ci: -- add_index(:events, [:project_id, :target_type, :action, :created_at, :author_id, :id], {:name=>"index_on_events_to_improve_contribution_analytics_performance", :algorithm=>:concurrently})
ci:    -> 0.0096s
ci: -- execute("RESET statement_timeout")
ci:    -> 0.0004s
ci: == 20220630091409 AddIndexOnEventsForContributionAnalyticsOptimization: migrated (0.0251s)
db:rollback:main
main: == 20220630091409 AddIndexOnEventsForContributionAnalyticsOptimization: reverting
main: -- transaction_open?()
main:    -> 0.0000s
main: -- indexes(:events)
main:    -> 0.0147s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0004s
main: -- remove_index(:events, {:algorithm=>:concurrently, :name=>"index_on_events_to_improve_contribution_analytics_performance"})
main:    -> 0.0037s
main: -- execute("RESET statement_timeout")
main:    -> 0.0004s
main: == 20220630091409 AddIndexOnEventsForContributionAnalyticsOptimization: reverted (0.0282s)
db:rollback:ci
ci: == 20220630091409 AddIndexOnEventsForContributionAnalyticsOptimization: reverting
ci: -- transaction_open?()
ci:    -> 0.0000s
ci: -- indexes(:events)
ci:    -> 0.0130s
ci: -- execute("SET statement_timeout TO 0")
ci:    -> 0.0003s
ci: -- remove_index(:events, {:algorithm=>:concurrently, :name=>"index_on_events_to_improve_contribution_analytics_performance"})
ci:    -> 0.0053s
ci: -- execute("RESET statement_timeout")
ci:    -> 0.0003s
ci: == 20220630091409 AddIndexOnEventsForContributionAnalyticsOptimization: reverted (0.0270s)
Previous version of changed query
SELECT "events".*
FROM
  "events"
  INNER JOIN "projects" ON "projects"."id" = "events"."project_id"
  INNER JOIN routes rs ON rs.source_id = projects.id AND rs.source_type = 'Project'
WHERE
  ("events"."action" = 5 OR "events"."target_type" IN ('MergeRequest', 'Issue') AND "events"."action" IN (1, 3, 7, 12))
  AND "events"."created_at" >= '2022-06-23'
  AND (rs.path LIKE 'gitlab-org/%');

Query plan: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10821/commands/38931

Index was executed in #database-lab. Duration: 298.032 min
New version of changed query
WITH "project_ids" AS MATERIALIZED (
  SELECT source_id AS id FROM "routes" WHERE "routes"."source_type" = 'Project' AND "routes"."path" LIKE 'gitlab-org/%'
)
SELECT "events".*
FROM (
  (SELECT "events".* FROM "events" WHERE (project_id IN (SELECT id FROM project_ids)) AND "events"."created_at" >= '2022-06-23' AND "events"."action" = 5 AND "events"."target_type" IS NULL)
  UNION ALL
  (SELECT "events".* FROM "events" WHERE (project_id IN (SELECT id FROM project_ids)) AND "events"."created_at" >= '2022-06-23' AND "events"."action" IN (1, 3, 7, 12) AND "events"."target_type" IN ('MergeRequest', 'Issue'))
) events;

Query plan: https://console.postgres.ai/gitlab/gitlab-production-tunnel-pg12/sessions/10821/commands/38954

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #362724 (closed)

Edited by Adam Hegyi

Merge request reports