Skip to content

Add clusters index to improve usage data queries

Tiger Watson requested to merge add-missing-cluster-usage-data-index into master

What does this MR do?

Adds a missing index to clusters to improve performance of usage ping queries:

Example query from https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/ee/gitlab/usage_data.rb#L211-214:

SELECT 
  COUNT(DISTINCT "clusters"."user_id") 
FROM 
  "clusters" 
WHERE 
  "clusters"."cluster_type" = 3 
  AND "clusters"."enabled" = TRUE 
  AND "clusters"."created_at" BETWEEN '2020-03-20 04:23:25.724123' AND '2020-04-20 04:23:25.724204'
  AND "clusters"."id" BETWEEN 40000 AND 49999;

Before:

 Aggregate  (cost=1390.10..1390.11 rows=1 width=8) (actual time=1.301..1.302 rows=1 loops=1)
   Buffers: shared hit=245
   ->  Index Scan using clusters_pkey on public.clusters  (cost=0.29..1389.71 rows=157 width=4) (actual time=1.285..1.285 rows=0 loops=1)
         Index Cond: ((clusters.id >= 40000) AND (clusters.id <= 49999))
         Filter: (clusters.enabled AND (clusters.created_at >= '2020-03-20 04:23:25.724123+00'::timestamp with time zone) AND (clusters.created_at <= '2020-04-20 04:23:25.724204+00'::timestamp with time zone) AND (clusters.cluster_type = 3))
         Rows Removed by Filter: 3097
         Buffers: shared hit=245

After:

 Aggregate  (cost=232.13..232.14 rows=1 width=8) (actual time=0.480..0.480 rows=1 loops=1)
   Buffers: shared hit=16 read=1
   I/O Timings: read=0.120
   ->  Index Scan using index_clusters_on_enabled_cluster_type_id_and_created_at on public.clusters  (cost=0.29..231.73 rows=157 width=4) (actual time=0.259..0.259 rows=0 loops=1)
         Index Cond: ((clusters.enabled = true) AND (clusters.cluster_type = 3) AND (clusters.id >= 40000) AND (clusters.id <= 49999) AND (clusters.created_at >= '2020-03-20 04:23:25.724123+00'::timestamp with time zone) AND (clusters.created_at <= '2020-04-20 04:23:25.724204+00'::timestamp with time zone))
         Filter: clusters.enabled
         Rows Removed by Filter: 0
         Buffers: shared hit=12

Example query from https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data.rb#L68-75

SELECT
  COUNT("clusters"."id")
FROM
  "clusters"
WHERE
  "clusters"."enabled" = TRUE
  AND "clusters"."cluster_type" = 3
  AND "clusters"."id" BETWEEN 7 AND 100006

Before:

 Aggregate  (cost=2805.97..2805.98 rows=1 width=8) (actual time=23.369..23.369 rows=1 loops=1)
   Buffers: shared hit=11701
   ->  Index Scan using clusters_pkey on public.clusters  (cost=0.29..2733.72 rows=28900 width=4) (actual time=0.012..20.038 rows=28844 loops=1)
         Index Cond: ((clusters.id >= 7) AND (clusters.id <= 100006))
         Filter: (clusters.enabled AND (clusters.cluster_type = 3))
         Rows Removed by Filter: 7031
         Buffers: shared hit=11701

After:

 Aggregate  (cost=1205.64..1205.65 rows=1 width=8) (actual time=12.243..12.243 rows=1 loops=1)
   Buffers: shared hit=2451 read=113
   I/O Timings: read=2.295
   ->  Index Only Scan using index_clusters_on_enabled_cluster_type_id_and_created_at on public.clusters  (cost=0.29..1133.37 rows=28908 width=4) (actual time=0.164..9.556 rows=28844 loops=1)
         Index Cond: ((clusters.enabled = true) AND (clusters.cluster_type = 3) AND (clusters.id >= 7) AND (clusters.id <= 100006))
         Heap Fetches: 3265
         Filter: clusters.enabled
         Rows Removed by Filter: 0
         Buffers: shared hit=2451 read=113
         I/O Timings: read=2.295

Migrate up:

== 20200402001106 AddClusterTypeIndexToClusters: migrating ====================
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:clusters, [:enabled, :cluster_type, :id, :created_at], {:name=>"index_clusters_on_enabled_cluster_type_id_and_created_at", :algorithm=>:concurrently})
   -> 0.0044s
-- execute("SET statement_timeout TO 0")
   -> 0.0002s
-- add_index(:clusters, [:enabled, :cluster_type, :id, :created_at], {:name=>"index_clusters_on_enabled_cluster_type_id_and_created_at", :algorithm=>:concurrently})
   -> 0.0066s
-- execute("RESET ALL")
   -> 0.0002s
== 20200402001106 AddClusterTypeIndexToClusters: migrated (0.0115s) ===========

Migrate down:

== 20200402001106 AddClusterTypeIndexToClusters: reverting ====================
-- transaction_open?()
   -> 0.0000s
-- indexes(:clusters)
   -> 0.0039s
-- execute("SET statement_timeout TO 0")
   -> 0.0002s
-- remove_index(:clusters, {:algorithm=>:concurrently, :name=>"index_clusters_on_enabled_cluster_type_id_and_created_at"})
   -> 0.0028s
-- execute("RESET ALL")
   -> 0.0002s
== 20200402001106 AddClusterTypeIndexToClusters: reverted (0.0073s) ===========

Screenshots

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team
Edited by 🤖 GitLab Bot 🤖

Merge request reports