Skip to content

Optimize ci_builds counters in usage data

Optimize ci_builds queries using an index

  • Note also that I avoid Filter: ((ci_builds.type)::text = 'Ci::Build'::text) a very specialized index.
  • I hope to keep the index a bit more general and see if it works for other counters in the main issue in 12.10

Main issue #208884 (closed)

Optimization

CREATE INDEX index_ci_builds_on_user_id_and_created_at_and_type_eq_ci_build ON public.ci_builds USING btree (user_id, created_at) WHERE ((type)::text = 'Ci::Build'::text);

# Timing The query has been executed. Duration: 81.894 min

Query 1

SELECT MIN("ci_builds"."user_id") FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND "ci_builds"."created_at" BETWEEN '2020-02-24 07:25:33.436925' AND '2020-03-23 07:25:33.436996'

Before: https://explain.depesz.com/s/YUOp

After: https://explain.depesz.com/s/kQtk

Query 2

SELECT MAX("ci_builds"."user_id") FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND "ci_builds"."created_at" BETWEEN '2020-02-24 07:25:33.436925' AND '2020-03-23 07:25:33.436996'

Before: https://explain.depesz.com/s/qHXF

After: https://explain.depesz.com/s/Lm5F

Query 3

SELECT COUNT(DISTINCT "ci_builds"."user_id") FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND "ci_builds"."created_at" BETWEEN '2020-02-24 07:25:33.436925' AND '2020-03-23 07:25:33.436996' AND "ci_builds"."user_id" BETWEEN 4409000 AND 4410250

# ((ci_builds.created_at >= '2020-02-24 07:25:33.436925'::timestamp without time zone) AND (ci_builds.created_at <= '2020-03-23 07:25:33.436996'::timestamp without time zone) AND ((ci_builds.type)::text = 'Ci::Build'::text))

Before: https://explain.depesz.com/s/jCyJ (4409000 AND 4410250) => Count result is 20 distinct users over 1 minute

After: https://explain.depesz.com/s/yALe < 10 ms on average

Other queries that helps

Non time bound version of the ci_build

SELECT MIN("ci_builds"."user_id") FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build'
SELECT MAX("ci_builds"."user_id") FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build'
SELECT COUNT(DISTINCT "ci_builds"."user_id") FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND "ci_builds"."user_id" BETWEEN 1 AND 1250

Count timing

  • 5.5 million users / 1250 => 4400 loops with < 100 msecs => 440 seconds max, and 44 seconds expected

Migration Output

$ VERBOSE=true bundle exec rake db:migrate:up VERSION=20200323122201
== 20200323122201 AddIndexOnUserAndCreatedAtToCiBuilds: migrating =============
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:ci_builds, [:user_id, :created_at], {:where=>"type = 'Ci::Build'", :name=>"index_ci_builds_on_user_id_and_created_at_and_type_eq_ci_build", :algorithm=>:concurrently})
   -> 0.0070s
-- execute("SET statement_timeout TO 0")
   -> 0.0001s
-- add_index(:ci_builds, [:user_id, :created_at], {:where=>"type = 'Ci::Build'", :name=>"index_ci_builds_on_user_id_and_created_at_and_type_eq_ci_build", :algorithm=>:concurrently})
   -> 0.0202s
-- execute("RESET ALL")
   -> 0.0005s
== 20200323122201 AddIndexOnUserAndCreatedAtToCiBuilds: migrated (0.0282s) ====

$ VERBOSE=true bundle exec rake db:migrate:down VERSION=20200323122201
== 20200323122201 AddIndexOnUserAndCreatedAtToCiBuilds: reverting =============
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:ci_builds, "index_ci_builds_on_user_id_and_created_at_and_type_eq_ci_build", {:algorithm=>:concurrently})
   -> 0.0081s
== 20200323122201 AddIndexOnUserAndCreatedAtToCiBuilds: reverted (0.0083s) ====
Edited by Alper Akgun

Merge request reports