Optimize todos counters in usage data
What does this MR do?
Optimizes the todo
counter query for batch counting in usage data
Issue #208891 (closed)
Query
explain SELECT COUNT(DISTINCT "todos"."author_id") FROM "todos" WHERE "todos"."created_at" BETWEEN '2020-02-07 06:22:09.650617' AND '2020-03-06 06:22:09.650886' AND "todos"."author_id" BETWEEN 610000 AND 612000
Optimization
CREATE INDEX index_todos_on_author_id_and_created_at ON public.todos USING btree (author_id, created_at);
# The query has been executed. Duration: 6.503 min (edited)
https://explain.depesz.com/s/CCpc
Before: Bad!https://explain.depesz.com/s/QwPV
After: Good:Timing
After the index for batch counting takes 55 seconds pessimistic
- 5.5 million users,
- with 10K batch sizes
- 5.5M/10K = 550 loops
- Time: < 100 ms ( cold cache )
Migration output
$ VERBOSE=true bundle exec rake db:migrate:up VERSION=20200306170531
== 20200306170531 AddIndexOnAuthorIdAndCreatedAtToTodos: migrating ============
-- transaction_open?()
-> 0.0000s
-- index_exists?(:todos, [:author_id, :created_at], {:algorithm=>:concurrently})
-> 0.0037s
-- execute("SET statement_timeout TO 0")
-> 0.0003s
-- add_index(:todos, [:author_id, :created_at], {:algorithm=>:concurrently})
-> 0.0196s
-- execute("RESET ALL")
-> 0.0012s
== 20200306170531 AddIndexOnAuthorIdAndCreatedAtToTodos: migrated (0.0251s) ===
$ VERBOSE=true bundle exec rake db:migrate:down VERSION=20200306170531
== 20200306170531 AddIndexOnAuthorIdAndCreatedAtToTodos: reverting ============
-- transaction_open?()
-> 0.0000s
-- index_exists?(:todos, [:author_id, :created_at], {:algorithm=>:concurrently})
-> 0.0038s
-- execute("SET statement_timeout TO 0")
-> 0.0003s
-- remove_index(:todos, {:algorithm=>:concurrently, :column=>[:author_id, :created_at]})
-> 0.0259s
-- execute("RESET ALL")
-> 0.0012s
== 20200306170531 AddIndexOnAuthorIdAndCreatedAtToTodos: reverted (0.0314s) ===
Edited by Alper Akgun