push_rules table and sequential scan for single 'is_sample = true' row
A new slow query has come up:
SELECT "push_rules".* FROM "push_rules" WHERE "push_rules"."is_sample" = 't' LIMIT 1;
This query ends up returning only 1 row as there's only one row where 'is_sample' is true today:
=> select is_sample,count(*) from push_rules group by is_sample
-> ;
is_sample | count
-----------+---------
f | 2754943
t | 1
(2 rows)
However, there's no index on is_sample, resulting in a (typically lengthy) sequential scan:
=> explain (buffers,analyze) SELECT "push_rules".* FROM "push_rules" WHERE "push_rules"."is_sample" = 't' LIMIT 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..57354.57 rows=1 width=1073) (actual time=365.042..365.043 rows=1 loops=1)
Buffers: shared hit=27252
-> Seq Scan on push_rules (cost=0.00..57354.57 rows=1 width=1073) (actual time=365.040..365.040 rows=1 loops=1)
Filter: is_sample
Rows Removed by Filter: 2700066
Buffers: shared hit=27252
Planning time: 0.051 ms
Execution time: 365.091 ms
(8 rows)
This will vary depending on exactly where the scan is started from, of course, but the min time today for this query was 300ms, with the average being 386ms, implying that the table has to be scanned nearly completely every time (as shown in the above explain (analyze, buffers), whereby 27252 pages, or 212MB out of 234MB, had to be scanned).
This is now the top table in the sequential-reads report on Grafana:
http://monitor.gitlab.net/dashboard/db/postgres-queries?refresh=1m&orgId=1&from=now-24h&to=now
Worse, this query is very likely to just get worse and worse over time as this table grows.
At a minimum, an index should be created across the 'is_sample' column. Ideally, a partial index would be created, eg:
CREATE INDEX ON push_rules (is_sample) WHERE is_sample;
This would result in a very small index which would be able to answer this query very quickly.
Other alternatives are to cache this one "sample" row somewhere in the system, or perhaps move it into a dedicated table, but the simplest from the database perspective would be the partial index.