Performance Insigths -Query Review- :: Week 35 // Query 3
SELECT users.* FROM users WHERE users.unconfirmed_email = ? ORDER BY users.id ASC LIMIT 1
Query plan:
gitlabhq_production=# explain analyze SELECT users.* FROM users WHERE users.unconfirmed_email = ? order by users.id ASC LIMIT 1;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.43..481827.85 rows=1 width=1086) (actual time=1366.567..1366.568 rows=1 loops=1)
-> Index Scan using users_pkey on users (cost=0.43..481827.85 rows=1 width=1086) (actual time=1366.567..1366.567 rows=1 loops=1)
Filter: ((unconfirmed_email)::text = 'xxxx'::text)
Rows Removed by Filter: 955018
Planning time: 0.352 ms
Execution time: 1366.619 ms
(6 rows)
The unconfirmed_email
column have NULL values 99% of the time:
-[ RECORD 1 ]----------------------------
schemaname | public
tablename | users
attname | unconfirmed_email
inherited | f
null_frac | 0.99643
avg_width | 22
n_distinct | -0.00357002
most_common_vals |
most_common_freqs |
A partial index could take advantage of this:
CREATE INDEX users_partial_on_unconfirmed_email on users(id) WHERE unconfirmed_email is not null;
Another way to make it perform better is to remove the ORDER BY users.id ASC
clause (probably not needed). That would make the planner to prefer using a seq scan. The main advantage is that the query will end when the first row that fullfills the filter condition:
gitlabhq_production=# explain analyze SELECT users.* FROM users WHERE users.unconfirmed_email = 'single-ci@singlemusic.co' LIMIT 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..1036467.60 rows=1 width=1088) (actual time=668.637..668.639 rows=1 loops=1)
-> Seq Scan on users (cost=0.00..1036467.60 rows=1 width=1088) (actual time=668.636..668.636 rows=1 loops=1)
Filter: ((unconfirmed_email)::text = 'xxxx'::text)
Rows Removed by Filter: 2594870
Planning time: 0.323 ms
Execution time: 668.692 ms
(6 rows)