Performance Insigths -Query Review- :: Week 35 // Query 1 - create partial index

SELECT namespaces.* 
FROM   namespaces 
WHERE  namespaces.type IN ( 'Group' ) 
       AND namespaces.visibility_level = 20 
ORDER  BY namespaces.name ASC, 
          namespaces.id ASC 
LIMIT  101 offset 143800;

Current plan:

                                                                            QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=173041.31..173041.57 rows=101 width=314) (actual time=1984.124..1984.170 rows=101 loops=1)
   ->  Sort  (cost=172681.81..174798.77 rows=846784 width=314) (actual time=1926.213..1978.310 rows=143901 loops=1)
         Sort Key: name, id
         Sort Method: quicksort  Memory: 48242kB
         ->  Index Scan using index_namespaces_on_type on namespaces  (cost=0.43..95900.87 rows=846784 width=314) (actual time=0.107..1387.788 rows=157932 loops=1)
               Index Cond: ((type)::text = 'Group'::text)
               Filter: (visibility_level = 20)
               Rows Removed by Filter: 854911
 Planning time: 0.233 ms
 Execution time: 1988.589 ms
(10 rows)

This query can be improved by creating a new partial index:

The namespaces.type column have ~81% of NULL values (look at null_frac column):

gitlabhq_production=# select * from pg_stats where tablename = 'namespaces' and attname='type';  
-[ RECORD 1 ]----------+-----------
schemaname             | public
tablename              | namespaces
attname                | type
inherited              | f
null_frac              | 0.81121
avg_width              | 6
n_distinct             | 1
most_common_vals       | {Group}
most_common_freqs      | {0.18879}
histogram_bounds       | 
correlation            | 1
most_common_elems      | 
most_common_elem_freqs | 
elem_count_histogram   |

Here, a partial index over the type column without NULL values should be preferred. Something like CREATE INDEX CONCURRENTLY index_namespaces_partial_on_type ON namespaces(type) WHERE type is not NULL

Edited Sep 09, 2019 by Craig Gomes