Performance Insights -Query Review- :: Week 36 // Query 2
Query
WITH RECURSIVE base_and_descendants AS (
SELECT
namespaces.*
FROM
namespaces
WHERE
namespaces.type IN ('Group')
AND (EXISTS (
SELECT
1
FROM
plans
INNER JOIN gitlab_subscriptions ON gitlab_subscriptions.hosted_plan_id = plans.id
WHERE
plans.name IN('silver', 'gold')
AND (gitlab_subscriptions.namespace_id = namespaces.id))
)
UNION
SELECT
namespaces.*
FROM
namespaces,
base_and_descendants
WHERE
namespaces.type IN ('Group')
AND namespaces.parent_id = base_and_descendants.id
)
SELECT
namespaces.*
FROM
base_and_descendants AS namespaces
WHERE
namespaces.id = 5995481
LIMIT 1
Plan:
Limit (cost=7828857.54..7828862.04 rows=1 width=2899) (actual time=7181.675..7181.675 rows=0 loops=1)
CTE base_and_descendants
-> Recursive Union (cost=98204.96..7828857.54 rows=35158758 width=314) (actual time=25.379..7157.624 rows=18799 loops=1)
-> Hash Semi Join (cost=98204.96..199628.77 rows=273288 width=314) (actual time=25.345..2099.840 rows=2014 loops=1)
Hash Cond: (namespaces_1.id = gitlab_subscriptions.namespace_id)
-> Index Scan using index_namespaces_on_type on namespaces namespaces_1 (cost=0.43..95701.85 rows=1021738 width=314) (actual time=0.050..1796.117 rows=1023274 loops=1)
Index Cond: ((type)::text = 'Group'::text)
-> Hash (cost=80184.27..80184.27 rows=1441621 width=4) (actual time=7.962..7.962 rows=3304 loops=1)
Buckets: 2097152 Batches: 1 Memory Usage: 16501kB
-> Nested Loop (cost=0.43..80184.27 rows=1441621 width=4) (actual time=0.067..6.385 rows=3305 loops=1)
-> Seq Scan on plans (cost=0.00..4.06 rows=2 width=4) (actual time=0.023..0.028 rows=2 loops=1)
Filter: ((name)::text = ANY ('{silver,gold}'::text[]))
Rows Removed by Filter: 3
-> Index Scan using index_gitlab_subscriptions_on_hosted_plan_id on gitlab_subscriptions (cost=0.43..31080.00 rows=901010 width=8) (actual time=0.029..2.611 rows=1652 loops=2)
Index Cond: (hosted_plan_id = plans.id)
-> Merge Join (cost=346830.11..692605.36 rows=3488547 width=314) (actual time=273.911..715.197 rows=2399 loops=7)
Merge Cond: (namespaces_2.parent_id = base_and_descendants.id)
-> Index Scan using index_namespaces_on_parent_id_and_id on namespaces namespaces_2 (cost=0.43..290893.14 rows=1021738 width=314) (actual time=0.015..692.531 rows=263012 loops=7)
Filter: ((type)::text = 'Group'::text)
Rows Removed by Filter: 309
-> Sort (cost=346829.67..353661.87 rows=2732880 width=4) (actual time=1.339..2.315 rows=4675 loops=7)
Sort Key: base_and_descendants.id
Sort Method: quicksort Memory: 25kB
-> WorkTable Scan on base_and_descendants (cost=0.00..54657.60 rows=2732880 width=4) (actual time=0.002..0.659 rows=2686 loops=7)
-> CTE Scan on base_and_descendants namespaces (cost=0.00..791072.05 rows=175794 width=2899) (actual time=7181.673..7181.673 rows=0 loops=1)
Filter: (id = 5995481)
Rows Removed by Filter: 18799
Planning time: 1.309 ms
Execution time: 7302.735 ms
(29 rows)
Statistics
Observation range | [2019-09-04 05:53 - 2019-09-05 06:25 ] |
Frequency (calls per minute) | 4.1 |
Average execution time (ms) | 5435 |
Total database time during range (s) | ~32947 |
Analysis
One issue with this plan, is the fact that the filter by namespaces.id
applies after the CTE construction. We should filter in earlier stages, at CTE definition:
WITH RECURSIVE base_and_descendants AS (
SELECT
namespaces.*
FROM
namespaces
WHERE
namespaces.type IN ('Group')
AND (EXISTS (
SELECT
1
FROM
plans
INNER JOIN gitlab_subscriptions ON gitlab_subscriptions.hosted_plan_id = plans.id
WHERE
plans.name IN('silver', 'gold')
AND (gitlab_subscriptions.namespace_id = namespaces.id))
AND namespaces.namespace_id=5995481
)
UNION
SELECT
namespaces.*
FROM
namespaces,
base_and_descendants
WHERE
namespaces.type IN ('Group')
AND namespaces.parent_id = base_and_descendants.id
)
SELECT
namespaces.*
FROM
base_and_descendants AS namespaces
WHERE
namespaces.id = 5995481
LIMIT 1```
### Code
<!--
Where is this query used? Link to code base. This can be added later upon investigation, too.
-->
cc @gl-database @glopezfernandez