Geo SQL query for counting the event log is very slow
Geo uses the following query:
SELECT COUNT(*) FROM "geo_event_log"
This query on average takes 4.1 seconds to execute, with a minimum of 2.7 seconds, and a maximum of 5.2 seconds. Running this query manually results in it timing out, unless you increase the statement timeout to at least 20 seconds.
The Grafana data for this query can be found here: https://performance.gitlab.net/dashboard/db/postgres-single-query-drill-down?orgId=1&var-environment=prd&var-queryid=4007430182&var-fqdn=All
The query plan for this query is as follows:
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=719843.30..719843.31 rows=1 width=8) (actual time=20055.506..20055.506 rows=1 loops=1)
-> Seq Scan on geo_event_log (cost=0.00..629603.24 rows=36096024 width=0) (actual time=0.011..10652.376 rows=36018221 loops=1)
Planning time: 0.076 ms
Execution time: 20055.588 ms
This query itself can't be optimised, instead it has to be replaced with one of the following:
- An approximate count using PostgreSQL's tuple statistics
- An incremental counter that is stored and maintained using a separate table
We can't use Redis because then we still have to run the above query to populate the counter with an initial value.