Finding repository by path is slower than expected
Context
On the application side, we count executions and measure the latency of all database queries. Each query is identified by a unique name. These results are then available as Prometheus metrics. For this particular query, the name is repository_find_by_path
and the SQL is the following:
SELECT
id,
top_level_namespace_id,
name,
path,
parent_id,
created_at,
updated_at
FROM
repositories
WHERE
path = $1;
The repositories
table is not partitioned and path
has a unique index on it with a btree
index (structure.sql - it takes a fair amount of time to take you to the right line in the file).
Using a read-only replica of the production database I could extract this query plan:
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Index Scan using unique_repositories_path on repositories (cost=0.27..3.29 rows=1 width=111) (actual time=0.023..0.024 rows=1 loops=1)
Index Cond: (path = '<truncated>'::text)
Buffers: shared hit=3
Planning Time: 0.078 ms
Execution Time: 0.043 ms
(5 rows)
This confirms that the query is expected to be very fast.
Problem
I've been monitoring the performance of the database queries during the percentage-based rollout and I've been puzzled about this one.
Looking at the last 24h, this is currently the worst-performing query on the p90, with an average of ~40ms and a max of ~170ms, which is by far the most significant deviation between average and max among all queries (dashboard):
I couldn't find a justification for this so far.