Skip to content

Finding repository by path is slower than expected

Context

On the application side, we count executions and measure the latency of all database queries. Each query is identified by a unique name. These results are then available as Prometheus metrics. For this particular query, the name is repository_find_by_path and the SQL is the following:

SELECT
    id,
    top_level_namespace_id,
    name,
    path,
    parent_id,
    created_at,
    updated_at
 FROM
    repositories
 WHERE
    path = $1;

The repositories table is not partitioned and path has a unique index on it with a btree index (structure.sql - it takes a fair amount of time to take you to the right line in the file).

Using a read-only replica of the production database I could extract this query plan:

                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using unique_repositories_path on repositories  (cost=0.27..3.29 rows=1 width=111) (actual time=0.023..0.024 rows=1 loops=1)
   Index Cond: (path = '<truncated>'::text)
   Buffers: shared hit=3
 Planning Time: 0.078 ms
 Execution Time: 0.043 ms
(5 rows)

This confirms that the query is expected to be very fast.

Problem

I've been monitoring the performance of the database queries during the percentage-based rollout and I've been puzzled about this one.

Looking at the last 24h, this is currently the worst-performing query on the p90, with an average of ~40ms and a max of ~170ms, which is by far the most significant deviation between average and max among all queries (dashboard):

image

I couldn't find a justification for this so far.

Edited by João Pereira