Skip to content

Resolve DB Query Timeout on Environment page

Bala Kumar requested to merge database-query-timeout-fix-environment-page into master

What does this MR do and why?

Fixes timeout happening from the below query

SELECT
    DISTINCT ON (environment_id) deployments.*
FROM
    "deployments"
WHERE
    "deployments"."status" = 2
    AND "deployments"."environment_id" IN (137, 8098355)
ORDER BY
    environment_id,
    deployments.id DESC

which is executed when last_deployment is preloaded for EnvironmentSerializer.

The current query takes 2.783 min and more details are shared in #345672 (closed)

We also tried the following indexes and they were not helpful:

exec CREATE INDEX index_environment_id_deployment_id_status ON deployments USING btree (environment_id, id, status);

exec CREATE INDEX index_status_environment_id_deployment_id ON deployments USING btree (status, environment_id, id);

See https://gitlab.slack.com/archives/CLJMDRD8C/p1638535499048600 and https://gitlab.slack.com/archives/CLJMDRD8C/p1638615920056900 and the query was not using these index.

So we arrived at using a preload with Union strategy as discussed in and this makes the query faster.

The new query with UNION looks like

SELECT
    "deployments".*
FROM
    (
        (
            SELECT
                DISTINCT ON (environment_id) deployments.*
            FROM
                "deployments"
            WHERE
                "deployments"."environment_id" = 137
                AND "deployments"."status" = 2
            ORDER BY
                environment_id,
                deployments.id DESC
            LIMIT
                1
        )
        UNION
            (
                SELECT
                    DISTINCT ON (environment_id) deployments.*
                FROM
                    "deployments"
                WHERE
                    "deployments"."environment_id" = 8098355
                    AND "deployments"."status" = 2
                ORDER BY
                    environment_id,
                    deployments.id DESC
                LIMIT
                    1
            )
    ) deployments

and finishes in 10.801 ms.

Query with UNION execution plan in cold cache
Unique  (cost=4.88..4.97 rows=2 width=188) (actual time=5.487..5.502 rows=2 loops=1)
  Buffers: shared hit=27 read=9 dirtied=1
  I/O Timings: read=5.148 write=0.000
  ->  Sort  (cost=4.88..4.89 rows=2 width=188) (actual time=5.487..5.491 rows=2 loops=1)
        Sort Key: deployments.id, deployments.iid, deployments.project_id, deployments.environment_id, deployments.ref, deployments.tag, deployments.sha, deployments.user_id, deployments.deployable_type, deployments.created_at, deployments.updated_at, deployments.on_stop, deployments.finished_at, deployments.status, deployments.cluster_id, deployments.deployable_id, deployments.archived
        Sort Method: quicksort  Memory: 25kB
        Buffers: shared hit=27 read=9 dirtied=1
        I/O Timings: read=5.148 write=0.000
        ->  Append  (cost=0.57..4.87 rows=2 width=188) (actual time=3.652..5.412 rows=2 loops=1)
              Buffers: shared hit=4 read=9 dirtied=1
              I/O Timings: read=5.148 write=0.000
              ->  Limit  (cost=0.57..2.87 rows=1 width=145) (actual time=3.651..3.653 rows=1 loops=1)
                    Buffers: shared hit=3 read=5 dirtied=1
                    I/O Timings: read=3.431 write=0.000
                    ->  Unique  (cost=0.57..329742.26 rows=143321 width=145) (actual time=3.648..3.650 rows=1 loops=1)
                          Buffers: shared hit=3 read=5 dirtied=1
                          I/O Timings: read=3.431 write=0.000
                          ->  Index Scan using index_deployments_on_environment_id_status_and_id on public.deployments  (cost=0.57..329742.26 rows=241641 width=145) (actual time=3.645..3.645 rows=1 loops=1)
                                Index Cond: ((deployments.environment_id = 137) AND (deployments.status = 2))
                                Buffers: shared hit=3 read=5 dirtied=1
                                I/O Timings: read=3.431 write=0.000
              ->  Limit  (cost=0.57..1.97 rows=1 width=145) (actual time=1.754..1.755 rows=1 loops=1)
                    Buffers: shared hit=1 read=4
                    I/O Timings: read=1.716 write=0.000
                    ->  Unique  (cost=0.57..468.36 rows=334 width=145) (actual time=1.752..1.753 rows=1 loops=1)
                          Buffers: shared hit=1 read=4
                          I/O Timings: read=1.716 write=0.000
                          ->  Index Scan using index_deployments_on_environment_id_status_and_id on public.deployments deployments_1  (cost=0.57..468.36 rows=334 width=145) (actual time=1.748..1.748 rows=1 loops=1)
                                Index Cond: ((deployments_1.environment_id = 8098355) AND (deployments_1.status = 2))
                                Buffers: shared hit=1 read=4
                                I/O Timings: read=1.716 write=0.000

How to set up and validate locally

  • In Rails console enable the feature Feature.enable(:custom_preloader_for_deployments)
  • Visit the environments page for a project in local. Eg. http://localhost:3000/root/rails-template/-/environments
  • We should see the queries with UNION being preloaded in the environment serializer when viewed with the performance bar as referenced in this screenshot.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #345672 (closed)

Edited by Bala Kumar

Merge request reports