Skip to content

Fix query timeout when finding auto-cancelable pipelines

Fabio Pitino requested to merge fix-timeout-auto-cancelable-pipelines into master

What does this MR do?

Related to #273733 (closed)

In the attached issue we found that the query for auto-canceling pipelines was sometimes timing out since we changed the method same_family_pipeline_ids to use a recursive query to find out all pipelines in the same family including child of child pipelines. Prior to that we didn't have a recursive query because a parent pipeline could only trigger child pipelines. Today we allow child pipelines to trigger their child pipelines, hence the reason for the recursive query.

The approach taken to try to solve this is to break down the query from same_family_pipeline_ids into 2 queries:

  1. find the root ancestor if self.child?, otherwise return self. The most common scenario would be that self is not a child pipeline.
  2. find the descendants from the root ancestor. This query will always run.

Feature flag

This change is done behind a feature flag :ci_root_ancestor_for_pipeline_family that is going to be removed in #287812 (closed)

Query plans for Ci::Pipeline#same_family_pipeline_ids

With Feature Flag disabled (before):

WITH RECURSIVE "base_and_descendants" AS ((WITH RECURSIVE "base_and_ancestors" AS ((SELECT "ci_pipelines".* FROM "ci_pipelines" WHERE "ci_pipelines"."id" = 208831043)
UNION
(SELECT "ci_pipelines".* FROM "ci_pipelines", "base_and_ancestors", "ci_sources_pipelines" WHERE "ci_sources_pipelines"."source_pipeline_id" = "ci_pipelines"."id" AND "ci_sources_pipelines"."pipeline_id" = "base_and_ancestors"."id" AND "ci_sources_pipelines"."source_project_id" = "ci_sources_pipelines"."project_id")) SELECT "ci_pipelines".* FROM "base_and_ancestors" AS "ci_pipelines")
UNION
(SELECT "ci_pipelines".* FROM "ci_pipelines", "base_and_descendants", "ci_sources_pipelines" WHERE "ci_sources_pipelines"."pipeline_id" = "ci_pipelines"."id" AND "ci_sources_pipelines"."source_pipeline_id" = "base_and_descendants"."id" AND "ci_sources_pipelines"."source_project_id" = "ci_sources_pipelines"."project_id")) SELECT "id" FROM "base_and_descendants" AS "ci_pipelines";

plan: https://explain.depesz.com/s/GM6I

Time: 31.811 ms
  - planning: 1.959 ms
  - execution: 29.852 ms
    - I/O read: 29.116 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 26 (~208.00 KiB) from the buffer pool
  - reads: 12 (~96.00 KiB) from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

With Feature Flag enabled (after):

The first query will run only if the given pipeline is a child pipeline. In most of the cases it won't.

WITH RECURSIVE "base_and_ancestors" AS ((SELECT "ci_pipelines".* FROM "ci_pipelines" WHERE "ci_pipelines"."id" = 208831043)
UNION
(SELECT "ci_pipelines".* FROM "ci_pipelines", "base_and_ancestors", "ci_sources_pipelines" WHERE "ci_sources_pipelines"."source_pipeline_id" = "ci_pipelines"."id" AND "ci_sources_pipelines"."pipeline_id" = "base_and_ancestors"."id" AND "ci_sources_pipelines"."source_project_id" = "ci_sources_pipelines"."project_id")) SELECT "ci_pipelines".* FROM "base_and_ancestors" AS "ci_pipelines";

plan: https://explain.depesz.com/s/cRGP

Time: 1.371 ms
  - planning: 1.194 ms
  - execution: 0.177 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 17 (~136.00 KiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

The second query will run always.

WITH RECURSIVE "base_and_descendants" AS ((SELECT "ci_pipelines".* FROM "ci_pipelines" WHERE "ci_pipelines"."id" = 208831026)
UNION
(SELECT "ci_pipelines".* FROM "ci_pipelines", "base_and_descendants", "ci_sources_pipelines" WHERE "ci_sources_pipelines"."pipeline_id" = "ci_pipelines"."id" AND "ci_sources_pipelines"."source_pipeline_id" = "base_and_descendants"."id" AND "ci_sources_pipelines"."source_project_id" = "ci_sources_pipelines"."project_id")) SELECT "id" FROM "base_and_descendants" AS "ci_pipelines";

plan: https://explain.depesz.com/s/tenh

Time: 1.714 ms
  - planning: 1.256 ms
  - execution: 0.458 ms
    - I/O read: 0.000 ms
    - I/O write: 0.000 ms

Shared buffers:
  - hits: 26 (~208.00 KiB) from the buffer pool
  - reads: 0 from the OS file cache, including disk I/O
  - dirtied: 0
  - writes: 0

Screenshots (strongly suggested)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Fabio Pitino

Merge request reports