Skip to content

Lazy filtered shared runner builds [RUN ALL RSPEC] [RUN AS-IF-FOSS]

Adam Hegyi requested to merge lazy_filtered_shared_runner_builds_experiment into master

What does this MR do?

This MR batches the Ci::Build query for getting records for the shared runner. The batching logic is applied on ci_builds query and the extra joins and filters are applied on the batched query.

The goal is to avoid statement timeouts by running smaller queries. This will likely increase the API response time since we need multiple DB round trips.

The change is behind a FF: lazy_filtered_shared_runner_builds

Database

The batching queries depend on the newly introduced index_ci_builds_runner_id_pending_covering index. (takes hours to create it on db lab)

EachBatch queries:

First id:

SELECT "ci_builds"."id"
FROM "ci_builds"
WHERE "ci_builds"."type" = 'Ci::Build'
  AND ("ci_builds"."status" IN ('pending'))
  AND "ci_builds"."runner_id" IS NULL
ORDER BY "ci_builds"."id" ASC
LIMIT 1

https://explain.depesz.com/s/PcjD

Next id:

SELECT "ci_builds"."id"
FROM "ci_builds"
WHERE "ci_builds"."type" = 'Ci::Build'
  AND ("ci_builds"."status" IN ('pending'))
  AND "ci_builds"."runner_id" IS NULL
  AND "ci_builds"."id" >= 9
ORDER BY "ci_builds"."id" ASC
LIMIT 1
OFFSET 1000

https://explain.depesz.com/s/vbVf

Batched query:

SELECT ci_builds.*,
       project_builds.running_builds
FROM "ci_builds"
INNER JOIN "projects" ON "projects"."id" = "ci_builds"."project_id"
LEFT JOIN project_features ON ci_builds.project_id = project_features.project_id
LEFT JOIN
  (SELECT "ci_builds"."project_id",
          count(*) AS running_builds
   FROM "ci_builds"
   WHERE "ci_builds"."type" = 'Ci::Build'
     AND ("ci_builds"."status" IN ('running'))
     AND "ci_builds"."runner_id" IN
       (SELECT "ci_runners"."id"
        FROM "ci_runners"
        WHERE "ci_runners"."runner_type" = 1)
   GROUP BY "ci_builds"."project_id") AS project_builds ON ci_builds.project_id=project_builds.project_id
WHERE ("ci_builds"."status" IN ('pending'))
  AND "ci_builds"."runner_id" IS NULL
  AND "ci_builds"."id" >= 9
  AND "ci_builds"."id" < 10
  AND "projects"."shared_runners_enabled" = TRUE
  AND "projects"."pending_delete" = FALSE
  AND (project_features.builds_access_level IS NULL
       OR project_features.builds_access_level > 0)
  AND "ci_builds"."type" = 'Ci::Build'
  AND ("projects"."visibility_level" = 20
       OR (EXISTS
             (SELECT 1
              FROM "namespaces"
              INNER JOIN namespaces AS project_namespaces ON project_namespaces.id = projects.namespace_id
              LEFT JOIN namespace_statistics ON namespace_statistics.namespace_id = namespaces.id
              WHERE (namespaces.id = project_namespaces.traversal_ids[1])
                AND (COALESCE(namespaces.shared_runners_minutes_limit, 0, 0) = 0
                     OR COALESCE(namespace_statistics.shared_runners_seconds, 0) < COALESCE((namespaces.shared_runners_minutes_limit + COALESCE(namespaces.extra_shared_runners_minutes_limit, 0)), (0 + COALESCE(namespaces.extra_shared_runners_minutes_limit, 0)), 0) * 60))))
  AND (NOT EXISTS
         (SELECT 1
          FROM "taggings"
          WHERE "taggings"."taggable_type" = 'CommitStatus'
            AND "taggings"."context" = 'tags'
            AND (taggable_id = ci_builds.id)
            AND 1=1))

Measurements were taken from PRD (data is constantly changing there so taking an accurate snapshot is not easy.

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Adam Hegyi

Merge request reports