Fix statement timeouts in Ci::DestroyOldPipelinesWorker

Problem

Ci::DestroyOldPipelinesWorker is the worker used by the CI retention policies to delete pipelines older than the configured project-level threshold.

This worker sometimes fails with PG::QueryCanceled: ERROR: canceling statement due to statement timeout. On some projects it may be failing consistently, in which case their old pipelines are never deleted.

Logs: https://log.gprd.gitlab.net/app/r/s/cpKad

Some of the queries that appear in the logs:

SELECT "p_ci_job_artifacts".*
FROM "p_ci_job_artifacts"
INNER JOIN "p_ci_builds" ON "p_ci_job_artifacts"."job_id" = "p_ci_builds"."id"
WHERE "p_ci_builds"."type" = $1
  AND "p_ci_builds"."commit_id" = $2
  AND "p_ci_builds"."partition_id" = $3
  AND "p_ci_job_artifacts"."partition_id" = $4
  AND "p_ci_job_artifacts"."id" >= $5
  AND "p_ci_job_artifacts"."id" < $6

SELECT "p_ci_pipelines".*
FROM "p_ci_pipelines"
WHERE "p_ci_pipelines"."project_id" = $1
  AND "p_ci_pipelines"."created_at" < $2
  AND ("p_ci_pipelines".protected IS NOT true)
LIMIT $3

Proposal

Improve the queries in Ci::DestroyOldPipelinesWorker or Ci::DestroyPipelineService, or change the way we iterate over the data, so that the statements no longer time out.
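
One possible direction for "the way we iterate over the data" is keyset batching: each query fetches a small, id-ordered batch and the next query starts after the last id seen. The following is only a minimal in-memory sketch of that idea, not the worker's actual code and not a confirmed fix. Pipeline, fetch_batch and each_old_pipeline_batch are hypothetical names, and the array stands in for the p_ci_pipelines table. Whether this approach avoids the timeout in production depends on the available indexes, which this issue does not describe.

```ruby
# Hypothetical stand-in for a p_ci_pipelines row; not GitLab code.
Pipeline = Struct.new(:id, :project_id, :created_at, :protected)

# Simulates one bounded query, roughly equivalent to:
#   SELECT ... FROM p_ci_pipelines
#   WHERE project_id = ? AND created_at < ? AND protected IS NOT true
#     AND id > ? ORDER BY id LIMIT ?
def fetch_batch(rows, project_id:, cutoff:, after_id:, limit:)
  rows
    .select do |row|
      row.project_id == project_id &&
        row.created_at < cutoff &&
        row.protected != true &&
        row.id > after_id
    end
    .sort_by(&:id)
    .first(limit)
end

# Yields successive batches, resuming after the last id of the previous batch.
def each_old_pipeline_batch(rows, project_id:, cutoff:, batch_size:)
  after_id = 0
  loop do
    batch = fetch_batch(rows, project_id: project_id, cutoff: cutoff,
                              after_id: after_id, limit: batch_size)
    break if batch.empty?

    yield batch
    after_id = batch.last.id
  end
end

# Illustrative data only.
rows = [
  Pipeline.new(1, 10, Time.utc(2023, 1, 1), false),
  Pipeline.new(2, 10, Time.utc(2023, 2, 1), true),  # protected, skipped
  Pipeline.new(3, 10, Time.utc(2023, 3, 1), nil),
  Pipeline.new(4, 11, Time.utc(2023, 1, 1), false), # other project, skipped
  Pipeline.new(5, 10, Time.utc(2023, 4, 1), false),
  Pipeline.new(6, 10, Time.utc(2025, 1, 1), false)  # newer than cutoff, skipped
]

batches = []
each_old_pipeline_batch(rows, project_id: 10, cutoff: Time.utc(2024, 1, 1),
                              batch_size: 2) do |batch|
  batches << batch.map(&:id)
end
# batches is [[1, 3], [5]]
```

Unlike the logged p_ci_pipelines query, which has a LIMIT but no ORDER BY or id bound, each simulated query here is constrained by both an id lower bound and a batch size, so the work per statement stays small regardless of how many old pipelines the project has.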
