Query timeout on Ci::PipelineArtifacts::ExpireArtifactsWorker
Ci::PipelineArtifacts::ExpireArtifactsWorker has been raising intermittent alerts for SLO violations.
On closer inspection, the worker is hitting query timeouts:
- https://log.gprd.gitlab.net/goto/13219f80-d1e5-11ed-a017-0d32180b1390
- https://sentry.gitlab.net/gitlab/gitlabcom/issues/4090708/?query=is%3Aunresolved%20%22Ci%3A%3APipelineArtifacts%3A%3AExpireArtifactsWorker%22
SELECT "ci_pipeline_artifacts".*
FROM "ci_pipeline_artifacts"
INNER JOIN "ci_pipelines"
ON "ci_pipelines"."id" = "ci_pipeline_artifacts"."pipeline_id"
WHERE "ci_pipelines"."locked" = $1
AND "ci_pipeline_artifacts"."expire_at" < $2 LIMIT $3
Todo
- Remove the code related to the legacy query
- Update the specs accordingly
--- app/services/ci/pipeline_artifacts/destroy_all_expired_service.rb
+++ app/services/ci/pipeline_artifacts/destroy_all_expired_service.rb
@@ -21,8 +21,6 @@ def initialize
       def execute
         in_lock(EXCLUSIVE_LOCK_KEY, ttl: LOCK_TIMEOUT, retries: 1) do
           destroy_unlocked_pipeline_artifacts
-
-          legacy_destroy_pipeline_artifacts
         end
         @removed_artifacts_count
@@ -40,19 +38,6 @@ def destroy_unlocked_pipeline_artifacts
         end
       end
-      def legacy_destroy_pipeline_artifacts
-        loop_until(timeout: LOOP_TIMEOUT, limit: LOOP_LIMIT) do
-          destroy_artifacts_batch
-        end
-      end
-
-      def destroy_artifacts_batch
-        artifacts = ::Ci::PipelineArtifact.unlocked.expired.limit(BATCH_SIZE).to_a
-        return false if artifacts.empty?
-
-        destroy_batch(artifacts)
-      end
-
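For reviewers unfamiliar with the removed path, here is a rough, self-contained sketch of the batched loop pattern it followed: fetch up to a batch of unlocked, expired artifacts, destroy them, and repeat until nothing is left or a limit or timeout is reached. Everything in it is an assumption made for illustration: the `loop_until` defined here is a simplified stand-in rather than GitLab's actual helper, `FakeStore` and `Artifact` replace the database tables with an in-memory array, and the batch size, limit and timeout values are made up.

```ruby
# Simplified, hypothetical stand-in for the loop helper used by the removed code.
# Yields until the block returns a falsy value (returns true), or until the
# iteration limit or the timeout in seconds is exceeded (returns false).
def loop_until(timeout:, limit:)
  started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)

  limit.times do
    return true unless yield

    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started_at
    return false if elapsed > timeout
  end

  false
end

# Hypothetical row: `locked` stands for the lock state of the parent pipeline.
Artifact = Struct.new(:id, :expire_at, :locked)

# In-memory stand-in for ci_pipeline_artifacts joined to ci_pipelines.
class FakeStore
  attr_reader :rows

  def initialize(rows)
    @rows = rows
  end

  # Same shape as `unlocked.expired.limit(n)`: filter, then cap the batch.
  def unlocked_expired(now:, limit:)
    rows.select { |a| !a.locked && a.expire_at < now }.first(limit)
  end

  # Removes the batch and returns how many rows were deleted.
  def destroy(batch)
    @rows -= batch
    batch.size
  end
end

BATCH_SIZE = 2 # illustrative value only

# Destroys unlocked, expired artifacts batch by batch and returns the count.
def destroy_expired(store, now:)
  removed = 0

  loop_until(timeout: 5, limit: 10) do
    batch = store.unlocked_expired(now: now, limit: BATCH_SIZE)
    next false if batch.empty? # nothing left, stop looping

    removed += store.destroy(batch)
    true
  end

  removed
end
```

In the real service every iteration issues the joined `SELECT ... LIMIT` shown earlier in this issue, so a single slow execution of that statement is enough to time out the whole worker run.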
Edited by Max Orefice