Investigate slow container_expiration_policies query
fingerprint b7c0024aa88dcbd2
queryid: 3307835084374000612
query:
/*application:sidekiq,correlation_id:89c36e7a176e42c1f5096d8d29d97979,jid:31e64940ad0f412767ed2059,endpoint_id:ContainerExpirationPolicies::CleanupContainerRepositoryWorker,db_config_name:main*/ SELECT "container_repositories".* FROM "container_repositories" INNER JOIN container_expiration_policies ON container_repositories.project_id = container_expiration_policies.project_id WHERE "container_expiration_policies"."enabled" = $1 AND "container_repositories"."expiration_policy_cleanup_status" IN ($2, $3) AND (container_repositories.expiration_policy_started_at IS NULL OR container_repositories.expiration_policy_started_at < container_expiration_policies.next_run_at) AND (container_expiration_policies.next_run_at < $4) ORDER BY "container_repositories"."expiration_policy_cleanup_status" ASC, "container_repositories"."expiration_policy_started_at" ASC LIMIT $5 FOR UPDATE SKIP LOCKED
https://log.gprd.gitlab.net/goto/833e48c0-de3a-11ed-8afc-c9851e4645c0 shows that the slow queries mostly originates from ContainerExpirationPolicies::CleanupContainerRepositoryWorker
Primary Database CPU usage peaks
https://dashboards.gitlab.net/d/bd2Kl9Imk/host-stats?viewPanel=8&orgId=1&var-env=gprd&var-node=patroni-main-2004-04-db-gprd.c.gitlab-production.internal&from=1681819327535&to=1681841011582 shows a lot of peaks close to 80% on the primary Database server:
Thanos shows this query possibly contributing:
Edited by Thong Kuah