Introduce Pipelines::AutoCleanupService class
What does this MR do and why?
Introduces Pipelines::AutoCleanupService to fix statement timeouts in Ci::DestroyOldPipelinesWorker by optimizing pipeline deletion queries.
The current implementation causes PG::QueryCanceled errors due to inefficient queries when deleting old pipelines. This service addresses the performance issue by:
- Splitting queries by pipeline status to leverage database indexes efficiently (completed statuses queried individually, non-completed statuses grouped together)
- Using keyset pagination with
created_atandidordering to avoid expensive merge operations - Implementing per-status caching to track deletion progress and resume from the last processed timestamp
- Filtering protected and locked pipelines based on feature flags
Note: This service is not yet integrated with Ci::DestroyOldPipelinesWorker. Will be added in the next MR.
This approach ensures queries complete within statement timeout limits while maintaining deletion progress across worker executions.
References
Screenshots or screen recordings
N/A
How to set up and validate locally
- Create a project with old pipelines across different statuses
- Configure
ci_delete_pipelines_in_secondsretention policy - Execute
Ci::DestroyOldPipelinesService.new(project: project).execute - Verify pipelines older than the retention period are deleted
- Check Redis cache contains per-status timestamps for resumption
1. Setup - Create Project and Pipelines
# Create a project with 90-day retention period
timestamp = Time.now.to_i
unique_path = "test-pipeline-delete-#{timestamp}"
namespace = FactoryBot.create(:namespace, path: unique_path)
project = FactoryBot.create(
:project,
name: unique_path,
path: unique_path,
namespace: namespace,
ci_delete_pipelines_in_seconds: 90.days.to_i
)
# Create ancient pipelines (1 year old) - should be deleted
ancient_success = FactoryBot.create_list(:ci_pipeline, 10,
project: project,
status: 'success',
created_at: 1.year.ago,
locked: :unlocked
)
ancient_failed = FactoryBot.create_list(:ci_pipeline, 10,
project: project,
status: 'failed',
created_at: 1.year.ago,
locked: :unlocked
)
ancient_running = FactoryBot.create_list(:ci_pipeline, 5,
project: project,
status: 'running',
created_at: 1.year.ago,
locked: :unlocked
)
# Create old pipelines (6 months old) - should be deleted
old_success = FactoryBot.create_list(:ci_pipeline, 10,
project: project,
status: 'success',
created_at: 6.months.ago,
locked: :unlocked
)
old_failed = FactoryBot.create_list(:ci_pipeline, 10,
project: project,
status: 'failed',
created_at: 6.months.ago,
locked: :unlocked
)
# Create recent pipelines (1 week old) - should NOT be deleted
recent_success = FactoryBot.create_list(:ci_pipeline, 10,
project: project,
status: 'success',
created_at: 1.week.ago,
locked: :unlocked
)
recent_failed = FactoryBot.create_list(:ci_pipeline, 10,
project: project,
status: 'failed',
created_at: 1.week.ago,
locked: :unlocked
)
2. Verify Pipeline Counts Before
# Check total counts
project.all_pipelines.count
# => Should be 65
# Check by status
project.all_pipelines.where(status: 'success').count
# => Should be 30
project.all_pipelines.where(status: 'failed').count
# => Should be 30
# Check by age
cutoff = 90.days.ago
project.all_pipelines.where('created_at < ?', cutoff).count
# => Should be 45 (ancient + old)
project.all_pipelines.where('created_at >= ?', cutoff).count
# => Should be 20 (recent)
3. Run the Service
service = Ci::Pipelines::AutoCleanupService.new(project: project)
result = service.execute
# Check the result
result.success?
# => true
result.payload[:destroyed_pipelines_size]
# => Should be 45
4. Verify Pipeline Counts After
# Total should decrease by 45
project.reload.all_pipelines.count
# => Should be 20
# Only recent pipelines should remain
project.all_pipelines.where(status: 'success').count
# => Should be 10
project.all_pipelines.where(status: 'failed').count
# => Should be 10
# Verify all remaining pipelines are recent
project.all_pipelines.where('created_at < ?', 90.days.ago).count
# => Should be 0
5. Check Cache
# Read the cache
cache = Ci::RetentionPolicies::PipelineDeletionCutoffCache.new(project: project).read
# Inspect cache values
cache.keys
# => ["success", "failed", "canceled", "skipped", "manual", "other"]
cache.values.map(&:to_date)
# Should show dates around 6 months ago (the most recent deleted pipelines)
# => [Tue, 13 May 2025, Tue, 13 May 2025, Fri, 15 Aug 2025, Fri, 15 Aug 2025, Fri, 15 Aug 2025, Wed, 13 Nov 2024]
6. Test with Protected/Locked Pipelines
# Create protected pipelines that should be skipped
protected_ancient = FactoryBot.create_list(:ci_pipeline, 5,
:protected,
project: project,
status: 'success',
created_at: 1.year.ago,
locked: :unlocked
)
# Create locked pipelines that should be skipped
locked_ancient = FactoryBot.create_list(:ci_pipeline, 5,
:artifacts_locked,
project: project,
status: 'success',
created_at: 1.year.ago
)
# Run service again
before_count = project.reload.all_pipelines.count
service = Ci::Pipelines::AutoCleanupService.new(project: project)
service.execute
after_count = project.reload.all_pipelines.count
# Verify protected/locked pipelines were NOT deleted
before_count - after_count
# => Should be 0 (or minimal if there were other unprotected/unlocked old pipelines)
# Verify protected/locked pipelines still exist
Ci::Pipeline.where(id: protected_ancient.map(&:id)).count
# => Should be 5
Ci::Pipeline.where(id: locked_ancient.map(&:id)).count
# => Should be 5
7. Clean Up
# Clean up test data when done
project.destroy!
namespace.destroy!
Complete Testing Snippet:
Here's a simple snippet to copy-paste in rails console:
# === Setup ===
timestamp = Time.now.to_i
unique_path = "test-pipeline-delete-#{timestamp}"
namespace = FactoryBot.create(:namespace, path: unique_path)
project = FactoryBot.create(
:project,
name: unique_path,
path: unique_path,
namespace: namespace,
ci_delete_pipelines_in_seconds: 90.days.to_i
)
# Create pipelines at different ages
ancient = FactoryBot.create_list(:ci_pipeline, 20, project: project, status: 'success', created_at: 1.year.ago, locked: :unlocked)
old = FactoryBot.create_list(:ci_pipeline, 15, project: project, status: 'failed', created_at: 6.months.ago, locked: :unlocked)
recent = FactoryBot.create_list(:ci_pipeline, 10, project: project, status: 'success', created_at: 1.week.ago, locked: :unlocked)
puts "Before: #{project.all_pipelines.count} pipelines"
# => 45
# === Execute Service ===
service = Ci::Pipelines::AutoCleanupService.new(project: project)
result = service.execute
puts "Deleted: #{result.payload[:destroyed_pipelines_size]} pipelines"
# => 35
puts "After: #{project.reload.all_pipelines.count} pipelines"
# => 10 (only recent remain)
# === Verify Cache ===
cache = Ci::RetentionPolicies::PipelineDeletionCutoffCache.new(project: project).read
puts "Cache timestamps: #{cache.values.map { |t| t.strftime('%Y-%m-%d') }.uniq}"
# === Cleanup ===
project.destroy!
namespace.destroy!
Database Query Plans
Query Plans
With Cutoff Timestamp
Status: success
SELECT "p_ci_pipelines".*
FROM "p_ci_pipelines"
WHERE "p_ci_pipelines"."project_id" = 278964
AND "p_ci_pipelines"."status" = 'success'
AND "p_ci_pipelines"."created_at" < '2025-10-15 00:00:00'
AND "p_ci_pipelines"."created_at" >= '2025-09-14 00:00:00'
ORDER BY "p_ci_pipelines"."created_at" ASC, "p_ci_pipelines"."id" ASC
LIMIT 50;
Query Plan: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/45482/commands/139371
Status: failed
SELECT "p_ci_pipelines".*
FROM "p_ci_pipelines"
WHERE "p_ci_pipelines"."project_id" = 278964
AND "p_ci_pipelines"."status" = 'failed'
AND "p_ci_pipelines"."created_at" < '2025-10-15 00:00:00'
AND "p_ci_pipelines"."created_at" >= '2025-09-14 00:00:00'
ORDER BY "p_ci_pipelines"."created_at" ASC, "p_ci_pipelines"."id" ASC
LIMIT 50;
Query Plan: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/45482/commands/139373
Status: non-completed
SELECT "p_ci_pipelines".*
FROM "p_ci_pipelines"
WHERE "p_ci_pipelines"."project_id" = 278964
AND "p_ci_pipelines"."status" IN ('created', 'waiting_for_resource', 'preparing', 'waiting_for_callback', 'pending', 'running', 'canceling', 'scheduled')
AND "p_ci_pipelines"."created_at" < '2025-10-15 00:00:00'
AND "p_ci_pipelines"."created_at" >= '2025-08-14 00:00:00'
ORDER BY "p_ci_pipelines"."created_at" ASC, "p_ci_pipelines"."id" ASC
LIMIT 50;
Query Plan: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/45482/commands/139374
Without Cutoff Timestamp
Status: success
SELECT "p_ci_pipelines".*
FROM "p_ci_pipelines"
WHERE "p_ci_pipelines"."project_id" = 278964
AND "p_ci_pipelines"."status" = 'success'
AND "p_ci_pipelines"."created_at" < '2025-08-15 00:00:00'
ORDER BY "p_ci_pipelines"."created_at" ASC, "p_ci_pipelines"."id" ASC
LIMIT 50;
Query Plan: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/45482/commands/139378
Status: failed
SELECT "p_ci_pipelines".*
FROM "p_ci_pipelines"
WHERE "p_ci_pipelines"."project_id" = 278964
AND "p_ci_pipelines"."status" = 'failed'
AND "p_ci_pipelines"."created_at" < '2025-08-15 00:00:00'
ORDER BY "p_ci_pipelines"."created_at" ASC, "p_ci_pipelines"."id" ASC
LIMIT 50;
Query Plan: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/45482/commands/139379
Status: non-completed
SELECT "p_ci_pipelines".*
FROM "p_ci_pipelines"
WHERE "p_ci_pipelines"."project_id" = 278964
AND "p_ci_pipelines"."status" IN ('created', 'waiting_for_resource', 'preparing', 'waiting_for_callback', 'pending', 'running', 'canceling', 'scheduled')
AND "p_ci_pipelines"."created_at" < '2025-08-15 00:00:00'
ORDER BY "p_ci_pipelines"."created_at" ASC, "p_ci_pipelines"."id" ASC
LIMIT 50;
Query Plan: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/45482/commands/139380
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.