Improve orphan final artifacts finder rake task performance

Erick Bajao requested to merge eb-improve-clean-up-rake-task-query into master

This MR makes some minor performance improvements to the rake task that generates the list of orphan objects.

This is based on feedback from https://gitlab.com/gitlab-com/gl-infra/production/-/issues/17579#note_1771422415:

> I've reviewed the logic, and I suspect the bucket scan isn't in fact the bottleneck. We're most likely limited by the job_artifact_record_exists? call. Increasing concurrency will put too much pressure on postgres.
>
> I think the main optimization we should look into is batching up those calls. Basically, check existence of 1k job artifact records in a single query.
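
To illustrate the batching idea from the feedback above, here is a minimal sketch rather than the actual task code. The names `orphan_ids` and `lookup` are hypothetical. `lookup` stands in for whatever single query returns the IDs that exist in a batch; in a Rails app that could plausibly be something like `Ci::JobArtifact.where(id: ids).pluck(:id)`, but that is an assumption, not a statement about how the task is written.

```ruby
require 'set'

# Returns the candidate IDs that have no matching record.
#
# candidate_ids - IDs parsed from the scanned objects (assumed to be integers).
# lookup        - hypothetical callable: given an array of IDs, returns the
#                 subset that exists, ideally via one query per call.
# batch_size    - how many IDs to check per lookup (1k, per the suggestion).
def orphan_ids(candidate_ids, lookup, batch_size: 1_000)
  candidate_ids.each_slice(batch_size).flat_map do |batch|
    # One lookup per batch instead of one existence check per ID.
    existing = lookup.call(batch).to_set
    batch.reject { |id| existing.include?(id) }
  end
end

# Usage with an in-memory stand-in for the database:
known_ids = [2, 4, 6].to_set
fake_lookup = ->(ids) { ids.select { |id| known_ids.include?(id) } }
orphans = orphan_ids((1..10).to_a, fake_lookup, batch_size: 5)
```

With this shape, the number of round trips to the database grows with the number of batches rather than the number of objects, which is the point of the suggestion: less query pressure on postgres without raising concurrency.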

I also included some minor changes to the batch size and the Redis marker TTL that I forgot to add earlier. These come from the suggestions in gitlab-com/gl-infra/production#17383 (comment 1738630685).
