Use new locked column on ci_job_artifacts in DestroyAllExpiredService
What does this MR do and why?
Adds a feature flag (:ci_destroy_unlocked_job_artifacts
) which when enabled, changes DestroyAllExpiredService to using the new locked
column on ci_job_artifacts
to mark unlocked and expired artifacts for deletion.
This is intended to resolve #327281 (closed) which describes how the performance of using the locked
column on ci_pipelines
necessitating a join ci_pipelines -> ci_builds -> ci_job_artifacts
is causing us to fall significantly behind on marking artifacts for deletion, which results in us storing an estimated >800TB
of unnecessary objects.
This MR will only cover marking job artifacts for deletion if they are unlocked, expired, and were created after !70235 (merged) was merged. This is expected, because the intent is to stop the problem from growing worse, and the follow up issue to clean up old job artifacts is intended to be addressed in #322817 (closed).
Feature Flag
ci_destroy_unlocked_job_artifacts
: #338165 (closed)
Database
Query (https://console.postgres.ai/shared/b0ef3b07-89b2-42b5-bcc9-27739a3ad4e6):
SELECT
"ci_job_artifacts"."project_id",
"ci_job_artifacts"."file_type",
"ci_job_artifacts"."size",
"ci_job_artifacts"."created_at",
"ci_job_artifacts"."updated_at",
"ci_job_artifacts"."expire_at",
"ci_job_artifacts"."file",
"ci_job_artifacts"."file_store",
"ci_job_artifacts"."file_sha256",
"ci_job_artifacts"."file_format",
"ci_job_artifacts"."file_location",
"ci_job_artifacts"."id",
"ci_job_artifacts"."job_id",
"ci_job_artifacts"."locked"
FROM
"ci_job_artifacts"
WHERE
"ci_job_artifacts"."expire_at" < '2021-10-15 00:13:30.139913'
AND "ci_job_artifacts"."locked" = 0
LIMIT 100;
Statistics:
Time: 3.924 ms
- planning: 3.886 ms
- execution: 0.038 ms
- I/O read: 0.000 ms
- I/O write: 0.000 ms
Shared buffers:
- hits: 1 (~8.00 KiB) from the buffer pool
- reads: 0 from the OS file cache, including disk I/O
- dirtied: 0
- writes: 0
Explain:
Limit (cost=0.12..3.06 rows=1 width=140) (actual time=0.004..0.005 rows=0 loops=1)
Buffers: shared hit=1
I/O Timings: read=0.000 write=0.000
-> Index Scan using ci_job_artifacts_expire_at_unlocked_idx on public.ci_job_artifacts (cost=0.12..3.06 rows=1 width=140) (actual time=0.002..0.003 rows=0 loops=1)
Index Cond: (ci_job_artifacts.expire_at < '2021-10-15 00:13:30.139913+00'::timestamp with time zone)
Buffers: shared hit=1
I/O Timings: read=0.000 write=0.000
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
-
I have evaluated the MR acceptance checklist for this MR.
Related to #327281 (closed)