Remove special handling of trace in DestroyBatchService

What does this MR do and why?

Remove special handling of trace in DestroyBatchService. This will fix the drift in ProjectStatistics build_artifacts_size seen in https://gitlab.com/gitlab-org/gitlab/-/issues/392414.

DestroyBatchService's role is to destroy all the records in the job artifact association provided to it, without special handling for any particular artifact type.

Excluding the trace artifact created a problem where FastDestroy for job artifacts could not completely destroy all the associated job artifacts. This resulted in an inconsistent removal process for job artifacts in the same pipeline: some were removed through DestroyBatchService, while others were removed directly in the database through cascading deletes.
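For illustration, the special handling being removed was along these lines (a sketch only; see the MR diff for the exact code):

# app/services/ci/job_artifacts/destroy_batch_service.rb (sketch)
# Before this MR, the service silently dropped trace artifacts from the
# batch, so callers could never destroy a build's artifacts completely:
def exclude_trace_artifacts
  _trace_artifacts, @job_artifacts = @job_artifacts.partition(&:trace?)
end

With that filtering removed, callers that must preserve traces now pass a relation that already excludes them.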

Usages of DestroyBatchService are updated to ensure they provide the correct scope:

  1. DestroyAllExpiredService uses the non_trace scope, which ensures that trace job artifacts are not deleted (see the scope sketch after this list).
  2. DestroyAssociationsService passes all job artifacts of the build, including trace. A test is added to check for this.
  3. BuildEraseService passes all job artifacts of the build, including trace. There is an existing test for this.
  4. DeleteService passes erasable job artifacts of the build. There is an existing test for this.
  5. UpdateUnknownLockedStatusService is unchanged because it does not touch trace job artifacts. This is a separate issue, discussed in #392990.
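A minimal sketch of what the non_trace scope could look like on Ci::JobArtifact, assuming the Rails enum maps trace to file_type 3 (matching the index predicate below; the actual model code may differ):

# app/models/ci/job_artifact.rb (sketch)
# Excluding the trace enum value matches the `file_type <> 3`
# predicate in the new partial index.
scope :non_trace, -> { where.not(file_type: :trace) }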

Database

New index added:

CREATE INDEX index_ci_job_artifacts_expire_at_unlocked_non_trace ON ci_job_artifacts USING btree (expire_at) WHERE ((locked = 0) AND (file_type <> 3) AND (expire_at IS NOT NULL));
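A migration adding this index would look roughly like the following (a hedged sketch using GitLab's concurrent index helpers; the class name, file name, and schema version are illustrative):

# db/post_migrate/xxxxxx_add_index_ci_job_artifacts_non_trace.rb (sketch)
class AddIndexCiJobArtifactsExpireAtUnlockedNonTrace < Gitlab::Database::Migration[2.1]
  INDEX_NAME = 'index_ci_job_artifacts_expire_at_unlocked_non_trace'

  disable_ddl_transaction!

  def up
    # Partial index: only expired, unlocked, non-trace artifacts qualify.
    add_concurrent_index :ci_job_artifacts, :expire_at,
      where: 'locked = 0 AND file_type != 3 AND expire_at IS NOT NULL',
      name: INDEX_NAME
  end

  def down
    remove_concurrent_index_by_name :ci_job_artifacts, INDEX_NAME
  end
end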

Testing the index in postgres.ai:

(screenshot attached: Screenshot_2023-03-02_at_9.32.09_PM)

Database queries

SELECT "ci_job_artifacts".* FROM "ci_job_artifacts" WHERE "ci_job_artifacts"."expire_at" < '2023-06-03 09:32:30.079555' AND "ci_job_artifacts"."file_type" != 3 AND "ci_job_artifacts"."locked" = 0 LIMIT 500

postgres.ai query: https://console.postgres.ai/gitlab/gitlab-production-ci/sessions/15877/commands/55161

 Limit  (cost=0.56..370.72 rows=500 width=149) (actual time=0.295..1.944 rows=500 loops=1)
   Buffers: shared hit=347 read=5
   I/O Timings: read=0.264 write=0.000
   ->  Index Scan using test_index_ci_job_artifacts_expired_unlocked_non_trace on public.ci_job_artifacts  (cost=0.56..3691848.25 rows=4986957 width=149) (actual time=0.293..1.896 rows=500 loops=1)
         Index Cond: (ci_job_artifacts.expire_at < '2023-05-03 06:16:13.791334+00'::timestamp with time zone)
         Buffers: shared hit=347 read=5
         I/O Timings: read=0.264 write=0.000
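For reference, the query above corresponds to a relation along these lines (a sketch; the actual scope names used in DestroyAllExpiredService may differ):

# Sketch of the relation behind the query; literal values follow the SQL.
Ci::JobArtifact
  .where('expire_at < ?', Time.current) # expired artifacts only
  .where.not(file_type: 3)              # 3 = trace, i.e. the non_trace scope
  .where(locked: 0)                     # 0 = unlocked
  .limit(500)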


How to set up and validate locally

Prerequisites:

  1. Create a project and run a pipeline with a single job.
  2. Ensure that the pipeline has completed and that the project statistics reflect the job's trace artifact size.
# Find your test project (placeholder ID)
[42] pry(main)> project = Project.find(<project_id>)

# These two values should be equal
[43] pry(main)> project.reload.statistics.build_artifacts_size
[46] pry(main)> project.reload.job_artifacts.sum(&:size)

# This should be 0 (no pending counter increments in Redis)
[48] pry(main)> project.statistics.counter(:build_artifacts_size).get

# Destroy the pipeline
[49] pry(main)> current_user = User.find(1)
[50] pry(main)> pipeline = project.ci_pipelines.last
[51] pry(main)> ::Ci::DestroyPipelineService.new(project, current_user).execute(pipeline)

# Validate that the Redis counter is decremented by the trace artifact size
# (you should see the negative of the size observed above)
[52] pry(main)> project.statistics.counter(:build_artifacts_size).get

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #392414
