Ensure cleanup job artifacts task does not include pipeline artifacts

Merged Catalin Irimie requested to merge cat-fix-orphaned-jobartifacts-including-pipelines into master

What does this MR do and why?

Pipeline artifacts share the artifacts storage config and generate a similar path, with "pipelines" in place of the date segment that job artifacts use. As a result, the directory search performed by the cleanup task also matches them.

Because the cleanup job artifacts rake task looks up only job artifacts in the database, it finds no matching record for pipeline artifacts, treats them as orphaned, and deletes them from disk.
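
The failure mode can be illustrated with a simplified model. This is only a sketch, not the rake task's actual code: the `orphan?` helper, the ID set, the `abc123` hash segment, and the `2022_02_18` date format are all assumptions made up for the example.

```ruby
require 'set'

# Simplified model of the check, not GitLab's actual implementation.
# The ID set stands in for the job artifacts table; the values are made up.
KNOWN_JOB_ARTIFACT_IDS = Set['12', '13'].freeze

# A directory counts as orphaned when its last path segment is not the ID
# of a known job artifact.
def orphan?(dir)
  !KNOWN_JOB_ARTIFACT_IDS.include?(File.basename(dir))
end

# Both directories sit six levels below the artifacts root.
job_dir      = '/artifacts/81/17/abc123/2022_02_18/40/12'       # hypothetical job artifact
pipeline_dir = '/artifacts/81/17/abc123/pipelines/91/artifacts' # pipeline artifact layout

puts orphan?(job_dir)       # the job artifact ID is known
puts orphan?(pipeline_dir)  # "artifacts" is never a job artifact ID
```

In this model the pipeline directory is always flagged, because nothing in the job artifacts lookup can ever match it.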

Related to #351517 (closed)

Screenshots or screen recordings

Before:

╰─>$ bin/rake gitlab:cleanup:orphan_job_artifact_files
I, [2022-02-18T13:27:17.371755 #284252]  INFO -- : [DRY RUN] Looking for orphan job artifacts to clean up
I, [2022-02-18T13:27:17.371916 #284252]  INFO -- : [DRY RUN] find command: '/usr/bin/ionice -c best-effort find -L /home/catalin/gdk/gitlab/shared/artifacts -mindepth 6 -maxdepth 6 -type d'
D, [2022-02-18T13:27:17.392750 #284252] DEBUG -- : Found orphan job artifact file @ /home/catalin/gdk/gitlab/shared/artifacts/81/17/811786ad1ae74adfdd20dd0372abaaebc6246e343aebd01da0bfc4c02bf0106c/pipelines/91/artifacts
I, [2022-02-18T13:27:17.393934 #284252]  INFO -- : [DRY RUN] Processed 249 job artifact(s) to find and cleaned 1 orphan(s).
I, [2022-02-18T13:27:17.394461 #284252]  INFO -- : To clean up these files run this command with DRY_RUN=false

After:

╰─>$ bin/rake gitlab:cleanup:orphan_job_artifact_files
I, [2022-02-18T13:27:17.371755 #284252]  INFO -- : [DRY RUN] Looking for orphan job artifacts to clean up
I, [2022-02-18T13:27:17.371916 #284252]  INFO -- : [DRY RUN] find command: '/usr/bin/ionice -c best-effort find -L /home/catalin/gdk/gitlab/shared/artifacts -mindepth 6 -maxdepth 6 -not -path */pipelines/* -type d'
I, [2022-02-18T13:27:17.393934 #284252]  INFO -- : [DRY RUN] Processed 249 job artifact(s) to find and cleaned 0 orphan(s).
I, [2022-02-18T13:27:17.394461 #284252]  INFO -- : To clean up these files run this command with DRY_RUN=false
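
The effect of the added `-not -path */pipelines/*` filter can be approximated in Ruby with `File.fnmatch?`, which, like `find -path`, lets `*` match across `/` by default. The helper names below are hypothetical and only mirror the command shown in the log above; they are not the task's real code.

```ruby
PIPELINES_PATTERN = '*/pipelines/*'

# True when the new find filter would skip this path.
def excluded_by_filter?(path)
  File.fnmatch?(PIPELINES_PATTERN, path)
end

# Hypothetical reconstruction of the find arguments from the "After" log,
# without the ionice prefix. Kept as an array so that no shell expands
# the glob before find sees it.
def find_arguments(root)
  ['find', '-L', root, '-mindepth', '6', '-maxdepth', '6',
   '-not', '-path', PIPELINES_PATTERN, '-type', 'd']
end

pipeline_dir = '/home/catalin/gdk/gitlab/shared/artifacts/81/17/' \
               '811786ad1ae74adfdd20dd0372abaaebc6246e343aebd01da0bfc4c02bf0106c/pipelines/91/artifacts'

puts excluded_by_filter?(pipeline_dir)
puts find_arguments('/home/catalin/gdk/gitlab/shared/artifacts').join(' ')
```

Note that the pattern excludes any path containing a `pipelines` directory segment, which is safe here on the assumption that job artifact paths never contain one.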

How to set up and validate locally

  1. Import by URL https://gitlab.com/gitlab-org/ci-sample-projects/coverage-report/

  2. Run a pipeline on the mo-add-demo-with-coverage branch (using ruby:2.5 as the image if it errors for you)

  3. Notice a pipeline artifact was generated:

    [15] pry(main)> Ci::PipelineArtifact.first.file.path
    Ci::PipelineArtifact Load (0.4ms)  SELECT "ci_pipeline_artifacts".* FROM "ci_pipeline_artifacts" ORDER BY "ci_pipeline_artifacts"."id" ASC LIMIT 1 /*application:console,db_config_name:main,line:(pry):15:in `__pry__'*/
    => "/home/catalin/gdk/gitlab/shared/artifacts/81/17/811786ad1ae74adfdd20dd0372abaaebc6246e343aebd01da0bfc4c02bf0106c/pipelines/91/artifacts/1/code_coverage.json"
  4. Run bin/rake gitlab:cleanup:orphan_job_artifact_files and notice that the pipeline artifact is reported as an orphan and would be deleted

  5. Check out this branch and verify that bin/rake gitlab:cleanup:orphan_job_artifact_files no longer reports or deletes the pipeline artifact

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Catalin Irimie