Skip to content

Expose locked artifacts for pipeline downloadable_artifact

Max Orefice requested to merge mo-downloadable-locked-artifacts into master

Ref: #329241 (closed)

What does this MR do?

This MR fixes a ~bug where downloadable Job Artifacts where not showing up on the pipeline index page.

Why are we doing this?

By default, the latest job artifacts from the most recent successful jobs are never deleted. If a job is configured with expire_in, its artifacts only expire if a more recent artifact exists.

This makes sure users are able to download job artifacts from the pipeline index page.

Screenshots (strongly suggested)

Before After
Pipelines___Administrator___Test_Codequality___GitLab_2021-05-03_15-42-05 Pipelines___Administrator___Test_Codequality___GitLab_2021-05-03_15-43-43

Database

Click to see the query introduced by this MR

Commit 1

Here is the diff between the original query and our new one:

+ INNER JOIN ci_pipelines ON ci_pipelines.id = jobs_ci_job_artifacts.commit_id
+ AND ((expire_at IS NULL OR expire_at > '2021-04-27 18:08:58.610838') OR ci_pipelines.locked = 1)

SQL plan for

 Nested Loop  (cost=1.85..5206.00 rows=14 width=1414) (actual time=36.441..36.443 rows=0 loops=1)
   Buffers: shared read=5
   I/O Timings: read=36.368
   ->  Nested Loop  (cost=1.28..5157.23 rows=15 width=150) (actual time=36.440..36.441 rows=0 loops=1)
         Buffers: shared read=5
         I/O Timings: read=36.368
         ->  Index Scan using index_ci_builds_on_commit_id_and_status_and_type on public.ci_builds  (cost=0.70..261.13 rows=229 width=4) (actual time=36.438..36.439 rows=0 loops=1)
               Index Cond: ((ci_builds.commit_id = 106) AND ((ci_builds.type)::text = 'Ci::Build'::text))
               Filter: ((NOT ci_builds.retried) OR (ci_builds.retried IS NULL))
               Rows Removed by Filter: 0
               Buffers: shared read=5
               I/O Timings: read=36.368
         ->  Index Scan using index_ci_job_artifacts_on_job_id_and_file_type on public.ci_job_artifacts  (cost=0.57..21.37 rows=1 width=146) (actual time=0.000..0.000 rows=0 loops=0)
               Index Cond: (ci_job_artifacts.job_id = ci_builds.id)
               Filter: (((ci_job_artifacts.expire_at IS NULL) OR (ci_job_artifacts.expire_at > '2021-04-28 19:06:18.740632+00'::timestamp with time zone)) AND (ci_job_artifacts.file_type = ANY ('{19,26,1,17,9,7,8,6,16,4,10,101,15,12,11,24,25,5,21,22}'::integer[])))
               Rows Removed by Filter: 0
   ->  Index Scan using ci_builds_pkey on public.ci_builds jobs_ci_job_artifacts  (cost=0.58..3.25 rows=1 width=1268) (actual time=0.000..0.000 rows=0 loops=0)
         Index Cond: (jobs_ci_job_artifacts.id = ci_job_artifacts.job_id)
         Filter: ((jobs_ci_job_artifacts.type)::text = 'Ci::Build'::text)
         Rows Removed by Filter: 0

SQL Query

SELECT
    ci_job_artifacts.id AS t0_r0,
    ci_job_artifacts.project_id AS t0_r1,
    ci_job_artifacts.job_id AS t0_r2,
    ci_job_artifacts.file_type AS t0_r3,
    ci_job_artifacts.size AS t0_r4,
    ci_job_artifacts.created_at AS t0_r5,
    ci_job_artifacts.updated_at AS t0_r6,
    ci_job_artifacts.expire_at AS t0_r7,
    ci_job_artifacts.file AS t0_r8,
    ci_job_artifacts.file_store AS t0_r9,
    ci_job_artifacts.file_sha256 AS t0_r10,
    ci_job_artifacts.file_format AS t0_r11,
    ci_job_artifacts.file_location AS t0_r12,
    ci_job_artifacts.id_convert_to_bigint AS t0_r13,
    ci_job_artifacts.job_id_convert_to_bigint AS t0_r14,
    jobs_ci_job_artifacts.id AS t1_r0,
    jobs_ci_job_artifacts.status AS t1_r1,
    jobs_ci_job_artifacts.finished_at AS t1_r2,
    jobs_ci_job_artifacts.trace AS t1_r3,
    jobs_ci_job_artifacts.created_at AS t1_r4,
    jobs_ci_job_artifacts.updated_at AS t1_r5,
    jobs_ci_job_artifacts.started_at AS t1_r6,
    jobs_ci_job_artifacts.runner_id AS t1_r7,
    jobs_ci_job_artifacts.coverage AS t1_r8,
    jobs_ci_job_artifacts.commit_id AS t1_r9,
    jobs_ci_job_artifacts.name AS t1_r10,
    jobs_ci_job_artifacts.options AS t1_r11,
    jobs_ci_job_artifacts.allow_failure AS t1_r12,
    jobs_ci_job_artifacts.stage AS t1_r13,
    jobs_ci_job_artifacts.trigger_request_id AS t1_r14,
    jobs_ci_job_artifacts.stage_idx AS t1_r15,
    jobs_ci_job_artifacts.tag AS t1_r16,
    jobs_ci_job_artifacts.ref AS t1_r17,
    jobs_ci_job_artifacts.user_id AS t1_r18,
    jobs_ci_job_artifacts.type AS t1_r19,
    jobs_ci_job_artifacts.target_url AS t1_r20,
    jobs_ci_job_artifacts.description AS t1_r21,
    jobs_ci_job_artifacts.project_id AS t1_r22,
    jobs_ci_job_artifacts.erased_by_id AS t1_r23,
    jobs_ci_job_artifacts.erased_at AS t1_r24,
    jobs_ci_job_artifacts.artifacts_expire_at AS t1_r25,
    jobs_ci_job_artifacts.environment AS t1_r26,
    jobs_ci_job_artifacts.when AS t1_r27,
    jobs_ci_job_artifacts.yaml_variables AS t1_r28,
    jobs_ci_job_artifacts.queued_at AS t1_r29,
    jobs_ci_job_artifacts.token AS t1_r30,
    jobs_ci_job_artifacts.lock_version AS t1_r31,
    jobs_ci_job_artifacts.coverage_regex AS t1_r32,
    jobs_ci_job_artifacts.auto_canceled_by_id AS t1_r33,
    jobs_ci_job_artifacts.retried AS t1_r34,
    jobs_ci_job_artifacts.stage_id AS t1_r35,
    jobs_ci_job_artifacts.protected AS t1_r36,
    jobs_ci_job_artifacts.failure_reason AS t1_r37,
    jobs_ci_job_artifacts.scheduled_at AS t1_r38,
    jobs_ci_job_artifacts.token_encrypted AS t1_r39,
    jobs_ci_job_artifacts.upstream_pipeline_id AS t1_r40,
    jobs_ci_job_artifacts.resource_group_id AS t1_r41,
    jobs_ci_job_artifacts.waiting_for_resource_at AS t1_r42,
    jobs_ci_job_artifacts.processed AS t1_r43,
    jobs_ci_job_artifacts.scheduling_type AS t1_r44
FROM
    ci_job_artifacts
    INNER JOIN ci_builds ON ci_job_artifacts.job_id = ci_builds.id
    INNER JOIN ci_builds jobs_ci_job_artifacts ON jobs_ci_job_artifacts.id = ci_job_artifacts.job_id
        AND jobs_ci_job_artifacts.type = 'Ci::Build'
    INNER JOIN ci_pipelines ON ci_pipelines.id = jobs_ci_job_artifacts.commit_id
WHERE
    ci_builds.type = 'Ci::Build'
    AND ci_builds.commit_id = 106
    AND (ci_builds.retried = FALSE
        OR ci_builds.retried IS NULL)
    AND ((expire_at IS NULL
            OR expire_at > '2021-04-28 19:06:37.879562')
        OR ci_pipelines.locked = 1)
    AND ci_job_artifacts.file_type IN (19, 26, 1, 17, 9, 7, 8, 6, 16, 4, 10, 101, 15, 12, 11, 24, 25, 5, 21, 22)

Commit 2

Here is the diff between the original query and our new one:

+ (EXISTS (
+   SELECT
+       1
+   FROM
+       ci_pipelines
+   WHERE
+       ci_pipelines.locked = 1
+       AND ci_pipelines.id = ci_builds.commit_id)))

SQL plan for

Nested Loop  (cost=1.85..9124.77 rows=26 width=1411) (actual time=12.369..12.373 rows=0 loops=1)
   Buffers: shared read=5
   I/O Timings: read=12.284
   ->  Nested Loop  (cost=1.28..1480.74 rows=294 width=1273) (actual time=12.368..12.370 rows=0 loops=1)
         Buffers: shared read=5
         I/O Timings: read=12.284
         ->  Index Scan using index_ci_builds_on_commit_id_and_status_and_type on public.ci_builds  (cost=0.70..383.50 rows=305 width=8) (actual time=12.364..12.365 rows=0 loops=1)
               Index Cond: ((ci_builds.commit_id = 107) AND ((ci_builds.type)::text = 'Ci::Build'::text))
               Filter: ((NOT ci_builds.retried) OR (ci_builds.retried IS NULL))
               Rows Removed by Filter: 0
               Buffers: shared read=5
               I/O Timings: read=12.284
         ->  Index Scan using ci_builds_pkey on public.ci_builds jobs_ci_job_artifacts  (cost=0.58..3.60 rows=1 width=1265) (actual time=0.000..0.000 rows=0 loops=0)
               Index Cond: (jobs_ci_job_artifacts.id = ci_builds.id)
               Filter: ((jobs_ci_job_artifacts.type)::text = 'Ci::Build'::text)
               Rows Removed by Filter: 0
   ->  Index Scan using index_ci_job_artifacts_on_job_id_and_file_type on public.ci_job_artifacts  (cost=0.57..18.79 rows=2 width=146) (actual time=0.000..0.000 rows=0 loops=0)
         Index Cond: (ci_job_artifacts.job_id = jobs_ci_job_artifacts.id)
         Filter: (ci_job_artifacts.file_type = ANY ('{19,26,1,17,9,7,8,6,16,4,10,101,15,12,11,24,25,5,21,22}'::integer[]))
         Rows Removed by Filter: 0
   SubPlan 1
     ->  Index Scan using ci_pipelines_pkey on public.ci_pipelines  (cost=0.57..3.59 rows=1 width=0) (actual time=0.000..0.000 rows=0 loops=0)
           Index Cond: (ci_pipelines.id = ci_builds.commit_id)
           Filter: (ci_pipelines.locked = 1)
           Rows Removed by Filter: 0

SQL Query

SELECT
    ci_job_artifacts.id AS t0_r0,
    ci_job_artifacts.project_id AS t0_r1,
    ci_job_artifacts.job_id AS t0_r2,
    ci_job_artifacts.file_type AS t0_r3,
    ci_job_artifacts.size AS t0_r4,
    ci_job_artifacts.created_at AS t0_r5,
    ci_job_artifacts.updated_at AS t0_r6,
    ci_job_artifacts.expire_at AS t0_r7,
    ci_job_artifacts.file AS t0_r8,
    ci_job_artifacts.file_store AS t0_r9,
    ci_job_artifacts.file_sha256 AS t0_r10,
    ci_job_artifacts.file_format AS t0_r11,
    ci_job_artifacts.file_location AS t0_r12,
    ci_job_artifacts.id_convert_to_bigint AS t0_r13,
    ci_job_artifacts.job_id_convert_to_bigint AS t0_r14,
    jobs_ci_job_artifacts.id AS t1_r0,
    jobs_ci_job_artifacts.status AS t1_r1,
    jobs_ci_job_artifacts.finished_at AS t1_r2,
    jobs_ci_job_artifacts.trace AS t1_r3,
    jobs_ci_job_artifacts.created_at AS t1_r4,
    jobs_ci_job_artifacts.updated_at AS t1_r5,
    jobs_ci_job_artifacts.started_at AS t1_r6,
    jobs_ci_job_artifacts.runner_id AS t1_r7,
    jobs_ci_job_artifacts.coverage AS t1_r8,
    jobs_ci_job_artifacts.commit_id AS t1_r9,
    jobs_ci_job_artifacts.name AS t1_r10,
    jobs_ci_job_artifacts.options AS t1_r11,
    jobs_ci_job_artifacts.allow_failure AS t1_r12,
    jobs_ci_job_artifacts.stage AS t1_r13,
    jobs_ci_job_artifacts.trigger_request_id AS t1_r14,
    jobs_ci_job_artifacts.stage_idx AS t1_r15,
    jobs_ci_job_artifacts.tag AS t1_r16,
    jobs_ci_job_artifacts.ref AS t1_r17,
    jobs_ci_job_artifacts.user_id AS t1_r18,
    jobs_ci_job_artifacts.type AS t1_r19,
    jobs_ci_job_artifacts.target_url AS t1_r20,
    jobs_ci_job_artifacts.description AS t1_r21,
    jobs_ci_job_artifacts.project_id AS t1_r22,
    jobs_ci_job_artifacts.erased_by_id AS t1_r23,
    jobs_ci_job_artifacts.erased_at AS t1_r24,
    jobs_ci_job_artifacts.artifacts_expire_at AS t1_r25,
    jobs_ci_job_artifacts.environment AS t1_r26,
    jobs_ci_job_artifacts.when AS t1_r27,
    jobs_ci_job_artifacts.yaml_variables AS t1_r28,
    jobs_ci_job_artifacts.queued_at AS t1_r29,
    jobs_ci_job_artifacts.token AS t1_r30,
    jobs_ci_job_artifacts.lock_version AS t1_r31,
    jobs_ci_job_artifacts.coverage_regex AS t1_r32,
    jobs_ci_job_artifacts.auto_canceled_by_id AS t1_r33,
    jobs_ci_job_artifacts.retried AS t1_r34,
    jobs_ci_job_artifacts.stage_id AS t1_r35,
    jobs_ci_job_artifacts.protected AS t1_r36,
    jobs_ci_job_artifacts.failure_reason AS t1_r37,
    jobs_ci_job_artifacts.scheduled_at AS t1_r38,
    jobs_ci_job_artifacts.token_encrypted AS t1_r39,
    jobs_ci_job_artifacts.upstream_pipeline_id AS t1_r40,
    jobs_ci_job_artifacts.resource_group_id AS t1_r41,
    jobs_ci_job_artifacts.waiting_for_resource_at AS t1_r42,
    jobs_ci_job_artifacts.processed AS t1_r43,
    jobs_ci_job_artifacts.scheduling_type AS t1_r44
FROM
    ci_job_artifacts
    INNER JOIN ci_builds ON ci_job_artifacts.job_id = ci_builds.id
    INNER JOIN ci_builds jobs_ci_job_artifacts ON jobs_ci_job_artifacts.id = ci_job_artifacts.job_id
        AND jobs_ci_job_artifacts.type = 'Ci::Build'
WHERE
    ci_builds.type = 'Ci::Build'
    AND ci_builds.commit_id = 107
    AND (ci_builds.retried = FALSE
        OR ci_builds.retried IS NULL)
    AND ((expire_at IS NULL
            OR expire_at > '2021-05-05 13:53:28.893377')
        OR (EXISTS (
                SELECT
                    1
                FROM
                    ci_pipelines
                WHERE
                    ci_pipelines.locked = 1
                    AND ci_pipelines.id = ci_builds.commit_id)))
    AND ci_job_artifacts.file_type IN (19, 26, 1, 17, 9, 7, 8, 6, 16, 4, 10, 101, 15, 12, 11, 24, 25, 5, 21, 22)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Max Orefice

Merge request reports