Lazily load pipeline artifact in project pipelines index page

Problem

Spin-off from a performance analytics in https://gitlab.com/gitlab-org/gitlab/-/issues/321788#note_520871433.

Recently, when a user accesses a project pipelines index page, for example gitlab-org/gitlab, all the artifacts of all builds of the pipelines in the page are eagerly loaded. In the linked analytics, the pipelines page on staging the following table demonstrates the number of artifacts per pipeline of the first 10 pages.

[29, 0, 211, 196, 196, 213, 196, 196, 196, 196, 196, 195, 197, 195, 196, 213, 196, 0, 197, 213]
[213, 0, 195, 194, 0, 96, 189, 195, 194, 21, 209, 211, 195, 196, 194, 211, 212, 0, 33, 196]
[196, 211, 211, 208, 210, 0, 0, 208, 194, 189, 106, 21, 196, 101, 21, 0, 191, 0, 194, 213]
[0, 106, 213, 213, 211, 194, 196, 0, 0, 213, 196, 0, 196, 0, 196, 106, 196, 196, 197, 0]
[210, 197, 192, 197, 0, 213, 196, 0, 196, 196, 210, 196, 196, 194, 194, 194, 21, 197, 21, 196]
[196, 0, 196, 196, 213, 210, 211, 210, 208, 210, 196, 210, 209, 208, 196, 197, 196, 197, 213, 209]
[211, 195, 197, 33, 196, 196, 196, 196, 196, 196, 194, 194, 213, 196, 194, 196, 106, 0, 196, 0]
[196, 196, 0, 211, 197, 211, 194, 197, 212, 197, 196, 197, 196, 197, 175, 190, 173, 174, 176, 189]
[190, 176, 174, 0, 176, 175, 192, 175, 176, 189, 176, 0, 174, 173, 173, 174, 0, 175, 175, 175]
[0, 189, 192, 189, 0, 175, 173, 162, 176, 175, 175, 175, 176, 189, 192, 176, 184, 175, 192, 187]

The preloaded artifacts per page vary from 2629 to 3881 and even more in a crowded page. This preloading generates a query like this:

SELECT "ci_job_artifacts"."id" AS t0_r0, "ci_job_artifacts"."project_id" AS t0_r1, "ci_job_artifacts"."job_id" AS t0_r2, "ci_job_artifacts"."file_type" AS t0_r3, "ci_job_artifacts"."size" AS t0_r4, "ci_job_artifacts"."created_at" AS t0_r5, "ci_job_artifacts"."updated_at" AS t0_r6, "ci_job_artifacts"."expire_at" AS t0_r7, "ci_job_artifacts"."file" AS t0_r8, "ci_job_artifacts"."file_store" AS t0_r9, "ci_job_artifacts"."file_sha256" AS t0_r10, "ci_job_artifacts"."file_format" AS t0_r11, "ci_job_artifacts"."file_location" AS t0_r12, "ci_builds"."id" AS t1_r0, "ci_builds"."status" AS t1_r1, "ci_builds"."finished_at" AS t1_r2, "ci_builds"."trace" AS t1_r3, "ci_builds"."created_at" AS t1_r4, "ci_builds"."updated_at" AS t1_r5, "ci_builds"."started_at" AS t1_r6, "ci_builds"."runner_id" AS t1_r7, "ci_builds"."coverage" AS t1_r8, "ci_builds"."commit_id" AS t1_r9, "ci_builds"."name" AS t1_r10, "ci_builds"."options" AS t1_r11, "ci_builds"."allow_failure" AS t1_r12, "ci_builds"."stage" AS t1_r13, "ci_builds"."trigger_request_id" AS t1_r14, "ci_builds"."stage_idx" AS t1_r15, "ci_builds"."tag" AS t1_r16, "ci_builds"."ref" AS t1_r17, "ci_builds"."user_id" AS t1_r18, "ci_builds"."type" AS t1_r19, "ci_builds"."target_url" AS t1_r20, "ci_builds"."description" AS t1_r21, "ci_builds"."project_id" AS t1_r22, "ci_builds"."erased_by_id" AS t1_r23, "ci_builds"."erased_at" AS t1_r24, "ci_builds"."artifacts_expire_at" AS t1_r25, "ci_builds"."environment" AS t1_r26, "ci_builds"."when" AS t1_r27, "ci_builds"."yaml_variables" AS t1_r28, "ci_builds"."queued_at" AS t1_r29, "ci_builds"."token" AS t1_r30, "ci_builds"."lock_version" AS t1_r31, "ci_builds"."coverage_regex" AS t1_r32, "ci_builds"."auto_canceled_by_id" AS t1_r33, "ci_builds"."retried" AS t1_r34, "ci_builds"."stage_id" AS t1_r35, "ci_builds"."protected" AS t1_r36, "ci_builds"."failure_reason" AS t1_r37, "ci_builds"."scheduled_at" AS t1_r38, "ci_builds"."token_encrypted" AS t1_r39, "ci_builds"."upstream_pipeline_id" AS t1_r40, "ci_builds"."resource_group_id" AS t1_r41, "ci_builds"."waiting_for_resource_at" AS t1_r42, "ci_builds"."processed" AS t1_r43, "ci_builds"."scheduling_type" AS t1_r44 
FROM "ci_job_artifacts" INNER JOIN "ci_builds" ON "ci_builds"."id" = "ci_job_artifacts"."job_id" AND "ci_builds"."type" = 'Ci::Build' 
WHERE 
  (expire_at IS NULL OR expire_at > '2021-03-03 07:58:33.027404') AND 
  "ci_job_artifacts"."file_type" IN (19, 26, 1, 17, 9, 7, 8, 6, 16, 4, 10, 101, 15, 12, 11, 24, 25, 5, 21, 22) AND 
  "ci_job_artifacts"."job_id" IN (38622859, 38622842, 38622949, 38622858, 38622836, 38622837, 38622956, 38622924, 38622864, 38622863, 38622866, 38622865, 38622839, 38622840, 38622835, 38622860, 38622950, 38622846, 38622955, 38622848, 38622849, 38622850, 38622851, 38622852, 38622847, 38622867, 38622954, 38622855, 38622854, 38622861, 38622856, 38622857, 38622841, 38622843, 38622892, 38622893, 38622894, 38622895, 38622896, 38622897, 38622898, 38622899, 38622900, 38622909, 38622910, 38622911, 38622912, 38622913, 38622914, 38622915, 38622916, 38622917, 38622918, 38622901, 38622919, 38622920, 38622921, 38622922, 38622923, 38622902, 38622903, 38622904, 38622905, 38622906, 38622907, 38622908, 38622872, 38622881, 38622882, 38622883, 38622884, 38622885, 38622886, 38622887, 38622888, 38622889, 38622890, 38622873, 38622891, 38622874, 38622875, 38622876, 38622877, 38622878, 38622879, 38622880, 38622844, 38622845, 38622935, 38622936, 38622937, 38622938, 38622947, 38622939, 38622940, 38622941, 38622942, 38622943, 38622944, 38622948, 38622925, 38622934, 38622926, 38622927, 38622928, 38622929, 38622930, 38622931, 38622932, 38622933, 38622945, 38622946, 38622952, 38622953, 38622862, 38622838, 38622868, 38622869, 38622870, 38622871, 38622951, 38622853, 38622736, 38622719, 38622827, 38622735, 38622713, 38622714, 38622834, 38622802, 38622741, 38622740, 38622743, 38622742, 38622716, 38622717, 38622712, 38622737, 38622828, 38622723, 38622833, 38622725, 38622726, 38622727, 38622728, 38622729, 38622724, 38622744, 38622832, 38622732, 38622731, 38622738, 38622733, 38622734, 38622718, 38622801, 38622720, 38622769, 38622770, 38622771, 38622772, 38622773, 38622774, 38622775, 38622776, 38622777, 38622786, 38622787, 38622788, 38622789, 38622790, 38622791, 38622792, 38622793, 38622794, 38622795, 38622778, (thoundsands more...)

The raw SQL query response is cached by active record, then the artifacts are serialized. Ruby processes allocate around 100MB to 150MB, exclusively to handle the artifacts. This also increases the CPU time spent on the request. Projects::PipelinesController#index appears frequently in the longest CPU-consuming endpoints

Proposal

I would like to suggest we load the artifact list lazily when the download button is clicked instead.

From my personal perspective, I barely click that download button. When I do, I care about a test failure in a particular pipeline in the page. Other pipelines are out of my radar. So, I think it's reasonable to wait for a fraction of seconds when the artifacts are loading when I click on the button. I believe that other users may share the same workflow, but don't have any metrics to back this claim up.

One more point to back this up is that the artifacts will be added gradually when a test finishes. If a user accesses the page when some tests are still running, they have to reload the page to get the full list artifacts. Lazily loading the artifacts can make sure the artifact list always up-to-date, in-sync with the build status.

We already defer loading the build status to after clicking on the stage icon. So I think it's safe to replicate that behavior.

Edited Mar 29, 2021 by Thao Yeager