Investigation: What are the latest artifacts and how are they retained or expired?

Summary

We have a number of issues where customers report that artifacts are not being cleaned up as they expect. It's unclear from our documentation how we determine:

  1. the latest artifacts. Per our docs:

     Pipeline artifacts from:

       • The latest pipeline are kept forever.
       • Pipelines superseded by a newer pipeline are deleted seven days after their creation date.

     However, it is unclear how we define "latest".

  2. what we retain permanently vs. what we expire and allow to be deleted (see Storage Leak - Artifacts get *NEVER* deleted on... (#326174), which shows an example where we explicitly state that artifacts cannot be removed).
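
To make the open question concrete, here is a minimal sketch of the rule as the docs quoted above state it. It is not how GitLab implements retention. The `Pipeline` class and both function names are hypothetical, and the key assumption, that "latest" means the newest pipeline per ref, is exactly what this investigation needs to confirm or correct.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Seven-day window taken from the quoted docs.
SUPERSEDED_RETENTION = timedelta(days=7)


@dataclass(frozen=True)
class Pipeline:
    """Hypothetical stand-in for a pipeline record; not GitLab's model."""
    id: int
    ref: str
    created_at: datetime


def latest_per_ref(pipelines):
    """ASSUMPTION: the newest pipeline on each ref counts as 'latest'."""
    latest = {}
    for p in pipelines:
        current = latest.get(p.ref)
        if current is None or p.created_at > current.created_at:
            latest[p.ref] = p
    return latest


def expirable(pipelines, now):
    """Return ids of pipelines whose artifacts the quoted rule would delete."""
    latest_ids = {p.id for p in latest_per_ref(pipelines).values()}
    return sorted(
        p.id
        for p in pipelines
        if p.id not in latest_ids and now - p.created_at >= SUPERSEDED_RETENTION
    )


now = datetime(2022, 12, 15)
pipelines = [
    Pipeline(1, "main", datetime(2022, 12, 1)),        # superseded, older than 7 days
    Pipeline(2, "main", datetime(2022, 12, 10)),       # superseded, younger than 7 days
    Pipeline(3, "main", datetime(2022, 12, 14)),       # newest on main
    Pipeline(4, "feature-a", datetime(2022, 11, 1)),   # newest on feature-a, though old
]
print(expirable(pipelines, now))
```

Under this assumed definition, pipeline 4 is never eligible for deletion no matter how old it gets, because nothing on its ref supersedes it. If refs accumulate (tags, scheduled pipelines, per-MR refs), the same behavior would repeat for each of them, which is why the questions below focus on what counts as a distinct ref.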

Investigation Goals

  • We should be able to tell customers explicitly how we define the term "latest pipeline"
  • Create follow-up issues (i.e. problem/solution validation) to determine whether our definition of "latest pipeline" enables the best customer outcomes based on real use cases.

Answer the following questions:

  • Does each uniquely tagged pipeline get a unique "latest pipeline"?
  • Are all scheduled pipelines considered "latest pipelines"?
  • What attributes of a pipeline make it a "latest pipeline" (ref, tag, triggerer, etc.)?
  • Are users able to delete artifacts of a latest pipeline manually (via the API, artifacts management, etc)?
  • Does a merged results pipeline create a new ref each time it runs? Given an MR from the feature-a branch to the main branch, does it mean:
    • Each merged results pipeline on the MR has its own ref, e.g. feature-a-to-main-0, feature-a-to-main-1, feature-a-to-main-2, and so on?
    • Or there is a single ref for the MR, and each merged results pipeline on the MR runs on the ref feature-a-to-main?
    • Or is something else happening?
  • When and how are merged results pipeline refs deleted afterwards? After the pipeline completes, or after the MR is merged?
  • When a branch is deleted, when and how are the job artifacts from pipelines on this branch deleted?
  • When a tag is deleted, when and how are the job artifacts from pipelines on this tag deleted?

Related Issues

@jocelynjane to continue to populate this list.

  • Job artifacts from previous pipelines are not unlocked when pipeline fails. #266958 (comment 1200913669)

Related code

  • Ci::Ref state transitions
  • Ci::PipelineSuccessUnlockArtifactsWorker
  • Ci::UnlockArtifactsService
Edited Dec 15, 2022 by Albert