Keep latest artifacts for the last successful jobs
Problem to Solve
The expire_in keyword is useful, but it can be hard to tune because the right value depends on how often your pipeline runs. If you don't build frequently, a short expire_in can result in your most recent artifact being deleted. Setting it too long keeps unneeded copies around, which over time can use up a lot of disk space. Most of the time, people just want to ensure they always have at least the most recent artifact, whenever that was built.
Proposal
Implement a default behavior for all projects to always keep the latest artifact for each successful job that was part of a successful pipeline (that produced artifacts) on any active (non-deleted) branch, merge request, or tag. In other words, never delete the most recent artifact, even if it has expired: it is exempt from any expire_in policy. This is the same artifact as the one downloaded via https://docs.gitlab.com/ce/user/project/pipelines/job_artifacts.html#downloading-the-latest-artifacts.
Additionally:
- Artifacts for deleted branches follow expire_in
- All non-latest artifacts follow expire_in
- Only the latest artifact for a job on a non-deleted branch is kept
This would allow for setting a relatively aggressive expiration policy, while remaining confident you always have your most recent artifact for any active branches.
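The retention rules above can be sketched as a small selection function. This is only an illustration of the proposed policy, not GitLab's implementation: the Artifact record, the deletable function, and the active_refs set are hypothetical names, and the sketch assumes every artifact passed in came from a successful job in a successful pipeline.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Artifact:
    # Hypothetical record; field names are assumptions for illustration.
    job_name: str
    ref: str
    created_at: datetime
    expire_at: Optional[datetime]  # None means no expire_in was set


def deletable(artifacts, active_refs, now):
    """Return the artifacts that may be deleted under the proposed policy."""
    # Find the newest artifact per (ref, job) pair, on active refs only.
    latest = {}
    for a in artifacts:
        if a.ref not in active_refs:
            continue  # deleted refs get no protection
        key = (a.ref, a.job_name)
        if key not in latest or a.created_at > latest[key].created_at:
            latest[key] = a
    keep = {id(a) for a in latest.values()}
    # Everything else simply follows its expire_in policy.
    return [
        a for a in artifacts
        if id(a) not in keep and a.expire_at is not None and a.expire_at <= now
    ]
```

With an aggressive expire_in, older artifacts and artifacts on deleted branches become deletable as soon as they expire, while the newest artifact per job on each active ref is never returned.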
Implementation notes
Because expire_in could be set too short, we would need to define a way to "lock" the latest artifacts to prevent deletion by ExpireBuildArtifactsWorker. For example, add a new column job_artifacts.locked (false by default). Then, when a pipeline completes, we would lock the latest artifacts in a moving window: lock the artifacts from the latest pipeline and unlock anything from previous pipelines. Finally, ExpireBuildArtifactsWorker would need to be modified to filter out locked artifacts.
Another aspect to consider is unlocking artifacts when a branch is deleted; otherwise they would remain locked and never be deleted.
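The moving-window lock described in these notes might look roughly like the following in-memory sketch. It is not the actual worker or schema: JobArtifact, lock_latest, unlock_ref, and expired_unlocked are hypothetical names, and only the locked flag corresponds to the proposed job_artifacts.locked column.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class JobArtifact:
    # Hypothetical stand-in for a job_artifacts row.
    pipeline_id: int
    ref: str
    expire_at: datetime
    locked: bool = False  # mirrors the proposed column, false by default


def lock_latest(artifacts, ref, pipeline_id):
    """On pipeline completion: lock its artifacts, unlock older ones on the ref."""
    for a in artifacts:
        if a.ref == ref:
            a.locked = (a.pipeline_id == pipeline_id)


def unlock_ref(artifacts, ref):
    """On branch deletion: release every lock so expire_in applies again."""
    for a in artifacts:
        if a.ref == ref:
            a.locked = False


def expired_unlocked(artifacts, now):
    """What the expiry worker would pick up: expired and not locked."""
    return [a for a in artifacts if not a.locked and a.expire_at <= now]
```

Calling lock_latest once per completed pipeline keeps exactly one pipeline's artifacts locked per ref, and unlock_ref covers the branch-deletion case so nothing stays locked forever.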
Out of Scope for this MVC
Items not expected for this MVC issue:
- ability to keep latest N count of artifacts per job
- new checkbox in project's Settings > CI/CD > General pipelines for Keep latest job artifacts per ref
- allow artifacts to live longer than N days for compliance reasons
- aggressively remove artifacts which are not the latest even before they expire