Archive old deployments (Delete old deployment refs)
Release notes
We have added automated archiving of old deployments to maintain high performance for git commands and to clean up project records. GitLab keeps the most recent 50,000 deployments per project and archives the rest. Specifically, the git-ref associated with each archived deployment is deleted.
Problem
When a deployment job runs (i.e. a job with the `environment` keyword), a corresponding deployment entry is created, which in turn creates a deployment ref in the format `refs/remotes/origin/environments/*`. Since these git-refs are advertised for git-checkout, a large number of them often causes performance issues, especially slowing down git commands on gitlab.com. This happens not only on www-gitlab-com, but also on a customer project.
While we're preparing an automated mechanism to clean up non-critical environments, the problem remains for critical environments. For example, &5920 (comment 633731178) states that 446,000+ deployment refs exist on the production environment. We need further cleanup automation for old deployments on critical environments, without affecting the traceability of the deployment history.
Proposal
We keep the most recent 50,000 deployments per project and archive the rest. Specifically:
- We add an `archived` boolean flag to the `deployments` table. Default is `false`.
- We set `archived=true` on deployment records created before the most recent 50,000 deployments.
- Archiving a deployment deletes its corresponding git-ref, e.g. `refs/remotes/origin/environments/*`.
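The steps above can be sketched in plain Ruby. This is a minimal in-memory illustration, not GitLab's implementation: `DeploymentRecord`, `archive_old_deployments`, the `example-N` ref names, and the use of a `Set` as a stand-in for the repository's refs are all hypothetical, and "most recent" is assumed to mean "highest id".

```ruby
require "set"

# Hypothetical in-memory stand-in for a row in the deployments table.
DeploymentRecord = Struct.new(:id, :ref, :archived)

# Hypothetical helper (not GitLab's code): flags every deployment older than
# the newest `keep` as archived and removes its ref from `refs`, a Set that
# stands in for the repository's git-refs. Returns the deployments that fall
# outside the newest `keep`.
def archive_old_deployments(deployments, refs, keep: 50_000)
  sorted = deployments.sort_by(&:id)               # oldest first (assumes id order = age)
  old = sorted.first([sorted.size - keep, 0].max)  # everything before the newest `keep`
  old.each do |deployment|
    next if deployment.archived                    # already archived, nothing to do
    deployment.archived = true
    refs.delete(deployment.ref)
  end
  old
end

# Tiny example: five deployments, keep the newest two.
deployments = (1..5).map do |id|
  DeploymentRecord.new(id, "refs/remotes/origin/environments/example-#{id}", false)
end
refs = Set.new(deployments.map(&:ref))
archived = archive_old_deployments(deployments, refs, keep: 2)
```

With `keep: 2`, the three oldest records are flagged and only the refs of the two newest deployments remain in `refs`. The database records themselves are untouched, which is what keeps the deployment history traceable.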
NOTE:
- Users can still fetch the deployed source code with `git checkout <commit-sha>`, even after archiving.
- GitLab can restore a deleted deployment ref by running `deployment.create_ref`. If this change has an unexpected negative consequence, we can restore the refs by running `Deployment.archived.each(&:create_ref)`.
- Consider this a feature similar to Archive Jobs.
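The restore path in the note above can be illustrated the same way. This is only a sketch under stated assumptions: `FakeDeployment` is a hypothetical stand-in for the real model, its `create_ref` merely adds a ref name back to an in-memory `Set`, and the `example-N` ref names are made up.

```ruby
require "set"

# Hypothetical stand-in for the Deployment model. The real `create_ref`
# writes a git-ref; this one only records the ref name in a Set.
class FakeDeployment
  attr_reader :ref, :archived

  def initialize(ref, archived, repository_refs)
    @ref = ref
    @archived = archived
    @repository_refs = repository_refs
  end

  def create_ref
    @repository_refs.add(@ref)
  end
end

# Only the newest deployment still has its ref; the two archived ones do not.
repository_refs = Set.new(["refs/remotes/origin/environments/example-3"])
all_deployments = [
  FakeDeployment.new("refs/remotes/origin/environments/example-1", true, repository_refs),
  FakeDeployment.new("refs/remotes/origin/environments/example-2", true, repository_refs),
  FakeDeployment.new("refs/remotes/origin/environments/example-3", false, repository_refs)
]

# Mirrors Deployment.archived.each(&:create_ref) from the note.
all_deployments.select(&:archived).each(&:create_ref)
```

Because archiving removes only the ref and leaves the record in place, every archived record still carries what is needed to recreate its ref.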
The difference from Delete stopped environments proposal
| Aspect | This issue | Delete stopped environments proposal |
|---|---|---|
| Deployments to act on | All deployments older than the most recent 50,000 deployments | Deployments in auto-stopped environments older than 1 month |
| Environments to act on | Long-lived environments (e.g. production) and short-lived environments (e.g. Review App) | Short-lived environments (e.g. Review App) |
| What to remove | Git Ref | Git Ref and Database Record |
| Recoverable | Yes (See note above) | No |
Questions
Do we know how many deployment refs per project it takes before we start noticing degraded performance in git operations?
It depends on how scalable the GitLab instance is (e.g. whether HA is enabled). Here are some data points:
- A project on gitlab.com encountered this issue with 400k deployments.
- A project on a customer's on-premises instance encountered this issue with 80k deployments.
What happens if a deployment is marked archived?
When a deployment is archived, the corresponding git-ref is deleted. Nothing else changes.
Does removing the ref cause any unexpected user interface issues?
One potential use case for deployment refs is deploying from an external server. For example:
- A user runs a pipeline, which creates a deployment record.
- A deployment webhook is triggered to notify an external system.
- The external system fetches the deployment information via the API and checks out a specific deployment ref.
So as long as the recent deployment refs stay intact, users won't encounter a problem.
If this assumption turns out to be wrong, we can restore the archived refs.
Further iteration
- Consider whether archived deployments should be read-only records, so that we can prevent users from accidentally pushing the retry/rollback buttons.