
Review apps hitting deployment limit

Incident description

On May 22, the pages_deploy job in the Handbook and internal Handbook projects began failing with the error "Namespace reached its allowed limit of 500 extra deployments". As a result, all MR pipelines on these projects failed.

The incident was related to the use of the experimental Pages Multiple deployments feature. With this feature, the projects were configured so that every MR gets a separate "versioned" (aka "prefixed") Pages deployment that functions as a review app.

Cause

The number of versioned/prefixed Pages deployments is limited to 500 per namespace, which in this case is the gitlab-com/content-sites group.

The investigation found that the Pages deployments of MRs were not deleted when the MR was closed or merged, so they accumulated until the limit was reached.

Further investigation by @janis determined that the root cause was the call to Pages::DeactivateMrDeploymentsWorker in the MR model's base_service, which passed the MR instance instead of the MR ID that the worker expected.
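
To illustrate this class of bug, here is a minimal Python sketch. It is not GitLab's code: the names MergeRequest, Deployment and deactivate_mr_deployments are hypothetical, and the sketch assumes the worker matches deployments by comparing against an integer ID. The issue does not describe how the real worker behaved when it received the instance; in this sketch the comparison simply never matches, so nothing is deactivated and no error is raised.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MergeRequest:
    # Hypothetical stand-in for an MR record.
    id: int
    state: str = "merged"


@dataclass
class Deployment:
    # Hypothetical stand-in for a versioned Pages deployment.
    mr_id: int
    active: bool = True


def deactivate_mr_deployments(mr_id, deployments):
    """Hypothetical worker body: deactivate deployments for the given MR ID.

    Returns the number of deployments that were deactivated.
    """
    count = 0
    for deployment in deployments:
        # An int never equals a MergeRequest instance, so passing the
        # instance instead of its ID makes this condition always false.
        if deployment.mr_id == mr_id and deployment.active:
            deployment.active = False
            count += 1
    return count


deployments = [Deployment(mr_id=1), Deployment(mr_id=2)]
mr = MergeRequest(id=1)

# Buggy call: the whole instance is passed, nothing matches.
buggy_count = deactivate_mr_deployments(mr, deployments)

# Fixed call: only the ID is passed, the deployment for MR 1 is deactivated.
fixed_count = deactivate_mr_deployments(mr.id, deployments)

print(buggy_count, fixed_count)
```

Because the buggy call fails silently in this sketch, the leftover deployments only become visible once a limit such as the one described above is hit.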

Resolution

@janis created gitlab-org/gitlab!153965 (merged) to fix the underlying issue.

Deleting the excess Pages deployments whose MRs were already merged required a one-time manual intervention, since the worker would not catch already-closed MRs. To facilitate this, @janis prioritised an already planned MR that adds a mutation allowing users to delete Pages deployments: gitlab-org/gitlab!153981 (merged). @leetickett-gitlab subsequently used this mutation to delete the orphaned Pages deployments.
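
The selection step of such a one-time cleanup could be sketched as follows. This is a hypothetical illustration, not the procedure that was actually used: the record shape (dicts with id, mr_id and active keys), the mr_states mapping and the find_orphaned_deployments helper are all assumptions, and the actual deletion through the new mutation is deliberately left out.

```python
def find_orphaned_deployments(deployments, mr_states):
    """Return active deployments whose MR is already merged or closed.

    deployments: list of dicts with "id", "mr_id" and "active" keys (assumed shape).
    mr_states: dict mapping MR ID to its state string (assumed shape).
    """
    finished = {"merged", "closed"}
    return [
        d for d in deployments
        if d["active"] and mr_states.get(d["mr_id"]) in finished
    ]


deployments = [
    {"id": 10, "mr_id": 1, "active": True},
    {"id": 11, "mr_id": 2, "active": True},
    {"id": 12, "mr_id": 3, "active": True},
    {"id": 13, "mr_id": 4, "active": False},
]
mr_states = {1: "merged", 2: "opened", 3: "closed", 4: "merged"}

orphaned = find_orphaned_deployments(deployments, mr_states)
orphaned_ids = [d["id"] for d in orphaned]
print(orphaned_ids)
```

Deployments that belong to open MRs are left alone, since they still serve as review apps.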

Edited by Janis Altherr