Quickly resolve issues with your Cleanup policy with improved validation and notifications

Problem to solve

Administrators would like to use the GitLab Cleanup policy for tags at the Project level so that they can programmatically identify which tags should be removed or retained. To do so, they define an interval and a schedule, and use regular expressions to match the tag names to remove or retain.

When a cleanup policy fails because of an invalid regular expression, the administrator needs to be notified so that they can fix the issue as quickly as possible.
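
As an illustration of the validation idea, the following minimal sketch checks a tag-name pattern before it is saved. It uses Python's `re` module purely for illustration; the regex engine GitLab uses may accept a different syntax, and the function name `validate_tag_regex` is hypothetical.

```python
import re
from typing import Optional


def validate_tag_regex(pattern: str) -> Optional[str]:
    """Return None if the pattern compiles, otherwise a readable error message.

    Hypothetical helper: it only shows the idea of rejecting an invalid
    pattern up front instead of letting the cleanup job fail later.
    """
    try:
        re.compile(pattern)
    except re.error as exc:
        return f"Invalid regular expression {pattern!r}: {exc}"
    return None
```

A caller could refuse to save the policy whenever the function returns a message, and show that message next to the offending field.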

Intended users

Proposal

When a Project's Cleanup Policy has failed to run, we will notify the Project's Owner/Admin with a helpful error message via email and the UI.

Email notification

Subject: Cleanup policy has failed for project

Body:

  • Project
  • Policy
  • Error
  • Link to documentation on acceptable regex
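
The email outlined above could be assembled roughly as follows. This is a sketch using Python's standard `email.message.EmailMessage`; the function name, the parameters, the choice to append the project name to the subject, and the example URL are all assumptions, not the final copy.

```python
from email.message import EmailMessage


def build_failure_email(project: str, policy: str, error: str,
                        docs_url: str, recipient: str) -> EmailMessage:
    """Build the failure notification with the four body items listed above.

    Hypothetical helper: the wording is a placeholder until the copy is agreed.
    """
    msg = EmailMessage()
    # Assumption: the project name is appended to the proposed subject line.
    msg["Subject"] = f"Cleanup policy has failed for project {project}"
    msg["To"] = recipient
    msg.set_content(
        f"Project: {project}\n"
        f"Policy: {policy}\n"
        f"Error: {error}\n"
        f"Documentation on acceptable regex: {docs_url}\n"
    )
    return msg
```

For example, `build_failure_email("group/app", "Remove tags matching (broken", "missing )", "https://example.com/docs/regex", "owner@example.com")` returns a message whose body lists the project, policy, error, and documentation link, one per line.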

User experience

When users land on their Project's CI/CD settings page (where the Tag Cleanup Policy is located), the UI shows an alert message and highlights the affected field with a specific error message.

[Image: Add alert to the settings page]

[Image: Add alert on Container Registry homepage]

User experience goal

  • When a policy fails to run due to a regular expression issue, the user is notified that the job failed, why, and how to fix it.

UX Questions

  • Who should see these errors?
    • @icamacho I think we should only show the error to those that have the power to fix it. I believe this is Project Owner/Admin.
  • What copy should we use for the email/UI?

Further details

Technical considerations

The execution error happens in a background worker, which is not connected to the UI. One option is to have the worker save the error message on the container expiration policy, so that the message can be displayed the next time the user visits the UI.
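
The approach described above can be sketched as follows. This is only an illustration in Python: the `CleanupPolicy` class, the `last_error` field, and both function names are hypothetical and do not reflect the actual GitLab models or workers.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CleanupPolicy:
    """Hypothetical stand-in for a stored container expiration policy."""
    name_regex: str
    last_error: Optional[str] = None  # set by the worker, read by the UI


def run_cleanup_worker(policy: CleanupPolicy, tags: List[str]) -> List[str]:
    """Return the tags to remove; on a regex failure, store the error instead."""
    try:
        matcher = re.compile(policy.name_regex)
    except re.error as exc:
        # The worker has no UI, so it persists the message on the policy.
        policy.last_error = f"Invalid regular expression: {exc}"
        return []
    policy.last_error = None  # a successful run clears any earlier error
    return [tag for tag in tags if matcher.fullmatch(tag)]


def settings_page_alert(policy: CleanupPolicy) -> Optional[str]:
    """Return the alert text the settings page would show, if any."""
    if policy.last_error is None:
        return None
    return f"The cleanup policy failed to run. {policy.last_error}"
```

Clearing the stored error on a successful run keeps the alert from lingering after the user has fixed the pattern.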

Permissions and Security

  • No permission changes are required for this change.

Documentation

Availability & Testing

What does success look like, and how can we measure that?

Success means a higher ratio of policy runs that succeed to policy runs that fail, and that Admins can rely on the feature to work for their projects.

Metrics

  • We will measure this by looking at the overall adoption of the feature
  • @10io is it possible to track the number of jobs that succeeded/failed?
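
If job outcomes can be tracked, the success ratio could be computed along these lines. This is a sketch only; the outcome labels `"succeeded"` and `"failed"` and the function name are assumptions about how such data might be recorded.

```python
from collections import Counter
from typing import Iterable, Optional


def success_ratio(outcomes: Iterable[str]) -> Optional[float]:
    """Return the share of succeeded runs among succeeded and failed runs.

    Returns None when no run has been recorded, to avoid dividing by zero.
    """
    counts = Counter(outcomes)
    total = counts["succeeded"] + counts["failed"]
    if total == 0:
        return None
    return counts["succeeded"] / total
```

Tracking this value over time would show whether the notifications actually help users fix failing policies.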

Links / references

Edited by Iain Camacho