Expiration policy dry-run and forced run
Problem to solve
As an Automation Engineer, when defining the Expiration Policy for the Docker registry, I'd like to test my configuration.
I am not sure which persona this belongs to; I think it falls between the Rachel (Release Manager) and Devon (DevOps Engineer) personas. In general, it's for people who manage Docker images and automate CI/CD pipelines.
User experience goal
The user should be able to test and verify the expiration policy settings without waiting for another tag expiration loop.
I have several proposals here:
in CI/CD settings for a given project, under Container Registry tag expiration policy there could be a button, called "Test expiration policy", with a tooltip "Check which tags would be removed and why", which would run the selection algorithm, and list:
- all images that would be removed,
- (optionally) images that would not be removed due to minimum age,
- (optionally) images that would not be removed due to the number of retained tags,
- (optionally) images that would not be removed because they match the preserved tags regex.
The results could be shown either as a popup or as a list under the "Test expiration policy" button.
Because the list can be quite long and computing it can take some time, the number of results should be limited.
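To make the first proposal more concrete, here is a minimal sketch of what such a dry run could compute. It is not GitLab's actual implementation: the `Tag` class, the `dry_run` function, the bucket names, and the order in which the rules are applied are all assumptions made for illustration. Only the four criteria (removal regex, preserved tags regex, number of retained tags, minimum age) come from the policy settings described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
import re


@dataclass
class Tag:  # hypothetical stand-in for a registry tag
    name: str
    created_at: datetime


def dry_run(tags, remove_regex, keep_regex, keep_n, min_age, now, limit=100):
    """Classify tags without deleting anything (assumed rule order)."""
    report = {"removed": [], "preserved_regex": [], "retained_count": [], "too_young": []}
    candidates = []
    for tag in tags:
        if not re.fullmatch(remove_regex, tag.name):
            continue  # not selected by the removal regex, so out of scope
        if keep_regex and re.fullmatch(keep_regex, tag.name):
            report["preserved_regex"].append(tag.name)
        else:
            candidates.append(tag)
    # Newest first: the keep_n most recent candidates are retained.
    candidates.sort(key=lambda t: t.created_at, reverse=True)
    for index, tag in enumerate(candidates):
        if index < keep_n:
            report["retained_count"].append(tag.name)
        elif now - tag.created_at < min_age:
            report["too_young"].append(tag.name)
        else:
            report["removed"].append(tag.name)
    # Cap every bucket so the response stays small for large projects.
    return {reason: names[:limit] for reason, names in report.items()}


now = datetime(2020, 6, 1)
tags = [
    Tag("latest", now - timedelta(days=1)),
    Tag("v1.0", now - timedelta(days=90)),
    Tag("dev-a", now - timedelta(days=40)),
    Tag("dev-b", now - timedelta(days=30)),
    Tag("dev-c", now - timedelta(days=2)),
    Tag("dev-d", now - timedelta(days=3)),
]
report = dry_run(
    tags,
    remove_regex=r".*",
    keep_regex=r"latest|v\d+\.\d+",
    keep_n=1,
    min_age=timedelta(days=7),
    now=now,
)
for reason, names in report.items():
    print(reason, names)
```

In this example `dev-a` and `dev-b` land in the "removed" bucket, while each of the other tags is listed under the rule that protects it, which is exactly the "which tags and why" answer the button would give.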
An alternative approach to testing the expiration policy is to expose an input within the Container Registry tag expiration policy settings that would allow you to test an explicit tag name against the policy. After typing in the tag name and pressing the "Test expiration policy" button, the user should see what will happen to the given tag on the next cleanup run and why (e.g. "Removed due to matching regex XYZ", "Not removed because too young", "Not removed because it matches preserved tags", "Not removed because it does not match the removal regex").
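A rough sketch of the single-tag check could look like the following. The function name `explain_tag`, its parameters, and the wording of the messages are hypothetical, and the check assumes that patterns are matched against the whole tag name. The "number of retained tags" rule is left out on purpose, because it depends on the other tags in the repository and not only on the name being tested.

```python
import re
from datetime import timedelta


def explain_tag(name, age, remove_regex, keep_regex, min_age):
    """Return (will_be_removed, reason) for one tag; the rules are assumptions."""
    if not re.fullmatch(remove_regex, name):
        return False, f"Not removed: does not match removal regex {remove_regex!r}"
    if keep_regex and re.fullmatch(keep_regex, name):
        return False, f"Not removed: matches preserved tags regex {keep_regex!r}"
    if age < min_age:
        return False, f"Not removed: too young ({age.days} days, minimum {min_age.days})"
    return True, f"Removed: matches removal regex {remove_regex!r}"


# Hypothetical policy values, used only for this example.
policy = dict(remove_regex=r"dev-.*", keep_regex=r"dev-stable", min_age=timedelta(days=7))

for tag_name, tag_age in [
    ("dev-123", timedelta(days=30)),
    ("dev-stable", timedelta(days=30)),
    ("dev-124", timedelta(days=1)),
    ("master", timedelta(days=30)),
]:
    print(tag_name, explain_tag(tag_name, tag_age, **policy))
```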
Yet another approach would be to expose status "tag" or "pill" or "label" (whatever you call it) in the Container registry UI, next to the image tag name. It would say the same as in the previous point. This might be a performance killer due to being computed on each refresh.
In the Container registry UI, in the "Retention policy has been Enabled" alert, there could be a "run now" button which would ask (in modal) if the user understands that they'll run tag removal, and after confirmation would run the expiration algorithm.
In the Container registry UI, in the "Retention policy has been Enabled" alert, there could also be a "show me which tags will be removed" link which would either show the same results (and in a similar fashion as "Test expiration policy" button from the first proposal) or just take you to the CI/CD settings, to the "Test expiration policy" button.
I just ended a long migration process from GL 11.11.0 to 13.0.X on a private instance and due to our registry growing in size, wanted to start enabling expiration policies.
Unfortunately, some projects have many tags (yes, I am aware of potential performance issues for projects with many tags) and some of those tags are quite important, as they serve as production images in our Kubernetes cluster. I'd like to be sure that expiration policies work the way I expect, to avoid data loss.
I configured those settings for a smaller project first, waited a day (which alone is not perfect) and found out that no tags were removed. I have no idea why (probably messed up some settings), but now, once I changed the expiration policy settings, I have to wait another day, again. And probably will have to do that again after. I hate to wait for optimization! It would be cool if I could just know what will be removed, and fire that process to check if everything is as expected and what is the actual performance impact.
For me, personally, it's especially important as the documentation doesn't explain everything in terms of those settings. See #220500 (closed) - I have no idea if I should define multiple images as separate lines, or as regex ors (|), or in any other way. And I presume that's where I got something wrong.
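For reference, this is how the two variants behave in a generic regex engine, using Python's `re` module as a stand-in. Whether the GitLab settings form treats separate lines as separate patterns is exactly the open question here, and GitLab's regex engine may differ from Python's, so this only illustrates the difference between alternation and a newline inside a single pattern. The tag names and patterns are made up.

```python
import re

# One pattern using alternation ("regex ors").
alternation = r"master|release-.*|v\d+\.\d+\.\d+"

# The same alternatives on separate lines, passed as ONE pattern.
# A generic engine reads the newline as a literal character.
multiline = "master\nrelease-.*"


def whole_name_match(pattern, name):
    """True if the pattern matches the entire tag name."""
    return re.fullmatch(pattern, name) is not None


for name in ["master", "release-1.2", "v1.2.3", "master-old", "v1.2"]:
    print(name, whole_name_match(alternation, name), whole_name_match(multiline, name))
```

With whole-name matching, the alternation pattern matches `master`, `release-1.2`, and `v1.2.3`, while the multi-line pattern matches none of the sample names.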
Also sorry for making this issue so messy - I think it ended up being several proposals to resolve one issue. Let me know if I should clean it up and how.
Permissions and Security
I'd lock it under the same permissions as those that are required to set expiration policy in CI/CD.
What does success look like, and how can we measure that?
If I can fail, test and reiterate expiration policy settings in a span of 5 minutes, then it's a success for me.
What is the type of buyer?
As the expiration policy falls within the free tier, the proposed feature should fall into the same tier.