Investigate: Allow renaming projects with container registry data
Context
In #18383 (closed), it was originally raised that project AND group updates were not possible if there were images in the container registry associated with these.
This issue is being scheduled to identify a fix and iterative plan to resolve #18383 (closed). Given the wide scope of this issue, we'll start by focusing on project rename operations. Although out of the scope of this issue, project transfer operations should follow the same principle/implementation.
Problem
The inability to rename projects with registry data has been a known limitation of the GitLab Container Registry since its MVC release in %8.8. Right now, attempting to rename a project that has non-empty (at least one tag) container repositories leads to an error.
The current workaround is to delete all tags first; only then can the project be renamed. This is neither user-friendly nor efficient.
Steps to reproduce
- Create a project. We'll assume the project's full path is `my-group/my-project`;
- Tag and push an image to `registry.domain.com/my-group/my-project`;
- Attempt to rename the project. This fails with the message `Cannot rename project because it contains container registry tags!`
Current Implementation
This is the sequence of main actions on the Rails side whenever a user attempts to rename a project with registry data:
1. `ProjectsController#update`
2. `Projects::UpdateService#execute`
3. `Projects::UpdateService#validate!`
4. `Projects::UpdateService#renaming_project_with_container_registry_tags?`. Here Rails loops over all known container repositories and queries the registry (using the OCI tags list operation) for each, in order to determine if any have tags. If so, the update is aborted and the message `Cannot rename project because it contains container registry tags!` is shown to users.
If no registry data was found in (4), the execution would proceed as follows:
5. `Projects::UpdateService#after_update`
6. `Projects::AfterRenameService#execute`
7. `Projects::AfterRenameService#first_ensure_no_registry_tags_are_present`. Here the validation from (4) above is repeated and the rename fails with the message `Project #{full_path_before} cannot be renamed because images are present in its container registry` if there are any tags left.
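To make the shape of this validation concrete, here is a minimal Python sketch of the check described above (the real code is Ruby, in the services listed). The names `has_registry_tags`, `validate_rename` and the `list_tags` callable are hypothetical; `list_tags` stands in for the OCI tags list call against the registry.

```python
from typing import Callable, Iterable, List

RENAME_ERROR = "Cannot rename project because it contains container registry tags!"


def has_registry_tags(
    repository_paths: Iterable[str],
    list_tags: Callable[[str], List[str]],
) -> bool:
    """Return True as soon as any known repository reports at least one tag.

    One registry request is made per repository, which is why the cost of
    this check grows with the number of repositories in the project.
    """
    return any(list_tags(path) for path in repository_paths)


def validate_rename(
    repository_paths: Iterable[str],
    list_tags: Callable[[str], List[str]],
) -> None:
    """Abort the rename (by raising) if any repository still has tags."""
    if has_registry_tags(repository_paths, list_tags):
        raise ValueError(RENAME_ERROR)
```

Note that the check only covers repositories known to Rails, a limitation that matters for the security discussion further down.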
Required Changes
Within this section, we'll assume a sample scenario where a user wants to rename the project `old-name` with full path `my-group/old-name` to `new-name`. Under `my-group/old-name`, we have five container repositories:
- `A`: `my-group/old-name`
- `B`: `my-group/old-name/sub-1`
- `C`: `my-group/old-name/sub-1/sub-1-1`
- `D`: `my-group/old-name/sub-2`
- `E`: `my-group/old-name/sub-2/sub-2-1/sub-2-1-1`
Update Operation
There is currently no support for renaming a repository on the registry side. On its database, a repository has `path` and `name` attributes, where `name` is the last path segment of `path`. For example, for `A`, `name` is `old-name` and `path` is `my-group/old-name`. For `B`, `name` is `sub-1` and `path` is `my-group/old-name/sub-1`.
So, we have to update the `name` and `path` for the base repository (if it exists) - `A`, and the `path` for all sub-repositories (if any) - `B`, `C`, `D` and `E`.
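The attribute changes described above can be sketched as a pure function. This is only an illustration in Python; `plan_rename` is a hypothetical name, and the registry would perform the equivalent work against its database rather than in memory.

```python
from typing import Dict, Iterable, Tuple


def plan_rename(
    paths: Iterable[str], old_base: str, new_name: str
) -> Dict[str, Tuple[str, str]]:
    """Map each affected old path to its (name, path) after the rename.

    Only the base repository changes `name`; sub-repositories keep their
    `name` (last path segment) and only get a new `path` prefix.
    """
    namespace, _, _ = old_base.rpartition("/")
    new_base = f"{namespace}/{new_name}" if namespace else new_name

    plan = {}
    for path in paths:
        if path == old_base:
            new_path = new_base
        elif path.startswith(old_base + "/"):
            # Keep everything after the old base, including the leading "/".
            new_path = new_base + path[len(old_base):]
        else:
            continue  # e.g. "my-group/old-name-2" is a different project
        plan[path] = (new_path.rsplit("/", 1)[-1], new_path)
    return plan


plan = plan_rename(
    ["my-group/old-name", "my-group/old-name/sub-1", "my-group/old-name-2"],
    old_base="my-group/old-name",
    new_name="new-name",
)
# plan["my-group/old-name"]       == ("new-name", "my-group/new-name")
# plan["my-group/old-name/sub-1"] == ("sub-1", "my-group/new-name/sub-1")
```

Matching on `old_base + "/"` rather than on the bare prefix is what keeps sibling projects such as `my-group/old-name-2` out of the rename.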
The most obvious option would probably be looping over the project's container repositories on the Rails side (looking at the `container_repositories` table) and sending an update request to the registry for each. However, besides increased latency, this approach poses two main problems:
- Integrity: If a non-transient failure occurs while updating `C`, we'd be left with a broken state where `A` and `B` have been renamed but not `C`, `D`, and `E`. While we could have some elaborate healing mechanisms, things can go wrong in many ways in between. For example, users could push/pull images to/from the wrong repositories.
- Security: We can't guarantee that an unknown bug won't lead to repositories on the registry side that are unknown to Rails. In fact, we know this happened in the past (#217702 (closed)). For example, imagine that another sub-repository `my-group/old-name/sub-3` exists on the registry but is unknown to Rails. If:
  - We don't update the path of this repository;
  - The project had sensitive data on the registry, so not all `my-group` members had access to it;
  - The project rename from `my-group/old-name` to `my-group/new-name` goes through;
  - A user `X` that had no access to `my-group/old-name` creates a new project, using the same `my-group/old-name` name;
  - User `X` gains access to the registry data that was left behind on `my-group/old-name/sub-3` 💥
This only gets worse if we think about project transfers between groups or even namespaces 🙈
For the reasons above, we'd be better off doing an atomic bulk update on the registry side for the base repository and all (that exist, not just known by Rails) sub-repositories under it.
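As a rough sketch of what "atomic bulk update" means here, the following Python example uses the standard-library `sqlite3` module as a stand-in for the registry database. The `repositories` table layout and the `bulk_rename` helper are assumptions made for illustration only; the actual schema and SQL on the registry side may differ.

```python
import sqlite3
from typing import Iterable, Tuple


def make_db(rows: Iterable[Tuple[str, str]]) -> sqlite3.Connection:
    """Create a throwaway in-memory table with a unique index on path."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE repositories (name TEXT, path TEXT UNIQUE)")
    with conn:  # commit the seed data so a later rollback cannot undo it
        conn.executemany("INSERT INTO repositories VALUES (?, ?)", rows)
    return conn


def bulk_rename(conn: sqlite3.Connection, old_base: str, new_name: str) -> int:
    """Rename the base repository and every sub-repository in one transaction.

    Returns the number of rows updated. Any error (for example a unique
    index violation on path) rolls back all changes.
    """
    namespace, _, _ = old_base.rpartition("/")
    new_base = f"{namespace}/{new_name}" if namespace else new_name
    prefix = old_base + "/"

    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "UPDATE repositories SET name = ?, path = ? WHERE path = ?",
            (new_name, new_base, old_base),
        )
        updated = cur.rowcount
        # Match on the stored paths, not on a list supplied by the caller,
        # so repositories unknown to Rails are renamed as well.
        cur = conn.execute(
            "UPDATE repositories SET path = ? || substr(path, ?) "
            "WHERE substr(path, 1, ?) = ?",
            (new_base, len(old_base) + 1, len(prefix), prefix),
        )
        updated += cur.rowcount
    return updated
```

Because the sub-repositories are selected by path prefix inside the database, a stray `my-group/old-name/sub-3` is moved together with the others, and a failure on any row leaves none of them renamed.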
To make this happen we'll need a new `PATCH /gitlab/v1/repositories/<path>/` API operation. This operation will accept changes to repository attributes, starting with their `name`. This should be called by Rails to update the `name` of a project's base repository. Internally, it should also update the `path` of the base repository and all its sub-repositories (if any).
* Before attempting the update, the operation should start by ensuring that no repository with path `my-group/new-name` OR `my-group/new-name/%` exists. Otherwise, a `409 Conflict` should be returned. `path` has a unique index on the database, so, even if a repository was created between checking if it exists and the actual rename, a failure would occur and cause the transaction to be aborted. Nevertheless, we should avoid empty updates whenever possible (as those drive database bloat upwards), and check before.
For scalability and performance reasons, this feature will start by being limited to projects with no more than 1000 container repositories. For GitLab.com, this covers 99.98% of all projects (source). We can then increase this later based on metrics and pending a decision in https://gitlab.com/gitlab-org/gitlab/-/issues/357014 (internal). Attempting to update more than 1000 repositories (base and sub-repositories) should yield a `422 Unprocessable Entity` response.
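The two rejection cases described above can be summarised in a small decision function. This Python sketch is illustrative only: `precheck_rename` is a hypothetical name, the order in which the two checks run is an assumption, and the success response is deliberately left out because it is not specified here.

```python
from typing import Iterable, Optional

MAX_REPOSITORIES = 1000  # initial limit proposed in this issue


def _under(path: str, base: str) -> bool:
    """True if path is the base repository or one of its sub-repositories."""
    return path == base or path.startswith(base + "/")


def precheck_rename(
    existing_paths: Iterable[str],
    old_base: str,
    new_base: str,
    limit: int = MAX_REPOSITORIES,
) -> Optional[int]:
    """Return the HTTP status to reject with, or None if the rename may proceed."""
    existing = list(existing_paths)

    # 409 Conflict: the target path, or anything under it, is already taken.
    if any(_under(path, new_base) for path in existing):
        return 409

    # 422 Unprocessable Entity: too many repositories (base and sub-repositories).
    affected = sum(1 for path in existing if _under(path, old_base))
    if affected > limit:
        return 422

    return None
```

Even with this check in place, the unique index on `path` remains the final safeguard against a repository being created between the check and the update.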
I don't see the need to perform updates on `container_repositories` on the Rails side because we only record the `name` of repositories (last part of the path segment) and it is set to `""` (empty) for base repositories. The full path of a repository is assembled in real-time based on the project's full path and the repository `name`.
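A tiny Python sketch of that path assembly shows why no Rails-side row needs to change. The function name `repository_full_path` is hypothetical and only mirrors the behaviour described in the paragraph above.

```python
def repository_full_path(project_full_path: str, repository_name: str) -> str:
    """Assemble a repository's full path from the project path and stored name.

    Base repositories are stored with an empty name, so their full path is
    simply the project's full path.
    """
    if repository_name == "":
        return project_full_path
    return f"{project_full_path}/{repository_name}"


# The stored names ("" for A, "sub-1" for B) stay untouched across the rename;
# only the project's full path changes.
before = [repository_full_path("my-group/old-name", n) for n in ("", "sub-1")]
after = [repository_full_path("my-group/new-name", n) for n in ("", "sub-1")]
```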
Pre-Validation
As explained above, the current Rails implementation is split into two main parts: 1) pre-validation to ensure that the rename is technically possible and 2) post-rename changes.
Ideally, the devised solution would guarantee the following:
1. We can check that the target repositories' name/path is not yet taken on the registry prior to performing the rename. This would be part of the Rails pre-validation steps;
2. We can prevent writes to origin and target repositories between the pre-validation steps and the rename completion. This is to avoid race conditions and prevent data consistency/integrity issues.
For (1), we can do so by providing a "dry-run" option on the same `PATCH /gitlab/v1/repositories/<path>/` API operation. Having this validation functionality as a separate operation could mean not doing it before the actual update as initially described in *.
For (2), we could stop providing JWT tokens with write permissions on the Rails side for all known repositories associated with the project being renamed. However, this would not include repositories unknown to Rails, and clients that obtained a token before this lock period started could still use it for writes against the repositories before the token expired (15 minutes for GitLab.com).
A more robust option for (2) means acquiring a special "rename lease" on the registry side before every write operation. In practice this would imply the following:
- When processing the dry-run rename request, the registry would acquire a rename lease for paths that match `^my-group\/(old-name|new-name)($|\/.*$)`. This lease would have a short TTL of `N` seconds after which writes against a matching repository would no longer be blocked. If the dry-run succeeds (the rename is technically possible), Rails would be informed in the response that it had `N` seconds to complete the operation. If the dry-run did not succeed, the lease would be released ahead of time.
- When receiving the non-dry-run rename request, the registry would ensure the lease is still alive and enough time remains before it expires (e.g., it won't expire before the configured DB transaction timeout of 10 seconds). If so, the rename proceeds, otherwise it's halted and Rails should get back to square one. Leases would be released at the end of a request, regardless of the result.
- Across the whole registry API, a write request against a given repository (be that a blob upload, a tag delete, etc.) would only be allowed if a rename lease for the target repository does not exist.
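The lease behaviour outlined in the three points above could look roughly like the following in-memory Python sketch. Everything here (`LeaseStore`, `RenameLease`, the method names, the injectable clock) is hypothetical; a real registry would need shared storage for leases so that all instances see them.

```python
import re
import time
from dataclasses import dataclass
from typing import Callable, List, Pattern


@dataclass(eq=False)
class RenameLease:
    pattern: Pattern[str]
    expires_at: float


class LeaseStore:
    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._leases: List[RenameLease] = []

    def acquire(self, namespace: str, old_name: str, new_name: str,
                ttl_seconds: float) -> RenameLease:
        """Taken during the dry-run; covers origin and target paths."""
        names = "|".join(re.escape(n) for n in (old_name, new_name))
        pattern = re.compile(rf"^{re.escape(namespace)}/({names})($|/.*$)")
        lease = RenameLease(pattern, self._clock() + ttl_seconds)
        self._leases.append(lease)
        return lease

    def release(self, lease: RenameLease) -> None:
        """Called at the end of a request, or early if the dry-run fails."""
        if lease in self._leases:
            self._leases.remove(lease)

    def can_start_rename(self, lease: RenameLease,
                         min_remaining_seconds: float = 10.0) -> bool:
        """The lease must outlive the DB transaction timeout."""
        remaining = lease.expires_at - self._clock()
        return lease in self._leases and remaining >= min_remaining_seconds

    def write_allowed(self, path: str) -> bool:
        """Checked before every write (blob upload, tag delete, etc.)."""
        now = self._clock()
        return not any(
            lease.expires_at > now and lease.pattern.match(path)
            for lease in self._leases
        )
```

Expired leases simply stop blocking writes, which matches the TTL semantics described above: if Rails never follows up on a dry-run, repositories unlock on their own after `N` seconds.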
On the Rails side, project owners/maintainers would be informed in the UI that renaming a project causes all its container repositories to be locked against writes for the whole duration of the operation.