We should follow up on #32935 (closed) by expanding soft delete to include groups.
Proposal
Require a soft deletion period for group removal.
Attempting to remove a group marks it for deletion. It should no longer immediately remove the group.
After X days have elapsed, the project or group is deleted.
We should use the same soft delete configuration introduced in #32935 (closed).
Attempting to delete something:
The "remove group" container for group should reflect the soft deletion period.
A project pending deletion should be marked for deletion on the group overview page (banner, similar to the archived project notice) and group list (with a pending delete badge).
Attempting to delete a group with the API should also result in the soft deletion period as described above.
Recovering a group pending deletion:
Include a "restore" button in group settings that unarchives the project and removes the soft deletion state.
Should be able to restore/remove the soft deletion state via API.
Soft deletion and restore should trigger an audit event.
Noting down some ideas/questions/edge cases for implementing soft-delete for groups. (not exhaustive)
Right now, we don't have an archive feature for groups, like we have for projects.
I feel that we should split this issue into further issues, so like,
2.1 Implement archive feature for groups, so that the groups can be put to a read-only mode, where we recursively apply archival to it's subgroups and projects as well.
2.2 Implement soft-delete after implementing archive functionality.
Implementing archive first also gives us the advantage of figuring out the places we need to add scopes/validations to prevent showing an archived group in api/UI/search results. (similar behaviour to an archived project) without the whole feature not becoming so over-whelming as I think figuring this out alone is going to be quite a bit of work.
After implementing archive, implement soft-delete for groups.
When soft-deleting a group, move the group to archived state and mark it for deletion (same as we do for projects) and also recursively move it's projects and subgroups to archived state ( marking these for deletion might be problematic because then we'd need to also make sure that deleting_marked_projects_destroy_cron_worker does not pick up these projects for deletion)
We'd also need to make sure that a project or a subgroup under a group that is already marked for deletion does not allow the user to mark these individually for deletion again.
We need to take care of a case where a project, say Project P under a group, say A group is already marked for deletion or is archived. We further mark "A group" for soft-delete on a future date. But we later restore this A group back to active state. Restoring A group back to active would also involve recursively changing its projects and subgroups back to active state - so this would restore Project P back to active state, even though it was archived/marked for deletion separately.
Please feel free to add anything I might have missed. Thoughts @mksionek@jeremy? :)
@manojmj thanks for your very thorough input on this!!
Actually I was talking to @jeremy and what he proposed is to JUST mark group for deletion and then deleting it after N days, without an archived state. We can add it later on. The most important thing is to prevent accidental group deletions and let users restore it. Do you think it is ok to have just soft-delete in place and add an archive state in the next iteration?
Oh, thanks @mksionek, that makes real sense as an MVC!
So,
User marks the group for soft-delete.
There will be no changes to the group, or it's children after marking it for soft-delete (projects inside this group can be pushed to, new sub-groups can be created, new projects can be created, the group will keep appearing in search results - basically it continues acting like a normal group)
The only indication that the user gets about pending deletion is the the notice we will have up on the group page (should we extend the notice to be visible in it's subgroups and its projects as well?)
As per the application setting for adjourned delete, delete the group and its contents recursively on that specific date.
Allow the user to restore the group if they wish to, before it is permanently deleted.
Yes, from the group settings page where existing "delete" option is. This also applies to the API.
There will be no changes to the group, or it's children after marking it for soft-delete (projects inside this group can be pushed to, new sub-groups can be created, new projects can be created, the group will keep appearing in search results - basically it continues acting like a normal group)
Yes
The only indication that the user gets about pending deletion is the the notice we will have up on the group page (should we extend the notice to be visible in it's subgroups and its projects as well?)
The only thing I'm curious about is whether or not we're informing the user enough:
I think it makes sense to add a pending deletion notice to the subgroups and projects. We can use a similar pattern that we are using in projects by putting a message on the overview page.
Would it make sense to display a warning on the command line if we push or pull to a project that's pending deletion? We can consider this in a separate issue if it's significantly more effort, but I think that a developer who rarely signs into the GitLab UI should probably be informed that the project/group is pending deletion.
As per the application setting for adjourned delete, delete the group and its contents recursively on that specific date.
Yes
Allow the user to restore the group if they wish to, before it is permanently deleted.
yes, it does, but I agree that display a warning on the command line if we push or pull to a project that's pending deletion should have a separate issue...
Thanks @jeremy for pitching in. Agree with @mksionek on the above, the command line notice could be a different issue :) Everything else looks ok, I will start work on this once the project soft-deletion is done :)
For project soft-delete feature, @mksionek has already added an application setting deletion_adjourned_period which could be 0-7 days.
Is there a strong reason why we've limited this to a maximum of 7 days? I could see some companies wanting to raise this to a very high value (30, 60 days) to reduce the chance of unwanted data loss.
I am wondering whether the feature should also be one
I agree, I strongly suggest we simply combine these into a single soft deletion period that applies to both.
Is there a strong reason why we've limited this to a maximum of 7 days? I could see some companies wanting to raise this to a very high value (30, 60 days) to reduce the chance of unwanted data loss.
Just to make sure we are on the same page: Please do not change the name in your MRs, you have already gone through a lot of work in splitting them. I will rename them myself when I raise my MRs :)
I implemented showing the pending delete label to group and it's children that are marked for deletion. Here, if the parent group is marked for deletion, all the children will also show this label.
While I feel this is a great indication for users to know which groups are scheduled for deletion, I think that this will prove costly for us in terms of the number of db queries fired because,
For each group/project, we have to keep checking if any of it's parent further up the ancestor chain has been marked for deletion.
Since we do it for each item, it is firing multiple queries.
This is in stark contrast to how @mksionek implemented the same in project soft-delete, since here the marked_for_deletion_at is an attribute of the project itself and hence there is no need for any extra query to figure out whether or not to add the pending delete label.
Also, please not that we could implement the notice in an individual group page and for all of it's subgroups and child projects, as a way of indicating the pending deletion to users. Like so,
In a project page whose parent group has been scheduled for deletion:
In a sub-group page whose parent group has been schedule for deletion:
In a group page which has been scheduled for deletion:
Do you think we can make do without the "recursive" pending delete label in case of groups page, at least for the first iteration and simply show pending delete only for the specific group being deleted?
@manojmj: I definitely see your point. Thanks for the great explanation and screenshots. I agree that we can do without the recursive pending delete label on the group tree page. Let's show pending delete for the specific group being deleted.
Also, please note that we could implement the notice in an individual group page and for all of it's subgroups and child projects, as a way of indicating the pending deletion to users. Like so:
This is great. I do think it's more important for a user who is viewing a group, subgroup, or project to be aware that the object they're looking at is going to be deleted. I assume we don't have similar performance issues since we're not firing off multiple queries for the group tree, but only one query to span a linked list (or similar) for a parent group or subgroup that's pending deletion.
Just checking in since this has a frontend label, @manojmj, but it looks like you're able to replicate the UI changes from #32935 (closed), therefore you won't need support from an FE besides review?
@manojmj, I just noticed that this issue didn't have the GitLab Premium label on it. Please note that we should follow #32935 (closed) which was introduced in Premium. Sorry for any confusion there.
The core Terraform and Pulumi resource life-cycles are strictly CRUD. But Terraform/Pulumi expects that a delete is a real delete. The resource should no longer exist after you remove the code for the resource (e.g. a group). So based on the tier now, you get a different outcome.
This has an impact for all users of Terraform and Pulumi working against Gitlab with the respective providers.
I filed an issue in the terraform-provider-gitlab project. Please continue the discussion there!
Thanks for highlighting this, @ringods. Do you or @nagyv-gitlab have a proposal in mind? Soft deletion is behavior we'd like to retain since it can prevent accidentally deleting important projects.
The next step we planned was implementing a recycle bin at some point in the future, but I don't think this will solve the described problem. I propose that we introduce an API param for removals that isn't default behavior, but will immediately remove the project.
@lmcandrew: Please chime in if you've got any suggestions as well.
@jeremy Immediate removal is clearly an option. At the same time, that would miss the value of soft-delete.
Another approach would be an automatic API-level restore when someone tries to create a project / group that is pending deletion.
Given that my project X is pending deletionWhen I create a project with name XThen my previous project is restoredAnd it's noted that instead of a new resource, a previous resource was restored.
I don't know which one is preferred. We should ask some heavy API or Terraform users.
I feel like automatically bringing the repo back is a bit of hidden magic and wouldn't entirely make sense with regards to Terraform.
Personally my expectation when using Terraform: If I delete something it is gone. Terraform wont bring it back. If the product supports some kind of "restore" I would expect to do that through the web UI or CLI, not with Terraform, and then I would import it back into my Terraform state.
This is how it works in GCP if I delete a project. It goes into "pending deletion" and is gone from the API. I can go through the web interface, restore the project, and then import it into Terraform. I believe, it has been a while since I did that but thats how I remember it.
@nagyv-gitlab I am in line with how @regnerba approaches this: do not magically restore a group or a project which is pending deletion just because the name matches. In a normal Terraform or Pulumi usage cycle, it is not what a regular user would expect.
If there is a group or project with name X and marked for deletion, please let a create of a project or group using the name X fail with the error message that the object still exists in state pending delete. This would be much clearer for TF/Pulumi users.
Both Pulumi and TF providers make use of the go-gitlab library. If a group or project is deleted, and it becomes pending delete, is this result clear in the API response?
In the Github repository for go-gitlab, I also created a ticket regarding the soft deletion of groups and projects:
Could someone from Gitlab help us out there to get the client library correct? Once the soft-deletion is in place there, the TF and Pulumi providers will be updated quite fast after that.
Hi, author of go-gitlab here. Not sure if I follow what you are asking @ringods? Soft-delete is (as far as I understand it) not something we can implement as the API call for doing a delete now effectively has become a soft-delete.
As mentioned in the issue you created (783), the best (I think) we can do is add a pending_delete field in the related structs so that you have a way of telling that the resource is soft-deleted.
Once that is added, the TF provider should be extended to (for example) first try to create the resource and when a name collision error (assuming a specific error is returned in this case?) is received, it could try to fetch the resource and check the pending_deletion field to determine what kind of error it should return to user.
So not sure what you are asking from Gitlab in this case, unless some of my assumptions are wrong (I'm not using Gitlab much the last few years, so I could easily be making some bad assumptions here).
@svanharmelen what I am asking from Gitlab is to point us to the correct and final response structure of a DELETE on a project so the go-gitlab project could correctly say that the project is deleted.
@jeremy is this feedback enough for you about the API-level handling of pending deletes or are there open questions you'd like to answer? Is there anything else I can help with?
@manojmj@jeremy has anything changed regarding soft-delete of groups in Gitlab v13?
Since around the release of v13, we have had test failures in the Terraform Gitlab provider regarding deletion of groups:
=== RUN TestAccGitlabGroup_disappears##[error] TestAccGitlabGroup_disappears: testing.go:654: Step 0 error: Expected a non-empty plan, but got an empty plan!
I checked on this, and it appears that no recent changes have been made to group-soft delete.
I noticed the test failures and it appears that the EE tests pass and only the CE ones fail.
Group soft deletion is a licensed feature and marked_for_deletion_on attribute itself appears in the Group API response only if the GitLab installation has a license with GitLab Premium and above. Which means, in CE, this attribute would not exposed in the API response. Could this be the reason the test is failing in CE? (just thinking out loud)
My guess is that Gitlab group deletion requests are added to a task queue, and that the API 202 Accepted response merely acknowledges that this deletion order will be treated in the near future.
This was seen in Gitlab CE, so as you pointed out, this is not related to Gitlab EE soft-delete feature.
In CE, groups are "instantly" deleted - but this deletion happens via a background job.
In CE:
On hitting the DELETE API for a group:
Group is queued for deletion.
Respond with 202
In EE:
On hitting the DELETE API for a group:
Group is marked for deletion at a later date.
Respond with 202
So, in CE, after the DELETE API is hit and before the deletion job is actually picked up by the background worker, there is a window where group is still active and you'll get a 200 OK if you hit the GET /groups/:id API for that group during that window.
I think the workaround PR that you submitted is right. There should be a wait period before the GET API is hit again, after a DELETE has been issued.
I've also opened #219728 to try and discuss an approach that will help us to avoid adding this "wait time" between API calls by having an immediate acknowledgement of the deletion.