Delete K8s Cluster and resources fails if cluster is not reachable
A cluster integration can be deleted in one of two ways: "Remove integration" or "Remove integration and resources". If a user tries to delete an integration and its resources while the cluster is not reachable (for example, because it has already been deleted), the integration remains in GitLab with no explanation as to why.
Steps to reproduce
- Create a managed k8s cluster using either GKE or EKS
- Configure it as an instance level cluster
- Ensure it creates and gets picked up by GitLab
- Delete only the cluster itself in the cloud environment where it was created
- Try to use GitLab's "Remove integration and resources" function to remove the cluster
At this point, the removal starts and appears to be in progress. We have an open issue on improving the UI around this process, but I'm not sure the specific case I'm describing is covered by it. The integration is never actually removed, and nothing indicates why.
The "Remove integration" option (which leaves the associated resources alone) does remove the integration, so there is a viable workaround, but the lack of errors and warnings can be really confusing.
An example of a premium customer who ran into this and wasn't sure what to do: https://gitlab.zendesk.com/agent/tickets/151569 (internal link). Took me a bit to understand the issue, even having seen the same behavior on my own instance.
I don't have an example of this at the moment, but I can help reproduce it to show the behavior if needed. Let me know if a screen recording would be helpful and I can make one.
What is the current bug behavior?
The cluster integration is never removed
What is the expected correct behavior?
There are two possibilities:
- At minimum, an error message should be displayed saying that the cluster is not reachable, so the associated resources cannot be deleted. It should also provide next steps:
  - Use the "Remove integration" option
  - If the cluster has not been deleted, use kubectl to manually remove the created resources
- We could do whatever cleanup we can. This might mean displaying a message saying that the cluster is not reachable and offering a direct option to remove the integration only, while making it clear that if the cluster is still running, the resources will not be removed and the user will need to handle that themselves.
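To make the first possibility concrete, here is a minimal sketch of the control flow I have in mind. It is written in Python purely for illustration and is not GitLab's actual implementation. Every name in it (`remove_integration_and_resources`, `RemovalResult`, `RemovalOutcome`, `UNREACHABLE_MESSAGE`, and the three callables) is hypothetical, and the message wording is only a suggestion.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class RemovalOutcome(Enum):
    REMOVED = "removed"
    BLOCKED_UNREACHABLE = "blocked_unreachable"


@dataclass
class RemovalResult:
    outcome: RemovalOutcome
    message: str = ""


# Suggested wording only; it mirrors the next steps listed above.
UNREACHABLE_MESSAGE = (
    "The cluster is not reachable, so its associated resources cannot be deleted. "
    'Use "Remove integration" to remove only the integration. '
    "If the cluster still exists, remove the created resources manually with kubectl."
)


def remove_integration_and_resources(
    is_reachable: Callable[[], bool],
    delete_resources: Callable[[], None],
    delete_integration: Callable[[], None],
) -> RemovalResult:
    """Hypothetical flow: stop early with guidance instead of hanging."""
    if not is_reachable():
        # Nothing is deleted in this branch; the user chooses the next step.
        return RemovalResult(RemovalOutcome.BLOCKED_UNREACHABLE, UNREACHABLE_MESSAGE)
    # Cluster is reachable: clean up resources first, then the integration.
    delete_resources()
    delete_integration()
    return RemovalResult(RemovalOutcome.REMOVED)


if __name__ == "__main__":
    # Simulate the case from this report: the cluster was deleted in the cloud.
    result = remove_integration_and_resources(
        is_reachable=lambda: False,
        delete_resources=lambda: None,
        delete_integration=lambda: None,
    )
    print(result.outcome.value)
    print(result.message)
```

The important part is that the unreachable case returns a clear result the UI can show, rather than leaving the removal "in progress" indefinitely. The second possibility would be a variation where the unreachable branch also offers to run the integration-only removal directly.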
Relevant logs and/or screenshots
Results of GitLab environment info
Results of GitLab application Check