Review apps cleanup is not complete
Since the introduction of EE review apps and the follow-up improvements in https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/6665 and https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/22019 , cleanup runs more thoroughly and more frequently.
While working on https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/7587/diffs , I've observed that cleanup is not working reliably.
For example, this job cleared out a number of deployments from the end of September.
However, when I run the helm command manually, it still lists deployments that were not fully cleaned up:
helm list --namespace "review-apps-ee" --tiller-namespace 'review-apps-ee' --failed --date --max 10
NAME                      REVISION  UPDATED                   STATUS  CHART         NAMESPACE
review-ce-to-ee-2-c7hk11  1         Wed Sep 26 04:20:12 2018  FAILED  gitlab-1.0.2  review-apps-ee
review-ee-dz-migr-4t399b  1         Wed Sep 26 11:36:02 2018  FAILED  gitlab-1.0.2  review-apps-ee
review-weight-qui-3862vk  1         Wed Sep 26 16:13:03 2018  FAILED  gitlab-1.0.2  review-apps-ee
review-ee-41922-s-6xsw9j  1         Wed Sep 26 22:45:41 2018  FAILED  gitlab-1.0.2  review-apps-ee
review-ee-ide-ref-urbsjb  2         Thu Sep 27 06:42:52 2018  FAILED  gitlab-1.0.2  review-apps-ee
review-6983-promo-w2edis  1         Thu Sep 27 06:53:17 2018  FAILED  gitlab-1.0.2  review-apps-ee
review-5382-appro-5jtby0  1         Thu Sep 27 09:17:39 2018  FAILED  gitlab-1.0.2  review-apps-ee
review-7495-liste-2e9bxa  1         Thu Sep 27 09:19:23 2018  FAILED  gitlab-1.0.2  review-apps-ee
review-7308-renam-femsld  1         Thu Sep 27 09:55:22 2018  FAILED  gitlab-1.0.2  review-apps-ee
review-qa-257-gro-emjkj3  1         Thu Sep 27 11:09:35 2018  FAILED  gitlab-1.0.2  review-apps-ee
If I pipe that list into helm delete:
helm list --short --namespace "review-apps-ee" --tiller-namespace 'review-apps-ee' --failed --date --max 10 | xargs -L1 helm delete --purge --tiller-namespace 'review-apps-ee'
more deployments get cleaned up properly, but I've also observed that the Tiller pod often dies under the load.
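One possible way to reduce the load on Tiller while purging manually is to delete the failed releases one at a time with a pause in between. The following is only a rough sketch, not what the cleanup job does: the function name purge_failed_releases and the TILLER_NAMESPACE, BATCH_SIZE and PAUSE_SECONDS variables are hypothetical, and the 5-second default pause is a guess that has not been tuned against our cluster. It uses only the helm flags already shown above.

```shell
#!/bin/sh
# Sketch: purge failed review-app releases sequentially, sleeping between
# deletions so Tiller is not hit with a burst of requests.
# All variable names and defaults below are assumptions for illustration.

TILLER_NAMESPACE="${TILLER_NAMESPACE:-review-apps-ee}"
BATCH_SIZE="${BATCH_SIZE:-10}"        # how many failed releases to list per run
PAUSE_SECONDS="${PAUSE_SECONDS:-5}"   # pause after each deletion (untuned guess)

purge_failed_releases() {
  # List the names of the oldest failed releases (same flags as the manual command).
  releases="$(helm list --short --failed --date \
    --namespace "$TILLER_NAMESPACE" \
    --tiller-namespace "$TILLER_NAMESPACE" \
    --max "$BATCH_SIZE")" || return 1

  printf '%s\n' "$releases" | while IFS= read -r release; do
    # Skip blank lines (for example when no failed releases exist).
    [ -n "$release" ] || continue
    echo "Purging $release"
    # Redirect stdin so helm cannot consume the remaining release names.
    helm delete --purge --tiller-namespace "$TILLER_NAMESPACE" "$release" \
      < /dev/null || echo "Failed to purge $release" >&2
    sleep "$PAUSE_SECONDS"
  done
}
```

After sourcing the snippet, calling purge_failed_releases processes one batch; a failed deletion is reported on stderr and the loop moves on to the next release instead of aborting.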
We need to investigate why our cleanup job doesn't remove these failed releases.