CI jobs fail when namespace is deleted in Kubernetes

Summary

GitLab seems unable to run jobs for environments whose namespace has been deleted from Kubernetes. We had some issues deploying an app and decided to start from scratch by removing the namespace, which also cleaned up all unused resources in it. This turned out to be the wrong decision.

In the troubleshooting section I found that GitLab creates a namespace and service account. I am assuming these are stored somewhere and reused for subsequent jobs.

As the namespace was removed but GitLab still believes it to be there, these jobs fail. We're seeing errors like:

error: unable to recognize "STDIN": Unauthorized

Our workaround is to load our own Kubernetes config, at the expense of the security your implementation has provided since 11.5.
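For anyone wanting to try the same workaround, here is a minimal sketch of how a job could generate its own kubeconfig. Everything in it is an assumption for illustration: the function name, the `own-*` entry names, the server URL and the token placeholder are hypothetical and not taken from our setup or from GitLab. The output is JSON, which kubectl accepts as a kubeconfig because JSON is valid YAML.

```python
import json


def build_kubeconfig(server, token, namespace, ca_data=None):
    """Build a kubeconfig structure for a single cluster, user and context.

    All names ("own-cluster", "own-user", "own-context") are hypothetical
    placeholders chosen for this sketch.
    """
    cluster = {"server": server}
    if ca_data is not None:
        # Base64-encoded CA certificate, if the cluster uses a private CA.
        cluster["certificate-authority-data"] = ca_data
    return {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [{"name": "own-cluster", "cluster": cluster}],
        "users": [{"name": "own-user", "user": {"token": token}}],
        "contexts": [
            {
                "name": "own-context",
                "context": {
                    "cluster": "own-cluster",
                    "user": "own-user",
                    "namespace": namespace,
                },
            }
        ],
        "current-context": "own-context",
    }


# Hypothetical values; in a real job these would come from protected variables.
config = build_kubeconfig("https://kubernetes.example", "REDACTED", "some-project")
kubeconfig_text = json.dumps(config, indent=2)
```

Writing `kubeconfig_text` to a file and pointing `KUBECONFIG` at it makes kubectl use these credentials instead of the ones GitLab injects, which is exactly why the per-environment isolation is lost.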

Steps to reproduce

  • Set up Kubernetes
  • Run a job for an environment that interacts with Kubernetes
  • Remove the namespace it has created
  • Run the job again

What is the current bug behavior?

The job fails.

error: unable to recognize "STDIN": Unauthorized

What is the expected correct behavior?

I would expect GitLab to recreate the namespace and service accounts when the complete namespace is missing. To be clear: from my perspective it would be a different issue if only the service accounts were missing.
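To make the expectation concrete, here is a small in-memory model of the behavior I have in mind. It is only a sketch under my own assumptions: `FakeCluster` and `Deployer` are hypothetical names, the "cluster" is a plain dictionary, and none of this reflects how GitLab actually stores or checks its namespace records. The point is that the stored record is verified against the cluster before it is reused.

```python
class FakeCluster:
    """In-memory stand-in for a cluster: namespace name -> service accounts."""

    def __init__(self):
        self.namespaces = {}

    def namespace_exists(self, name):
        return name in self.namespaces

    def create_namespace(self, name):
        # Idempotent: an existing namespace is left untouched.
        self.namespaces.setdefault(name, set())

    def create_service_account(self, namespace, account):
        self.namespaces[namespace].add(account)

    def delete_namespace(self, name):
        # Deleting a namespace removes everything inside it.
        self.namespaces.pop(name, None)


class Deployer:
    """Remembers which namespaces it provisioned, but verifies before reuse."""

    def __init__(self, cluster):
        self.cluster = cluster
        self.provisioned = set()

    def ensure_namespace(self, name, account):
        known = name in self.provisioned
        if known and self.cluster.namespace_exists(name):
            return "reused"
        # Either never provisioned, or the record is stale: (re)create both.
        self.cluster.create_namespace(name)
        self.cluster.create_service_account(name, account)
        self.provisioned.add(name)
        return "recreated" if known else "created"


cluster = FakeCluster()
deployer = Deployer(cluster)
print(deployer.ensure_namespace("some-project", "deploy-sa"))  # created
print(deployer.ensure_namespace("some-project", "deploy-sa"))  # reused
cluster.delete_namespace("some-project")
print(deployer.ensure_namespace("some-project", "deploy-sa"))  # recreated
```

The model deliberately does not cover the case where the namespace still exists and only the service account is gone; as noted above, I consider that a separate issue.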

Relevant logs and/or screenshots

I am not seeing any entries being added to kubernetes.log when the job fails.

Output of checks

Results of GitLab environment info

System information
System:
Proxy: no
Current User: git
Using RVM: no
Ruby Version: 2.6.3p62
Gem Version: 2.7.9
Bundler Version: 1.17.3
Rake Version: 12.3.2
Redis Version: 3.2.12
Git Version: 2.22.0
Sidekiq Version: 5.2.7
Go Version: go1.11.5 linux/amd64

GitLab information
Version: 12.2.1-ee
Revision: e4a8b6c773a
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: PostgreSQL
DB Version: 10.9
URL: https://git.url
HTTP Clone URL: https://git.url/some-group/some-project.git
SSH Clone URL: git@git.url:some-group/some-project.git
Elasticsearch: no
Geo: no
Using LDAP: no
Using Omniauth: yes
Omniauth Providers: google_oauth2

GitLab Shell
Version: 9.3.0
Repository storage paths:
  • default: /var/opt/gitlab/git-data/repositories
GitLab Shell path: /opt/gitlab/embedded/service/gitlab-shell
Git: /opt/gitlab/embedded/bin/git

Results of GitLab application Check

Checking GitLab subtasks ...

Checking GitLab Shell ...

GitLab Shell: ...
GitLab Shell version >= 9.3.0 ? ... OK (9.3.0)
Running /opt/gitlab/embedded/service/gitlab-shell/bin/check
Check GitLab API access: OK
Redis available via internal API: OK
Access to /var/opt/gitlab/.ssh/authorized_keys: OK
gitlab-shell self-check successful

Checking GitLab Shell ... Finished

Checking Gitaly ...

Gitaly: ... default ... OK

Checking Gitaly ... Finished

Checking Sidekiq ...

Sidekiq: ...
Running? ... yes
Number of Sidekiq processes ... 1

Checking Sidekiq ... Finished

Checking Incoming Email ...

Incoming Email: ... Reply by email is disabled in config/gitlab.yml

Checking Incoming Email ... Finished

Checking LDAP ...

LDAP: ... LDAP is disabled in config/gitlab.yml

Checking LDAP ... Finished

Checking GitLab App ...

Git configured correctly? ... yes
Database config exists? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... yes
Init script exists? ... skipped (omnibus-gitlab has no init script)
Init script up-to-date? ... skipped (omnibus-gitlab has no init script)
Projects have namespace: ...
7/5 ... yes
16/6 ... yes
16/9 ... yes
16/10 ... yes
10/11 ... yes
10/12 ... yes
16/15 ... yes
16/16 ... yes
16/20 ... yes
16/21 ... yes
16/22 ... yes
16/24 ... yes
16/25 ... yes
16/26 ... yes
16/30 ... yes
16/31 ... yes
16/32 ... yes
16/33 ... yes
16/34 ... yes
7/36 ... yes
16/39 ... yes
16/40 ... yes
17/41 ... yes
16/42 ... yes
16/43 ... yes
16/44 ... yes
16/45 ... yes
16/46 ... yes
7/47 ... yes
7/48 ... yes
7/49 ... yes
7/50 ... yes
27/55 ... yes
7/56 ... yes
27/57 ... yes
27/58 ... yes
27/62 ... yes
7/63 ... yes
16/64 ... yes
27/65 ... yes
27/66 ... yes
27/67 ... yes
7/68 ... yes
27/69 ... yes
27/70 ... yes
27/71 ... yes
28/72 ... yes
28/73 ... yes
28/74 ... yes
28/75 ... yes
28/76 ... yes
28/77 ... yes
28/78 ... yes
28/79 ... yes
28/80 ... yes
28/81 ... yes
28/82 ... yes
28/83 ... yes
28/84 ... yes
28/85 ... yes
28/86 ... yes
28/87 ... yes
28/88 ... yes
28/89 ... yes
28/90 ... yes
28/91 ... yes
28/92 ... yes
28/93 ... yes
28/94 ... yes
28/95 ... yes
28/96 ... yes
28/97 ... yes
28/98 ... yes
28/99 ... yes
28/100 ... yes
7/102 ... yes
16/103 ... yes
16/104 ... yes
7/105 ... yes
7/107 ... yes
16/108 ... yes
7/109 ... yes
16/111 ... yes
16/112 ... yes
7/114 ... yes
7/115 ... yes
7/116 ... yes
7/117 ... yes
Redis version >= 2.8.0? ... yes
Ruby version >= 2.5.3 ? ... yes (2.6.3)
Git version >= 2.22.0 ? ... yes (2.22.0)
Git user has default SSH configuration? ... yes
Active users: ... 4
Elasticsearch version 5.6 - 6.x? ... skipped (elasticsearch is disabled)

Checking GitLab App ... Finished

Checking GitLab subtasks ... Finished

Possible fixes
