Review app: nginx-ingress-controller in crash loop

Continuing from slack thread: https://gitlab.slack.com/archives/CMA7DQJRX/p1622102289075600

Summary:

  • nginx-ingress-controller pods using image k8s.gcr.io/ingress-nginx/controller:v0.41.2 go into crash loop backoff

Observations:

Slack discussion

Andrey 1 hour ago Newly deployed review apps do seem to be in bad shape though; I see a lot of failures to even open the login page

Andrey 1 hour ago https://gitlab.com/gitlab-org/gitlab/-/jobs/1297424294 <- even performance job did not execute

Andrey 1 hour ago qa-smoke is failing as well

alberts 1 hour ago (quoting Andrey: "https://gitlab.com/gitlab-org/gitlab/-/jobs/1297424294 <- even performance job did not execute") The release name is review-allure-rep-fdpkkf. It seems that the nginx ingress is in a crash loop for another reason: https://cloudlogging.app.goo.gl/2NcdSEFvNG2EKVH19

alberts 1 hour ago @rymai any ideas?

alberts 1 hour ago This one is freshly installed, so nothing to do with helm upgrade

alberts 41 minutes ago There’s a new container called “controller” instead of “nginx-ingress-controller”, and it has been running close to its CPU limit. (Attachment: Screenshot 2021-05-27 at 4.35.34 PM.png)

remy 41 minutes ago Yeah I’ve noticed that the nginx-controller pods hit the CPU limit: https://console.cloud.google.com/kubernetes/pod/us-central1-b/review-apps/review-apps/review-allure-rep-fdpkkf-nginx-ingress-controller-b68c5966k69dq/details?project=gitlab-review-apps

remy 40 minutes ago (Attachment: Screen Shot 2021-05-27 at 10.36.47.png)

alberts 39 minutes ago I’m looking at the container name

alberts 39 minutes ago I wonder if it changed, and our base_config wasn’t updated

remy 38 minutes ago
2021-05-27T08:32:09.923577Z "Configuration changes detected, backend reload required" I
2021-05-27T08:33:50.929615Z "Backend successfully reloaded" I
2021-05-27T08:33:50.929753Z "Initial sync, sleeping for 1 second" I
2021-05-27T08:33:50.929875Z Event(v1.ObjectReference{Kind:"Pod", Namespace:"review-apps", Name:"review-allure-rep-fdpkkf-nginx-ingress-controller-b68c5966k69dq", UID:"43fc491b-964c-4b01-b9f6-7e4c708ad48e", APIVersion:"v1", ResourceVersion:"806299827", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration I
2021-05-27T08:34:12.021492Z "Received SIGTERM, shutting down" I
2021-05-27T08:34:12.021877Z "Shutting down controller queues" I
2021-05-27T08:34:12.164601Z "Stopping NGINX process" I
2021-05-27T08:34:12.429563395Z 2021/05/27 08:34:12 [notice] 95#95: signal process started E
2021-05-27T08:34:13.525489Z "NGINX process has stopped" I
2021-05-27T08:34:13.525525Z "Handled quit, awaiting Pod deletion" I
2021-05-27T08:34:23.525709Z "Exiting" code=0 I
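
Editor's note: the gaps between the entries in the log excerpt above can be worked out with a short script. This is only a sketch of the arithmetic; the names `EVENTS`, `parse_ts`, and `gaps` are hypothetical and not part of any tooling mentioned in the thread. The timestamps are copied from the excerpt.

```python
from datetime import datetime

# Timestamps and messages copied from the log excerpt in the thread (UTC).
EVENTS = [
    ("2021-05-27T08:32:09.923577Z", "Configuration changes detected, backend reload required"),
    ("2021-05-27T08:33:50.929615Z", "Backend successfully reloaded"),
    ("2021-05-27T08:34:12.021492Z", "Received SIGTERM, shutting down"),
    ("2021-05-27T08:34:23.525709Z", "Exiting"),
]


def parse_ts(ts):
    """Parse a microsecond-precision UTC timestamp such as 2021-05-27T08:32:09.923577Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def gaps(events):
    """Return (message, seconds since the previous event) for each event after the first."""
    parsed = [(parse_ts(ts), msg) for ts, msg in events]
    return [
        (msg, (current - previous).total_seconds())
        for (previous, _), (current, msg) in zip(parsed, parsed[1:])
    ]


if __name__ == "__main__":
    for msg, seconds in gaps(EVENTS):
        print(f"{seconds:9.3f}s until: {msg}")
```

By this calculation the reload took roughly 101 seconds, and the SIGTERM arrived about 21 seconds after the reload completed. A reload that slow would be consistent with the CPU limit being hit, as discussed in the thread, but the excerpt alone does not show what sent the SIGTERM.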

remy 37 minutes ago It still seems to be nginx-ingress.controller: https://docs.gitlab.com/charts/charts/globals.html

alberts 37 minutes ago This new “controller” started appearing around the same time we changed to the new chart version

alberts 31 minutes ago Increase the CPU request first?

alberts 30 minutes ago I looked through the logs for the nginx-ingress-controller pod (https://cloudlogging.app.goo.gl/AxBqmHfxJAK3sxTj6) and nothing seems amiss. It finds the matching ingress class

remy 29 minutes ago I found the MR which updates the NGINX chart: gitlab-org/charts/gitlab!1690

alberts 28 minutes ago I also found this change, but didn’t see anything different in the values.yml. Maybe I missed something

remy 27 minutes ago Yeah I don’t see anything different either. 🤔

alberts 24 minutes ago In !62372 (merged), the MR that upgraded the chart, the review app page could load: https://gitlab-review-331577-rev-527cns.gitlab-review.app/users/sign_in. So it’s not the chart itself 🤔

Edited by Albert