CertManager does not register Certificates when using OpenShift
Summary
Translated from #1036 (comment 1298153199)
From @bartzhang
I am also having issues using cert-manager together with the operator. I followed the Helm chart TLS configuration options documented here: https://docs.gitlab.com/charts/installation/tls.html and tried these options:
- Option 1a) Internal cert-manager and Issuer
- Option 2) Use your own wildcard certificate
- Option 4) Auto-generated self-signed wildcard certificate
Here's what I experienced with Option 1a)
kubectl logs deployment/gitlab-controller-manager -c manager -f -n gitlab-system
2023/02/28 05:08:13 http: TLS handshake error from 10.128.0.7:48996: remote error: tls: bad certificate
2023/02/28 05:08:13 http: TLS handshake error from 10.128.0.7:49010: remote error: tls: bad certificate
2023/02/28 05:08:14 http: TLS handshake error from 10.128.0.7:49016: remote error: tls: bad certificate
2023/02/28 05:08:16 http: TLS handshake error from 10.128.0.7:49032: remote error: tls: bad certificate
Followed TLS Troubleshooting steps here: https://docs.gitlab.com/charts/installation/tls.html#troubleshooting
Checked whether the Let's Debug status was OK: https://letsdebug.net/gitlab.apps.gl-on-ocp.rsb9.p1.openshiftapps.com. It initially reported that openshiftapps.com was rate-limited for certificate issuance, then showed as operational in subsequent tests.
Searched crt.sh for existing certs for my OpenShift cluster. There was one existing cert that I could pull for Option 2) Use your own wildcard certificate: https://crt.sh/?q=rsb9.p1.openshiftapps.com
kubectl describe certificate,order,challenge -n=gitlab-system
Logs attached: kubectl-describe-certificate-order-challenge.txt
Deleted the existing certificates to force a fresh request. Same result.
Tested both nginx-ingress and OpenShift Routes with these cert options without much success. Tried combinations of global.ingress.class: none / gitlab-nginx, as well as configureCertmanager: true / false.
Even Option 4) Auto-generated self-signed wildcard certificate produced bad-certificate errors on the challenges, and the created routes were stuck in pending because their challenges were still pending.
Primary error received with challenges/orders:
Reason: Waiting for HTTP-01 challenge propagation: wrong status code '503', expected '200' pending
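For anyone trying to reproduce the 503, the self-check can be approximated by hand. This is a minimal sketch, assuming kubectl access to the cluster and that curl is installed. The function names are hypothetical helpers, not part of cert-manager. The URL path is the standard HTTP-01 location from RFC 8555, and `.spec.dnsName` and `.spec.token` are fields of the cert-manager Challenge resource.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for probing pending HTTP-01 challenges by hand.

# Build the HTTP-01 validation URL (RFC 8555, section 8.3) for a host and token.
challenge_url() {
  local host="$1" token="$2"
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$host" "$token"
}

# Print only the HTTP status code returned for a URL (curl prints 000 if unreachable).
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Print one "dnsName token" pair per Challenge in the given namespace.
list_challenges() {
  local ns="${1:-gitlab-system}"
  kubectl get challenges.acme.cert-manager.io -n "$ns" \
    -o jsonpath='{range .items[*]}{.spec.dnsName}{" "}{.spec.token}{"\n"}{end}'
}

# Usage (needs cluster access, so it is left commented out):
#   list_challenges gitlab-system | while read -r host token; do
#     url="$(challenge_url "$host" "$token")"
#     echo "$(probe_status "$url") $url"
#   done
```

A 503 from this probe would suggest that the request reaches the router but no healthy backend is serving the solver path, which matches the status code in the challenge message above.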
There is a similar GitHub issue here: https://github.com/cert-manager/cert-manager/issues/3307. Its creator mentions that the HTTP-01 challenge type is notoriously buggy with OpenShift and that he was able to get DNS-01 to work.
- Deployed CertManager with the default kubectl command, as recommended by the link to the CertManager docs in the GitLab Operator install prerequisites documentation (echoing @niskhakova, can we get prescriptive recommendations on how to make CertManager work with GitLab?). I also later installed the CertManager operator, which allowed me to review which Issuers, Certificates and CertificateRequests were active in the OpenShift CertManager Operator UI.
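For reference, a DNS-01 setup would look roughly like the following. This is an untested sketch, not a known-working configuration for this cluster. The issuer name, secret name, email address, and region are placeholders, and the Route 53 solver is an assumption about the DNS provider. It uses the Let's Encrypt staging endpoint and relies on ambient AWS credentials. DNS-01 also requires write access to the DNS zone, which may not be available for an openshiftapps.com domain.

```shell
#!/usr/bin/env bash
# Print a hypothetical cert-manager Issuer that solves ACME challenges via DNS-01.
# All names, the email, the region, and the Route 53 solver are placeholders
# and assumptions; none of them come from this cluster's configuration.
dns01_issuer_manifest() {
  local ns="${1:-gitlab-system}"
  cat <<EOF
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: example-dns01-issuer
  namespace: ${ns}
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: example-dns01-account-key
    solvers:
      - dns01:
          route53:
            region: us-east-1
EOF
}

# Usage (needs cluster access, so it is left commented out):
#   dns01_issuer_manifest gitlab-system | kubectl apply -f -
```

Printing the manifest from a function keeps it easy to review or diff before anything is applied to the cluster.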
- Deployed Operator by following @dmakovey's SCC patch to the Operator, then applying the YAML manifest as documented. Tried both ingress-classes none and gitlab-nginx with clean installs of said YAML manifest.
- Values are exactly the same as those recommended in the Operator Install docs. I have the wildcard router domain set as the host. I set the chart version to 6.9.0, as that is the latest version for OCP 4.11.27. Here is one such example manifest testing OpenShift Routes:
kind: GitLab
apiVersion: apps.gitlab.com/v1beta1
metadata:
  name: gitlab
  namespace: gitlab-system
spec:
  chart:
    values:
      certmanager:
        install: false
      nginx-ingress:
        install: false
      global:
        hosts:
          domain: <output of command from docs to pull base domain from wildcard router, e.g. apps.*****>
          hostSuffix: null
        ingress:
          configureCertmanager: false
          class: none
          annotations:
            # The OpenShift documentation says "edge" is the default, but
            # the TLS configuration is only passed to the Route if this annotation
            # is manually set.
            route.openshift.io/termination: "edge"
    version: 6.9.0
Since I am following the install documentation exactly as described, do you have a known working copy of the GitLab CR with CertManager that I can test, @mnielsen? Please tell me which versions of OCP, GitLab Operator, and Helm chart I need, as well as any prerequisites. The more prescriptive the better, as I believe I've basically exhausted all paths at this point. I even went down the ACME challenge troubleshooting path without much success, so I would really appreciate it!
Thanks, Bart