Certmanager failing to issue multiple certificates randomly
Summary
I've observed Certmanager failing to issue certificates for seemingly random components across 3 consecutive fresh installs of the chart. In the first install, issuance failed for registry and openbao. In the second, it succeeded only for kas. In the third, it failed only for Omnibus.
In all 3 cases I used the Gateway API. The installs differed as follows:
- First, I installed everything without openbao, and then updated the chart to enable openbao. At that point the OpenBao chart still had a "bug" where it deployed an Ingress even with the Gateway API enabled.
- Second, I installed openbao from scratch together with all the other chart components.
- Third, I disabled the OpenBao chart ingress explicitly.
I suspect that either:
- Certmanager struggles to create certificates for too many components at the same time.
  - This may already have been an issue before the Gateway API, but it may have become more evident with it, as I've seen it 3 times in a row.
- Certmanager might get confused when two resources declare the same hostname (an Ingress and an HTTPRoute).
It's possible that both of the above are problems.
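To help tell these two hypotheses apart, here is a minimal diagnostic sketch. It assumes `kubectl` access to the cluster and takes the chart's namespace as its only argument. The function names `cert_status` and `shared_hosts` are hypothetical helpers, not part of any chart. The `order` and `challenge` resources are only populated when an ACME issuer is in use, which is an assumption here.

```shell
# Hypothetical helpers; the namespace argument is an assumption about your install.

# List cert-manager resources so stuck or failed issuances stand out.
# Orders and Challenges only exist when an ACME issuer is used (assumption).
cert_status() {
  kubectl get certificate,certificaterequest,order,challenge -n "$1"
}

# Print hostnames that appear on both an Ingress and an HTTPRoute in the namespace.
shared_hosts() (
  ns="$1"
  {
    # Each kubectl call prints space-separated hostnames; split and de-duplicate per kind.
    kubectl get ingress -n "$ns" -o jsonpath='{.items[*].spec.rules[*].host}' | tr ' ' '\n' | sort -u
    kubectl get httproute -n "$ns" -o jsonpath='{.items[*].spec.hostnames[*]}' | tr ' ' '\n' | sort -u
  } | sed '/^$/d' | sort | uniq -d   # a hostname seen twice is declared by both kinds
)
```

For example, `shared_hosts gitlab` (with `gitlab` as a placeholder namespace) would print any hostname declared by both resource kinds, which is the situation the second hypothesis describes. `cert_status gitlab` shows whether many requests are pending at once.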
Versions
OpenBao:
app.kubernetes.io/version=v2.4.1-gitlab2
helm.sh/chart=openbao-0.12.0
Certmanager:
app.kubernetes.io/version=v1.17.4
helm.sh/chart=certmanager-v1.17.4
GitLab:
chart=webservice-9.7.1
Envoy Gateway:
helm.sh/chart=envoy-gateway-1.6.2
Workaround
For the third case, with the OpenBao ingress explicitly disabled, deleting the secrets and certificates to force a reissue was sufficient to fix the MinIO certificate.
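The workaround can be sketched roughly as follows. This is an illustration, not the exact commands that were run: `force_reissue` is a hypothetical helper, and the namespace, Certificate name, and Secret name are placeholders to look up in your own cluster. It also assumes the Certificate is recreated automatically after deletion (for example by cert-manager's Ingress or Gateway integration).

```shell
# Hypothetical helper: delete one Certificate and its TLS Secret so that
# cert-manager issues a fresh certificate. All three arguments are placeholders.
force_reissue() (
  ns="$1"
  cert="$2"
  secret="$3"
  # Delete the Certificate first, then the Secret that holds the issued key pair.
  kubectl delete certificate "$cert" -n "$ns" &&
    kubectl delete secret "$secret" -n "$ns"
)
```

Usage would look like `force_reissue <namespace> <certificate-name> <secret-name>`, repeated for each component whose certificate was not issued.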
Follow-up origin
The following discussion from !4728 (merged) should be addressed:
- @Alexand started a discussion: (+8 comments)
Test OpenBao access
The "Check OpenBao is accessible" step is failing for me.
curl -v https://openbao.${DOMAIN}/v1/sys/health
* Host openbao.gitlab.jcunha.dev:443 was resolved.
* IPv6: (none)
* IPv4: REDACTED
*   Trying REDACTED:443...
* Connected to openbao.${DOMAIN} (REDACTED) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
*  CAfile: /etc/ssl/cert.pem
*  CApath: none
* Recv failure: Connection reset by peer
* LibreSSL/3.3.6: error:02FFF036:system library:func(4095):Connection reset by peer
* Closing connection
curl: (35) Recv failure: Connection reset by peer
Edited by João Alexandre Cunha