Surface Let's Encrypt certificate request errors in GitLab Pages
Proposal
GitLab Pages can obtain a Let's Encrypt certificate for a custom domain. Several different errors can occur during the ACME verification challenge, but the UI does not report them in much detail.
My suggestion is to surface more of these errors to users to help them debug in the right direction.
Additional Context
For example, when a DNS record lookup returns SERVFAIL, the certificate request bails out. This is hard to debug for users who are not DNS experts and who cannot see the server logs on GitLab.com SaaS. Context with analysis in #35940 (comment 1606914298)
There are concerns about surfacing every error to users:
We should already have all the data. In the past, we had some concerns about surfacing it to the user, but I guess we can do it behind a feature flag.
Maybe we can filter for specific error types returned by the Acme::Client::Error class, for example Dns in the first iteration.
I'm not deeply familiar with the GitLab Let's Encrypt request implementation. Is there a way to fetch the raw error message? I would assume that the ACME challenge returns something like SERVFAIL or other DNS-specific context when the challenge fails. Looking at https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/pages_domains/obtain_lets_encrypt_certificate_service.rb#L62 I assume the code already fetches the raw errors, but maybe there are more options. Capturing specific Acme::Client::Error types might be one of them, but I'm not sure if that is a good path forward.
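To make the filtering idea more concrete, here is a rough sketch of an allowlist over ACME error types. It is not based on the actual GitLab implementation. The method name, constants, and message wording are all hypothetical, and it assumes the challenge error is available as a hash shaped like an ACME problem document (`type`, `detail`) with the `urn:ietf:params:acme:error:` type prefix from RFC 8555. The SERVFAIL detail text in the usage lines is made up for illustration.

```ruby
# Hypothetical sketch, not GitLab code: surface only allowlisted ACME error
# types to the user and fall back to a generic message for everything else.

# Error type URN prefix defined by RFC 8555 (ACME).
ACME_ERROR_PREFIX = 'urn:ietf:params:acme:error:'

# First iteration: only DNS errors are shown to users (assumption).
USER_VISIBLE_ERROR_TYPES = %w[dns].freeze

GENERIC_MESSAGE = "Failed to obtain a Let's Encrypt certificate."

# challenge_error is assumed to be a Hash shaped like an ACME problem
# document, for example { 'type' => '...:dns', 'detail' => '...' }.
def user_facing_acme_error(challenge_error)
  return GENERIC_MESSAGE unless challenge_error.is_a?(Hash)

  type = challenge_error['type'].to_s
  return GENERIC_MESSAGE unless type.start_with?(ACME_ERROR_PREFIX)

  short_type = type.delete_prefix(ACME_ERROR_PREFIX)
  return GENERIC_MESSAGE unless USER_VISIBLE_ERROR_TYPES.include?(short_type)

  detail = challenge_error['detail'].to_s.strip
  return GENERIC_MESSAGE if detail.empty?

  "#{GENERIC_MESSAGE} The ACME server reported a #{short_type} error: #{detail}"
end

# Usage with illustrative data:
dns_error = {
  'type' => 'urn:ietf:params:acme:error:dns',
  'detail' => 'SERVFAIL looking up A for pages.example.com'
}
puts user_facing_acme_error(dns_error)

other_error = { 'type' => 'urn:ietf:params:acme:error:unauthorized', 'detail' => 'internal detail' }
puts user_facing_acme_error(other_error)
```

An allowlist keeps the concern about not exposing every error intact: anything not explicitly listed still produces the generic message, and new types can be added one at a time behind a feature flag.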