Duo Workflow Service crashes on startup when 'DUO_WORKFLOW_AUTH__OIDC_CUSTOMER_PORTAL_URL' is empty or CustomersDot is unreachable
Summary
The Duo Workflow Service (DWS) crashes on startup when it cannot prefetch JWKS keys from all configured OIDC providers. This is a functional blocker for offline-license customers with the DAP entitlement, because the documentation instructs them to set DUO_WORKFLOW_AUTH__OIDC_CUSTOMER_PORTAL_URL= (empty string), which triggers a MissingSchema error in the Cloud Connector library and causes the DWS to fail to start.
The main AI Gateway handles this scenario gracefully by logging the error and continuing with an incomplete JWKS cache. The DWS does not: it treats cloud_connector_ready() == False as a fatal error and raises AuthenticationError, preventing the service from starting.
Steps to reproduce
- Deploy a self-hosted AI Gateway for an offline-license GitLab instance with the DAP entitlement.
- Follow the documented configuration and set DUO_WORKFLOW_AUTH__OIDC_CUSTOMER_PORTAL_URL= (empty string) on the AI Gateway container.
- Start the AI Gateway container.
- Observe the DWS component crash with the following errors:
[error] Invalid URL '/.well-known/openid-configuration': No scheme supplied. Perhaps you meant https:///.well-known/openid-configuration?
[critical] Could not prefetch keys
[critical] Failed to initialize OIDC auth provider: Could not prefetch keys
Traceback (most recent call last):
  File "/home/aigateway/app/duo_workflow_service/interceptors/authentication_interceptor.py", line 132, in _init_oidc_auth_provider
    raise AuthenticationError(error_msg)
duo_workflow_service.interceptors.authentication_interceptor.AuthenticationError: Could not prefetch keys

This also reproduces when DUO_WORKFLOW_AUTH__OIDC_CUSTOMER_PORTAL_URL is left unset (defaulting to https://customers.gitlab.com) but CustomersDot is unreachable from the container, for example due to network restrictions or a transient outage. In that case the error is HTTP 502 response from well_known instead of MissingSchema, but the outcome is the same: cloud_connector_ready() returns False and the DWS crashes.
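The scheme-less URL in the first log line can be illustrated without running the service. The sketch below assumes the discovery URL is built by appending the well-known path to the configured base URL; `discovery_url` and `has_scheme_and_host` are hypothetical helpers written for this illustration, not functions from the Cloud Connector library.

```python
from urllib.parse import urlsplit

WELL_KNOWN_PATH = "/.well-known/openid-configuration"


def discovery_url(base_url: str) -> str:
    """Append the OIDC discovery path to a provider base URL (assumed behavior)."""
    return base_url.rstrip("/") + WELL_KNOWN_PATH


def has_scheme_and_host(url: str) -> bool:
    """Return True only for absolute http(s) URLs with a host part."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# An empty base URL leaves only the path, which has neither scheme nor host.
empty = discovery_url("")
# The default CustomersDot URL produces a well-formed discovery URL.
default = discovery_url("https://customers.gitlab.com")

print(empty, has_scheme_and_host(empty))
print(default, has_scheme_and_host(default))
```

With an empty base URL the result is just the path, which matches the URL quoted in the error message above.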
Root cause
In duo_workflow_service/interceptors/authentication_interceptor.py, the DWS treats an incomplete JWKS cache as a hard startup failure:
    if not cloud_connector_ready(provider):
        error_msg = "Could not prefetch keys"
        logger.fatal(error_msg)
        raise AuthenticationError(error_msg)

The main AI Gateway does not have this hard-fail behavior. It logs errors from failed OIDC provider fetches but continues operating with whatever keys it was able to retrieve.
Current behavior
The DWS crashes on startup if any configured OIDC provider is unreachable or returns an error during JWKS prefetch.
Expected behavior
The DWS should tolerate an incomplete JWKS cache when one or more OIDC providers are unavailable, consistent with how the main AI Gateway handles this scenario. At minimum, the DWS should be able to start and authenticate requests using the keys it was able to fetch (e.g., from the local GitLab instance).
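One possible shape for the tolerant behavior is sketched below. It is not the DWS or Cloud Connector implementation: `prefetch_keys_tolerantly`, the `KeyFetcher` type, and the local `AuthenticationError` stand-in are hypothetical names introduced for illustration. The sketch logs each provider failure and only raises when no keys could be fetched at all.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger("dws_auth_sketch")


class AuthenticationError(Exception):
    """Stand-in for the DWS exception of the same name (illustration only)."""


# Hypothetical: each fetcher returns a list of JWK dicts or raises on failure.
KeyFetcher = Callable[[], List[dict]]


def prefetch_keys_tolerantly(fetchers: Dict[str, KeyFetcher]) -> List[dict]:
    """Collect keys from every provider that responds, logging the rest."""
    keys: List[dict] = []
    for name, fetch in fetchers.items():
        try:
            keys.extend(fetch())
        except Exception as exc:
            # A single unreachable provider is logged, not treated as fatal.
            logger.error("Could not prefetch keys from %s: %s", name, exc)
    if not keys:
        # With no keys at all, no request could ever be authenticated.
        raise AuthenticationError("Could not prefetch keys from any OIDC provider")
    return keys
```

Failing only when the cache is completely empty keeps a hard stop for the case where authentication cannot work, while letting an offline instance start with the keys from its local GitLab instance.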
Workaround
Set both AIGW_CUSTOMER_PORTAL_URL and DUO_WORKFLOW_AUTH__OIDC_CUSTOMER_PORTAL_URL to the local GitLab instance URL (e.g., https://<GitLab Instance FQDN>). This gives the CustomersDot OIDC provider a reachable endpoint that returns valid JWKS keys, allowing cloud_connector_ready() to succeed and the DWS to start.
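Before restarting the container, the workaround values can be sanity-checked with a small script such as the sketch below. `check_workaround_env` is a hypothetical helper written for this issue; it only verifies that both variables are non-empty absolute http(s) URLs that point at the same place, and it does not test whether the endpoint is reachable or serves JWKS keys.

```python
import os
from typing import List, Mapping
from urllib.parse import urlsplit

WORKAROUND_VARS = (
    "AIGW_CUSTOMER_PORTAL_URL",
    "DUO_WORKFLOW_AUTH__OIDC_CUSTOMER_PORTAL_URL",
)


def check_workaround_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems with the workaround variables (empty if none)."""
    problems: List[str] = []
    for name in WORKAROUND_VARS:
        value = env.get(name, "")
        parts = urlsplit(value)
        if not value:
            problems.append(f"{name} is unset or empty")
        elif parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"{name}={value!r} is not an absolute http(s) URL")
    # The workaround sets both variables to the same local GitLab instance URL.
    if not problems and len({env[n].rstrip("/") for n in WORKAROUND_VARS}) != 1:
        problems.append("the two variables point at different URLs")
    return problems


if __name__ == "__main__":
    for problem in check_workaround_env(os.environ):
        print(problem)
```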
Related issues
- #517089: Gracefully handle missing/empty CDot OIDC provider URL in the Cloud Connector library (short-term fix, open)
- #517088: Make OIDC provider list configurable rather than hardcoded (long-term fix, open)
- #520808: Helm chart sets AIGW_CUSTOMER_PORTAL_URL: "" when not explicitly configured (closed, 17.10)
- #517083: Audit of Cloud Connector behavior when AIGW_CUSTOMER_PORTAL_URL is not set (closed)