2023-04-04: OIDC/OAuth Errors

Customer Impact

GitLab.com users were unable to integrate with OIDC/OAuth from 2023-04-03 12:00 UTC to 2023-04-05 01:59 UTC. GitLab is still investigating to better understand the full impact.

Current Status

Incident::Mitigated

  • During the investigation of #8659 (closed), users reported that JWTs generated via the OIDC flow and used in their Vault or CI setups were failing to verify against the JWKS issued by GitLab.
  • The investigation found that the issue was unrelated to the SSO redirection. It stemmed from the Doorkeeper dependency upgrade gitlab-org/gitlab!116142 (merged), which introduced a breaking change to the key format (strings) in a patch upgrade, treating it as an internal API. This led to a mismatch between the JWKS being issued and the key found in the JWT.
  • This incident was declared at 2023-04-04 18:18 UTC.
  • A revert MR was created: gitlab-org/gitlab!116721 (merged).
  • Testing in a development environment confirmed that the revert would resolve the issue.
  • After deployment, monitoring, and feedback from affected users, the issue was marked as mitigated at 2023-04-05 01:59 UTC.
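
A mismatch of this kind can be spotted without verifying any signature. The sketch below is illustrative only and not taken from GitLab's codebase: it assumes the mismatch surfaces in the key identifier (`kid`) that a JWT carries in its header and that each JWKS entry publishes. The function names are hypothetical, and only the Python standard library is used.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def jwt_header(token: str) -> dict:
    """Return the decoded (unverified) header of a compact JWT."""
    header_segment = token.split(".")[0]
    return json.loads(b64url_decode(header_segment))


def kid_in_jwks(token: str, jwks: dict) -> bool:
    """Return True if the token's `kid` matches a key published in the JWKS."""
    kid = jwt_header(token).get("kid")
    published = {key.get("kid") for key in jwks.get("keys", [])}
    return kid is not None and kid in published
```

If `kid_in_jwks` returns False for a freshly issued token, a relying party such as Vault has no published key to verify it against, which matches the failure users reported.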

Corrective actions that have been suggested so far include:

  • Create docs for the OIDC/OAuth test setup so that investigating those auth flows is easier.

  • Add synthetic monitoring for the JWKS and token issuance endpoints to identify incorrectly generated tokens/keys sooner.

  • Add E2E tests for GitLab as an OIDC provider: gitlab-org/quality/testcases#3984 (closed)
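
As a rough illustration of what a synthetic check for the JWKS endpoint might look at, the sketch below validates the shape of a JWKS document and confirms that the key ID of a freshly issued token is published in it. Everything here is an assumption for illustration: `fetch_jwks` is a hypothetical callable standing in for whatever HTTP client the monitor uses, and `token_kid` is assumed to come from the header of a token obtained through the normal issuance flow.

```python
from typing import Callable


def jwks_problems(jwks: dict) -> list[str]:
    """Return a list of structural problems found in a JWKS document."""
    keys = jwks.get("keys")
    if not isinstance(keys, list) or not keys:
        return ["JWKS has no keys"]
    problems = []
    seen = set()
    for index, key in enumerate(keys):
        if not isinstance(key, dict):
            problems.append(f"key {index}: not a JSON object")
            continue
        kid = key.get("kid")
        if not isinstance(kid, str) or not kid:
            problems.append(f"key {index}: missing or non-string kid")
        elif kid in seen:
            problems.append(f"key {index}: duplicate kid {kid!r}")
        else:
            seen.add(kid)
        if "kty" not in key:
            problems.append(f"key {index}: missing kty")
    return problems


def probe(fetch_jwks: Callable[[], dict], token_kid: str) -> list[str]:
    """Fetch the JWKS (via a hypothetical callable) and check the token's kid."""
    jwks = fetch_jwks()
    problems = jwks_problems(jwks)
    keys = jwks.get("keys")
    keys = keys if isinstance(keys, list) else []
    kids = {k.get("kid") for k in keys if isinstance(k, dict)}
    if token_kid not in kids:
        problems.append(f"token kid {token_kid!r} not published in JWKS")
    return problems
```

An empty list from `probe` means the check passed; any entries could be forwarded to the alerting system so that a key/token mismatch is caught before users report it.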


📝 Summary for CMOC notice / Exec summary:

  1. Customer Impact: Users integrating with OIDC/OAuth
  2. Service Impact: Service::Web
  3. Impact Duration: 2023-04-03 12:00 UTC - 2023-04-05 01:59 UTC
  4. Root cause: RootCause::Software-Change

📚 References and helpful links

Recent Events (available internally only):

  • Feature Flag Log - Chatops to toggle Feature Flags Documentation
  • Infrastructure Configurations
  • GCP Events (e.g. host failure)

Deployment Guidance

  • Deployments Log | GitLab.com Latest Updates
  • Reach out to Release Managers for S1/S2 incidents to discuss Rollbacks and/or Hot Patching | Rollback Runbook | Hot Patch Runbook

Use the following links to create issues related to this incident if additional work needs to be completed after it is resolved:

  • Corrective action ❙ Infradev
  • Incident Review ❙ Infra investigation followup
  • Confidential Support contact ❙ QA investigation

Note: In some cases we need to redact information from public view. We only do this in a limited number of documented cases. This might include the summary, timeline, or any other information laid out in our handbook page. Any such confidential data will be in a linked issue that is only visible internally. By default, all information we can share will be public, in accordance with our transparency value.

Edited Apr 06, 2023 by Anthony Fappiano