
Request API token for SOX Audit Events Report

Production Change

Change Summary

New JWT signing keys used in SOX Audit Events reporting production environments:

  • sox_audit_events_jwt_signing_key

Change Details

  1. Services Impacted - Service::Infrastructure
  2. Change Technician - Change Technician
  3. Change Reviewer - DRI for the review of this change
  4. Scheduled Date and Time (UTC in format YYYY-MM-DD HH:MM) - Start date and time planned to execute change steps YYYY-MM-DD HH:MM
  5. Time tracking - This change should only involve code changes in CDot. No manual work should be required.
  6. Downtime Component - none

Set Maintenance Mode in GitLab

If your change involves scheduled maintenance, add a step to set and unset maintenance mode per our runbooks. This will make sure SLA calculations adjust for the maintenance period.

Detailed steps for the change

Change steps - steps to take to execute the change

Estimated Time to Complete (mins) - Estimated Time to Complete in Minutes

  • Set label change::in-progress: /label ~change::in-progress

  • Generate new RSA private keys by running bin/generate_rsa_key_pair once for each non-production environment.

  • Open an MR that adds the new *_jwt_signing_key values to credentials.yml.enc for each environment, using the keys generated in the previous step.

  • Merge the MR to deploy this change. The JWKS endpoint should now serve the new key.

    • Check that the discovery JWKS endpoint returns the new keys.
    • Monitor the validating service's logs to ensure there is no increase in 401 responses.
      • Refer to the respective service runbook for how to do this.
      • For CDot internal API calls to GitLab, check the Kibana logs for anomalies.
  • Set label change::complete: /label ~change::complete
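
The JWKS check in the steps above could be scripted roughly as follows. This is a minimal sketch, not part of the change plan: the kid values and the sample response are hypothetical, and the discovery URL must be supplied by the caller because this issue does not state it.

```python
from __future__ import annotations

import json
import urllib.request


def fetch_jwks(url: str, timeout: float = 10.0) -> str:
    """Download a JWKS document; the caller supplies the real endpoint URL."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def key_ids(jwks_document: str) -> set[str]:
    """Return the set of `kid` values listed in a JWKS JSON document."""
    jwks = json.loads(jwks_document)
    return {key["kid"] for key in jwks.get("keys", []) if "kid" in key}


def serves_key(jwks_document: str, expected_kid: str) -> bool:
    """True if the JWKS document advertises a key with the expected `kid`."""
    return expected_kid in key_ids(jwks_document)


# Hypothetical, trimmed response body (real entries also carry `n` and `e`).
SAMPLE_JWKS = json.dumps(
    {
        "keys": [
            {"kty": "RSA", "use": "sig", "alg": "RS256", "kid": "old-key"},
            {"kty": "RSA", "use": "sig", "alg": "RS256", "kid": "new-key"},
        ]
    }
)

if __name__ == "__main__":
    print(serves_key(SAMPLE_JWKS, "new-key"))
```

Against a live environment, serves_key(fetch_jwks(url), kid) would replace the sample document, with url and kid taken from the service's own configuration.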

Rollback

Rollback steps - steps to take if this change needs to be rolled back

Estimated Time to Complete (mins) - Estimated Time to Complete in Minutes

  • Remove the new *_jwt_signing_key values from credentials.yml.enc
  • Deploy the rollback change
  • Verify that the JWKS endpoint no longer includes the new keys
  • Set label change::aborted: /label ~change::aborted
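
One way to confirm the rollback took effect is to compare a JWKS snapshot saved before the rollback with one fetched afterwards. The sketch below assumes both snapshots are available as JSON strings; the kid values are hypothetical placeholders.

```python
from __future__ import annotations

import json


def _kids(jwks_document: str) -> set[str]:
    """Collect the `kid` values from a JWKS JSON document."""
    return {k["kid"] for k in json.loads(jwks_document).get("keys", []) if "kid" in k}


def diff_key_ids(before: str, after: str) -> tuple[set[str], set[str]]:
    """Return (added, removed) key IDs between two JWKS snapshots."""
    old, new = _kids(before), _kids(after)
    return new - old, old - new


def rollback_verified(before: str, after: str, rolled_back: set[str]) -> bool:
    """True if every rolled-back key is gone and nothing unexpected appeared."""
    added, removed = diff_key_ids(before, after)
    return not added and rolled_back <= removed


# Hypothetical snapshots taken before and after the rollback deploy.
BEFORE = json.dumps({"keys": [{"kid": "old-key"}, {"kid": "new-key"}]})
AFTER = json.dumps({"keys": [{"kid": "old-key"}]})

if __name__ == "__main__":
    print(diff_key_ids(BEFORE, AFTER))
    print(rollback_verified(BEFORE, AFTER, {"new-key"}))
```

Requiring that no keys were added guards against a rollback that swaps in a different key instead of restoring the previous set.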

Monitoring

Key metrics to observe

  • Check that the discovery JWKS endpoint returns the new keys.
  • Monitor the validating service's logs to ensure there is no increase in 401 responses.
    • Refer to the respective service runbook for how to do this.
    • For CDot internal API calls to GitLab, check the Kibana logs for anomalies.
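
If log lines are exported from the logging system as JSON, the "no increase in 401s" check can be approximated by comparing the 401 rate before and after the deploy. The sketch below is an assumption-laden illustration: the status field name, the tolerance, and the sample lines are hypothetical, and the service runbook remains the authoritative procedure.

```python
import json
from typing import Iterable


def unauthorized_rate(log_lines: Iterable[str], status_field: str = "status") -> float:
    """Fraction of parseable log entries whose status is 401."""
    total = 0
    unauthorized = 0
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict) or status_field not in entry:
            continue  # skip entries without a status
        total += 1
        if str(entry[status_field]) == "401":
            unauthorized += 1
    return unauthorized / total if total else 0.0


def rate_increased(baseline: float, current: float, tolerance: float = 0.01) -> bool:
    """True if the 401 rate rose by more than the allowed tolerance."""
    return current - baseline > tolerance


# Hypothetical log samples from before and after the deploy.
BASELINE_LOGS = ['{"status": 200}', '{"status": 200}', '{"status": 401}', '{"status": 200}']
POST_DEPLOY_LOGS = ['{"status": 401}', '{"status": 200}', '{"status": "401"}', '{"status": 200}']

if __name__ == "__main__":
    before = unauthorized_rate(BASELINE_LOGS)
    after = unauthorized_rate(POST_DEPLOY_LOGS)
    print(before, after, rate_increased(before, after))
```

Comparing rates instead of raw counts keeps the check meaningful when traffic volume differs between the two windows.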

Change Reviewer checklist

C4 C3 C2 C1:

  • Check if the following applies:
    • The scheduled day and time of execution of the change is appropriate.
    • The change plan is technically accurate.
    • The change plan includes estimated timing values based on previous testing.
    • The change plan includes a viable rollback plan.
    • The specified metrics/monitoring dashboards provide sufficient visibility for the change.

C2 C1:

  • Check if the following applies:
    • The complexity of the plan is appropriate for the corresponding risk of the change. (i.e. the plan contains clear details).
    • The change plan includes success measures for all steps/milestones during the execution.
    • The change adequately minimizes risk within the environment/service.
    • The performance implications of executing the change are well-understood and documented.
    • The specified metrics/monitoring dashboards provide sufficient visibility for the change.
      • If not, is it possible (or necessary) to make changes to observability platforms for added visibility?
    • The change has a primary and secondary SRE with knowledge of the details available during the change window.
    • The change window has been agreed with Release Managers in advance of the change. If the change is planned for APAC hours, this issue has an agreed pre-change approval.
    • The labels blocks deployments and/or blocks feature-flags are applied as necessary.

Change Technician checklist

  • The change plan is technically accurate.
  • This Change Issue is linked to the appropriate Issue and/or Epic.
  • Change has been tested in staging and results noted in a comment on this issue.
  • A dry-run has been conducted and results noted in a comment on this issue.
  • The change execution window respects the Production Change Lock periods.
  • For C1 and C2 change issues, the change event is added to the GitLab Production calendar.
  • For C1 and C2 change issues, the Infrastructure Manager provided approval with the manager_approved label on the issue. Mention @gitlab-org/saas-platforms/inframanagers in this issue to request approval and provide visibility to all infrastructure managers.
  • For C1, C2, or blocks deployments change issues, confirm with Release managers that the change does not overlap or hinder any release process (In #production channel, mention @release-managers and this issue and await their acknowledgment.)
Edited by Matt Sroufe