Migrate status.gitlab.com to use www.gitlabstatus.com for resilience
Production Change
Change Summary
If there is a Cloudflare outage, reaching our status page should not be impeded. This change migrates status.gitlab.com to www.gitlabstatus.com, which does not depend on Google Cloud or Cloudflare to operate.
Originating Issue: https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/10879#note_404888794
Change Details
- Services Impacted - status.gitlab.com
- Change Technician - @cmcfarland
- Change Criticality - C3
- Change Type - changescheduled
- Change Reviewer - DRI for the review of this change
- Due Date - 2020-10-30 18:00:00 UTC
- Time tracking - Time, in minutes, needed to execute all change steps, including rollback
- Downtime Component - 30 minutes
Detailed steps for the change
Pre-Change Steps - steps to be completed before execution of the change
Estimated Time to Complete (mins) - 5
- Approve and Merge: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/merge_requests/2030
- Make sure you have a copy of the status.gitlab.com certificate, key, and intermediate certificates in case we need to roll back.
- Get Approved: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/merge_requests/2155
Change Steps - steps to take to execute the change
Estimated Time to Complete (mins) - 30
- Remove status.gitlab.com as the custom web domain on status.io.
- Add www.gitlabstatus.com as the custom web domain on status.io.
- Request a certificate for the custom web domain on status.io. Use the "Dedicated cert from AWS" option.
- Update the status.gitlab.com CNAME by applying this MR: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/merge_requests/2155
Post-Change Steps - steps to take to verify the change
Estimated Time to Complete (mins) - Estimated Time to Complete in Minutes
- Verify that status.gitlab.com properly redirects to www.gitlabstatus.com:
  curl -I http://status.gitlab.com
- Verify that gitlabstatus.com properly redirects to www.gitlabstatus.com:
  curl -I https://gitlabstatus.com
- Verify that www.gitlabstatus.com has a proper SSL certificate in place:
  echo | openssl s_client -showcerts -servername www.gitlabstatus.com -connect www.gitlabstatus.com:443 2>/dev/null | openssl x509 -inform pem -noout -text | grep "Subject"
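The Subject line alone may not show every hostname a certificate covers, since hostnames are usually listed in the Subject Alternative Name extension. The sketch below is an illustration, not part of the approved change steps. It reads the text output of `openssl x509 -noout -text` on stdin and looks for an exact `DNS:` entry. The function name `cert_covers_host` is hypothetical, and wildcard entries are not handled.

```shell
# cert_covers_host HOST
# Reads `openssl x509 -noout -text` output on stdin and succeeds only if
# HOST appears as an exact DNS: entry (assumption: the certificate lists
# its hostnames in the Subject Alternative Name extension).
cert_covers_host() {
  host="$1"
  # Put every comma- or space-separated token on its own line,
  # keep the DNS: entries without their prefix, then match HOST exactly.
  tr ', ' '\n\n' | sed -n 's/^DNS://p' | grep -Fxq -- "$host"
}

# Possible usage against the live endpoint (requires network access):
# echo | openssl s_client -servername www.gitlabstatus.com \
#     -connect www.gitlabstatus.com:443 2>/dev/null \
#   | openssl x509 -noout -text \
#   | cert_covers_host www.gitlabstatus.com && echo "certificate covers host"
```

The exact match avoids a false positive from a longer name that merely contains the hostname as a substring.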
Rollback
Rollback steps - steps to be taken in the event of a need to rollback this change
Estimated Time to Complete (mins) - Estimated Time to Complete in Minutes
- Remove any status.io configuration that has been modified.
- Re-implement the custom web domain of status.gitlab.com in status.io and provide the SSL certificate.
- Revert the DNS change for status.gitlab.com.
Monitoring
Key metrics to observe
- Metric: GitLabStatus domain response
- Location: http://gitlabstatus.com
- What changes to this metric should prompt a rollback: Any deviation from the redirect behavior below:
- http://gitlabstatus.com returns a 301 redirect to https://gitlabstatus.com
- https://gitlabstatus.com returns a 301 redirect to https://www.gitlabstatus.com
- www.gitlabstatus.com is a CNAME for the status.io page, and https://www.gitlabstatus.com should return the page with a valid SSL certificate
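The redirect expectations above can be checked mechanically. The following is a minimal sketch, assuming the response headers come from `curl -sI`; the helper name `check_redirect` is hypothetical and not part of any existing tooling. It succeeds only when the status is 301 and the `Location` header matches the expected URL, ignoring a trailing slash.

```shell
# check_redirect EXPECTED_URL
# Reads raw HTTP response headers (as printed by `curl -sI URL`) on stdin.
# Succeeds only if the status code is 301 and Location equals EXPECTED_URL.
check_redirect() {
  expected="${1%/}"
  headers="$(tr -d '\r')"   # strip the CR from CRLF line endings
  # The status code is the second field of the first line (HTTP/1.1 or HTTP/2).
  status="$(printf '%s\n' "$headers" | awk 'NR == 1 { print $2 }')"
  # Header names are case-insensitive, so compare in lower case.
  location="$(printf '%s\n' "$headers" | awk 'tolower($1) == "location:" { print $2; exit }')"
  if [ "$status" != "301" ]; then
    echo "expected status 301, got '${status}'" >&2
    return 1
  fi
  if [ "${location%/}" != "$expected" ]; then
    echo "expected Location ${expected}, got '${location}'" >&2
    return 1
  fi
}

# Possible usage against the live endpoints (requires network access):
# curl -sI http://gitlabstatus.com  | check_redirect https://gitlabstatus.com
# curl -sI https://gitlabstatus.com | check_redirect https://www.gitlabstatus.com
```

A non-zero exit from either call would correspond to the rollback condition described in this section.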
Summary of infrastructure changes
- Does this change introduce new compute instances?
- Does this change re-size any existing compute instances?
- Does this change introduce any additional usage of tooling like Elastic Search, CDNs, Cloudflare, etc?
Summary of the above
Changes checklist
- This issue has a criticality label (e.g. C1, C2, C3, C4) and a change-type label (e.g. changeunscheduled, changescheduled).
- This issue has the change technician as the assignee.
- Pre-Change, Change, Post-Change, and Rollback steps have been filled out and reviewed.
- Necessary approvals have been completed based on the Change Management Workflow.
- Change has been tested in staging and results noted in a comment on this issue.
- A dry-run has been conducted and results noted in a comment on this issue.
- SRE on-call has been informed prior to change being rolled out. (In #production channel, mention @sre-oncall and this issue.)
- There are currently no active incidents.
Edited by Cameron McFarland