2021-04-01 Addition to GitLab.com IP ranges

Production Change

Emergency maintenance

We are fast-tracking the addition of a second IP range to ensure availability for outgoing connections from GitLab.com. This update is a response to recurring scaling needs identified during recent incidents.

The additional range will be added on 2021-04-01 at 02:00 UTC.

Please ensure that the IP ranges listed in https://docs.gitlab.com/ee/user/gitlab_com/#ip-range are added to your allowlists to avoid intermittent connection issues.
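
For anyone maintaining an allowlist in code, the check above can be sketched as follows. This is a minimal illustration, not GitLab tooling: the CIDR blocks are RFC 5737 documentation placeholders rather than the real GitLab.com ranges (copy those from the docs page linked above), and the name `is_allowed` is hypothetical.

```python
import ipaddress

# Placeholder CIDR blocks from the RFC 5737 documentation ranges.
# These are NOT the GitLab.com ranges; replace them with the ranges
# published on the docs page linked above.
ALLOWLIST = ["192.0.2.0/24", "198.51.100.0/24"]


def is_allowed(source_ip, allowlist=ALLOWLIST):
    """Return True if source_ip falls inside any allowlisted CIDR block."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowlist)
```

An allowlist that holds only the first range would reject connections leaving through the second one, which is the kind of intermittent failure this notice warns about.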

Change Summary

Related to #3981 (closed) and #4099 (closed), we are going to expand the IP range used by GitLab.com with CloudNAT. The documentation change is in MR gitlab-org/gitlab!56770 (merged). The MRs for the actual application/CloudNAT changes will be noted below.

Change Details

  1. Services Impacted - ServiceWeb
  2. Change Technician - @devin
  3. Change Criticality - C2
  4. Change Type - changescheduled
  5. Change Reviewer - @msmiley
  6. Due Date - 2021-04-01 02:00 UTC
  7. Time tracking - 5 min
  8. Downtime Component - No downtime

Detailed steps for the change

Pre-Change Steps - steps to be completed before execution of the change

  • Merge the documentation MR to update the IP ranges
  • Announce a zero-downtime maintenance window and tweet about the new range
  • Proceed with the change on 2021-04-01

Change Steps - steps to take to execute the change

Estimated Time to Complete (mins) - 20 minutes

Post-Change Steps - steps to take to verify the change

  • Post-Change Step 1

Rollback

Rollback steps - steps to be taken in the event of a need to rollback this change

Estimated Time to Complete (mins) - 20 min

Monitoring

Key metrics to observe

  • Metric: Allocated ports for the Cloud NAT
    • Location: Google console
    • What changes to this metric should prompt a rollback: An extreme at either end (more than 1 million ports allocated, or fewer than 800k)
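
The rollback condition above can be written as a simple range check. This is only a sketch: the thresholds come from this issue, the function name `should_rollback` is hypothetical, and the allocated-port count is assumed to be read manually from the Google console rather than fetched through an API.

```python
# Thresholds taken from the rollback criteria in this issue.
MIN_ALLOCATED_PORTS = 800_000
MAX_ALLOCATED_PORTS = 1_000_000


def should_rollback(allocated_ports):
    """Return True when the allocated-port count is outside the expected band."""
    return (
        allocated_ports > MAX_ALLOCATED_PORTS
        or allocated_ports < MIN_ALLOCATED_PORTS
    )
```

The comparisons are strict, so a reading of exactly 800k or exactly 1 million counts as within the expected band.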

Summary of infrastructure changes

  • Does this change introduce new compute instances?
  • Does this change re-size any existing compute instances?
  • Does this change introduce any additional usage of tooling like Elasticsearch, CDNs, Cloudflare, etc.?

Summary of the above

Changes checklist

  • This issue has a criticality label (e.g. C1, C2, C3, C4) and a change-type label (e.g. changeunscheduled, changescheduled) based on the Change Management Criticalities.
  • This issue has the change technician as the assignee.
  • Pre-Change, Change, Post-Change, and Rollback steps have been filled out and reviewed.
  • Necessary approvals have been completed based on the Change Management Workflow.
  • Change has been tested in staging and results noted in a comment on this issue.
  • A dry-run has been conducted and results noted in a comment on this issue.
  • SRE on-call has been informed prior to change being rolled out. (In #production channel, mention @sre-oncall and this issue and await their acknowledgement.)
  • There are currently no active incidents.
Edited by Devin Sylva