Move ephemeral VMs created by srm7 to a new GCP project
Rough steps
- Create a project: Runbook https://gitlab.com/gitlab-com/runbooks/-/blob/master/docs/uncategorized/gcp-project.md#new-project-creation
  - Create GCP project: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/merge_requests/2388
  - Add GitLab CI to auto apply: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/merge_requests/2389
- Check how large our quotas are (so we know at what usage we would hit them). This includes both API quotas and resource quotas.
  - Quotas difference: https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/12863#note_531309133
    - In-Use IP Address
    - Read requests per 100 seconds
    - Read requests per 100 seconds per user
    - Operation read requests per 100 seconds
    - Operation read requests per 100 seconds per user
    - Compute Engine API - CPUs
    - Compute Engine API - Committed CPUs
    - Heavy-weight read requests per 100 seconds
    - Heavy-weight read requests per 100 seconds per user
    - Heavy-weight mutation requests per 100 seconds
    - Heavy-weight mutation requests per 100 seconds per user
- Prepare firewall rules for the VPC
  - Add default network/service accounts: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/merge_requests/2390
- Create service accounts in gitlab-ci and share them with gitlab-ci-plan-free-7 as shown in https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/12863#note_535447649
  - Create service account inside of gitlab-ci: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/merge_requests/2403
  - Invite service account of gitlab-ci to gitlab-ci-plan-free-7 project: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/merge_requests/2406
- Allow the new project to access the images stored in the gitlab-ci project (see the GCP documentation). This should probably be done inside Terraform.
  - Service account inside of gitlab-ci has roles/compute.imageUser: https://ops.gitlab.net/gitlab-com/gitlab-com-infrastructure/-/merge_requests/2403/diffs
- Create firewall rules for the ephemeral VMs (common with our other projects; we will extract them into a module later)
- Create a VPC peering between the gitlab-ci project and the new one.
- Configure access to the GCS bucket in the gitlab-ci project (so that all jobs can still share the cache). (Since the Runner Manager will exist in the same project as the cache bucket, we don't need this one: we send a pre-signed URL.)
- Configure https://ops.gitlab.net/gitlab-com/gl-infra/ci-project-cleaner to clean up machines in the new project
  - Refactor merge request: https://ops.gitlab.net/gitlab-com/gl-infra/ci-project-cleaner/-/merge_requests/7
  - Merge request to add new project:
- Configure access for the team that is managing runners (so SREs + Steve, Georgi, Tomasz, and probably Elliot) to this new GCP project. We have the gcp-ci-ops-sg@gitlab.com GCP group. We should ensure that all people listed here are added to it and that this group is added with proper permissions to the projects where CI runners are operating.
- Configure the Prometheus servers for autoscaled VMs (not 100% needed for the proof-of-concept tests, but definitely needed for the final configuration). We should make sure that these servers are added to our Thanos cluster, so that Prometheus federation is no longer used to collect metrics.
  - terraform MR here
  - k8s workloads helmfiles MR here
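For the quota check in the steps above, one way to see which limits differ between gitlab-ci and the new project is to diff the `quotas` entries (each with `metric`, `limit`, `usage`) that `gcloud compute project-info describe` and `gcloud compute regions describe` print with `--format=json`. This is only a sketch: the function name `diff_quotas` and the sample numbers are made up for illustration, and it covers Compute Engine resource quotas only, not the per-100-seconds API rate quotas.

```python
def diff_quotas(old_quotas, new_quotas):
    """Return {metric: (old_limit, new_limit)} for metrics whose limits differ.

    A metric that is missing on one side is reported with None for that side.
    """
    old = {q["metric"]: q["limit"] for q in old_quotas}
    new = {q["metric"]: q["limit"] for q in new_quotas}
    return {
        metric: (old.get(metric), new.get(metric))
        for metric in sorted(old.keys() | new.keys())
        if old.get(metric) != new.get(metric)
    }


# Made-up sample data, NOT the real limits of either project.
gitlab_ci = [
    {"metric": "CPUS", "limit": 5000.0, "usage": 1200.0},
    {"metric": "IN_USE_ADDRESSES", "limit": 800.0, "usage": 300.0},
]
new_project = [
    {"metric": "CPUS", "limit": 72.0, "usage": 0.0},
    {"metric": "IN_USE_ADDRESSES", "limit": 800.0, "usage": 0.0},
]

for metric, (old_limit, new_limit) in diff_quotas(gitlab_ci, new_project).items():
    print(f"{metric}: gitlab-ci={old_limit} new={new_limit}")
```

Only metrics with differing limits are printed, which keeps the list of quota increases to request short.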
Extra/Follow-ups (needs triage)
- Update CIDR https://gitlab.com/gitlab-com/runbooks/-/blob/master/docs/ci-runners/README.md#network-info. Done in gitlab-com/runbooks!3446 (merged)
- Revisit CIDR of available IPs
- Prepare IP plan for the target configuration that we're designing (central project with Runner Managers connected directly with ephemeral VMs networks in different projects; k8s/prometheus network in the projects connecting with the ephemeral VMs network only within one project).
- VPC Flow logs 👉 https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/13104
- Blocking IP firewall rules
- Extract the common firewall rules into a module 👉 Module created in https://gitlab.com/gitlab-com/gitlab-com-infrastructure/-/tree/68b53a3a14b736b77d576833961fb330406bafe6/modules/runner-manager-ephemeral-vms-project. Follow-
  - Create module
  - Refactor environments/ci-plan-free-7 to use this module
  - Refactor environments/org-ci to use this module
- org-ci image sharing should be done inside Terraform as well 👉
- Remove unused network tags: https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/12863#note_538842692 👉 https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/13105
- Add runbook on how to pause a runner manager inside of the GitLab UI 👉 https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/13106
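
For the IP plan follow-up, VPC peering requires that the subnet ranges of the peered networks do not overlap, so a candidate plan can be checked mechanically. A minimal sketch using Python's standard `ipaddress` module; the helper name `find_overlaps`, the network names, and the CIDR ranges are placeholders, not the ranges actually in use.

```python
import ipaddress
from itertools import combinations


def find_overlaps(plan):
    """Return (name_a, name_b) pairs from the plan whose CIDR ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in plan.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


# Placeholder plan: names and ranges are illustrative only.
plan = {
    "runner-managers": "10.0.0.0/21",
    "ephemeral-vms": "10.10.0.0/16",
    "k8s-prometheus": "10.10.32.0/20",
}

print(find_overlaps(plan))
```

With these placeholder ranges the check flags the `ephemeral-vms` and `k8s-prometheus` pair, because the /20 sits inside the /16. An empty list means no two ranges in the plan collide.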
Edited by Steve Xuereb