Enable AutoDevOps for customers.gitlab.com
Problem
Parent issue in infrastructure: https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/6795
Side note from Compliance: customers will need to run in a separate VPC from gitlab-production, license, version, etc.
Proposal
Template for now - so we have this tied to our infrastructure work.
cc @tipyn
Result
Next steps (if any)
Template copied from version:
Stages:
- Build app into container
- Test stage (tests, code quality, security checks etc)
- Cluster integration (review apps, deployment)
Outstanding questions:
- Determine whether data will need to be migrated. If we can use the postgres DB deployed by Auto DevOps, that would be ideal. If we need to migrate data into it, this may be tricky. Depending on whether downtime is acceptable, we may be able to do the migration mostly manually by dumping the current DB and restoring it into the newly deployed Auto DevOps one. If the migration turns out to be too tricky, we can use Auto DevOps secret variables to set a URL pointing to the old DB.
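If we go the dump-and-restore route, the manual migration could look roughly like the sketch below. It only assembles and runs the pg_dump and pg_restore command lines. The connection URLs, the dump file name, and the function names are placeholders invented for illustration, and the choice of flags (custom archive format, --no-owner, --clean) is an assumption, not something decided in this issue. The app would need to be stopped for writes while this runs.

```python
import subprocess


def build_dump_command(old_db_url, dump_path):
    """Command line that dumps the current DB into a custom-format archive."""
    return [
        "pg_dump",
        "--format=custom",
        f"--file={dump_path}",
        f"--dbname={old_db_url}",  # pg_dump accepts a connection URI here
    ]


def build_restore_command(new_db_url, dump_path):
    """Command line that restores the archive into the Auto DevOps DB."""
    return [
        "pg_restore",
        "--no-owner",   # assumption: roles differ between old and new DB
        "--clean",      # drop existing objects before recreating them
        "--if-exists",  # avoid errors when an object is not there yet
        f"--dbname={new_db_url}",
        dump_path,
    ]


def migrate(old_db_url, new_db_url, dump_path="app.dump", run=subprocess.run):
    """Dump the old DB, then restore it into the new one, stopping on failure."""
    for command in (
        build_dump_command(old_db_url, dump_path),
        build_restore_command(new_db_url, dump_path),
    ):
        run(command, check=True)
```

Calling migrate("postgres://user:pass@old-host/db", "postgres://user:pass@new-host/db") would run both commands in order; both URLs here are hypothetical.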
Steps to get this done:
- Provision a GCP project that will run the GKE cluster (we can perhaps re-use an existing GCP project if the infra team is OK with that)
- Create a GKE cluster in that GCP project (should be done via the GitLab UI under Operations > Kubernetes, and must be done by somebody with permission to create clusters in this GCP project)
- Install cluster apps: Tiller, Runner, Ingress, Cert Manager, Prometheus
- Note down the Ingress IP address from the previous step
- Set up a wildcard DNS record like *.version.gitlap.info -> <ingress-ip-address>
- Enable Auto DevOps (Settings > CI/CD) and remove the .gitlab-ci.yml for this project
- Ensure the deployment goes through and the app is working correctly on the auto-generated Auto DevOps URL (I think it would be gitlab-com-version-gitlab-com.version.gitlap.info)
- Ensure SSL is working correctly (also ensure that http redirects to https automatically). NOTE that it can sometimes take several minutes for the certificate to be provisioned
- NOTE: The following steps will cause downtime, since we need to switch the DNS entry to ensure that Let's Encrypt successfully provisions an SSL cert
- Change the DNS record for version.gitlab.com to point to the Ingress IP address
- Set a CI variable for the project: PRODUCTION_ADDITIONAL_HOSTS=version.gitlab.com
- Re-run the master pipeline
- Confirm the app is now working correctly at version.gitlab.com, again noting that there is sometimes a delay of a few minutes while the SSL cert is provisioned
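The verification steps above (expected Auto DevOps URL, DNS pointing at the Ingress IP, http redirecting to https) could be scripted along these lines. This is a sketch under assumptions: the project path gitlab-com/version-gitlab-com is inferred from the URL guessed above, the slug rule approximates how GitLab derives CI_PROJECT_PATH_SLUG, and all function names are made up for illustration.

```python
import http.client
import re
import socket


def path_slug(project_path):
    """Approximate CI_PROJECT_PATH_SLUG: lowercase, every character outside
    a-z0-9 becomes '-', cut to 63 characters, leading/trailing '-' removed."""
    slug = re.sub(r"[^a-z0-9]", "-", project_path.lower())[:63]
    return slug.strip("-")


def auto_devops_host(project_path, base_domain):
    """Hostname Auto DevOps is expected to generate for production."""
    return f"{path_slug(project_path)}.{base_domain}"


def dns_points_to(host, expected_ip, resolve=socket.gethostbyname):
    """True if host resolves to the Ingress IP noted down earlier."""
    try:
        return resolve(host) == expected_ip
    except OSError:  # includes socket.gaierror when the name does not resolve
        return False


def redirects_to_https(host, timeout=10):
    """True if a plain http request is answered with a redirect to https."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("GET", "/")
        response = conn.getresponse()
        location = response.getheader("Location", "")
        is_redirect = response.status in (301, 302, 307, 308)
        return is_redirect and location.startswith("https://")
    finally:
        conn.close()
```

For example, auto_devops_host("gitlab-com/version-gitlab-com", "version.gitlap.info") gives the hostname guessed in the steps above, and dns_points_to and redirects_to_https can be run against it before the DNS switch, then against version.gitlab.com afterwards. Since the certificate can take several minutes to be provisioned, a failing check right after the switch is not necessarily an error.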
How will we measure success?
Edited by Devin Sylva