Migrate License Application from AWS VM to GKE with Auto DevOps

Production Change - Criticality 2 C2

Change Objective: The license.gitlab.com application currently runs on a VM in AWS. We will move it to an Auto DevOps deployment in a GKE cluster: https://gitlab.com/gitlab-org/license-gitlab-com/-/issues/79
Change Type: Operation
Services Impacted: The license.gitlab.com application. This change does not affect any component of GitLab.com.
Change Team Members: @devin @tyleramos @rdavila
Change Criticality: C2
Change Reviewer: @aamarsanaa
Tested in staging: Running in a review and staging environment here: https://gitlab.com/gitlab-org/license-gitlab-com/-/merge_requests/145
Dry-run output: If the change is done through a script, it is mandatory to have a dry-run capability in the script, run the change in dry-run mode, and output the result
Due Date: 06/08/2020 23:00 UTC (13:00 HST)
Time tracking: To estimate and record times associated with changes (including a possible rollback)

Detailed steps for the change

Before starting, the following MR needs to be merged. It deletes the .gitlab-ci.yml file so that nothing further is deployed to the old instance: https://gitlab.com/gitlab-org/license-gitlab-com/-/merge_requests/145

  1. Shut down the front end on the old node to make sure no further database changes are made:
$ sudo service chef-client stop
$ sudo mv /etc/chef /etc/chef.change-2216
$ sudo service nginx stop
  2. On the old node, run this script to copy the old database to the new Cloud SQL instance:
#!/bin/sh
# Passwords for the old local database and the new Cloud SQL instance.
export LOCAL_PASS='XXXXXXX'
export REMOTE_PASS='XXXXXXX'
# Dump the old database, comment out extension statements, and load the result into Cloud SQL.
PGPASSWORD="${LOCAL_PASS}" pg_dump -h 127.0.0.1 -U gitlab-license --clean --format=plain --no-owner --no-acl license_gitlab_com_production | sed -E 's/(DROP|CREATE|COMMENT ON) EXTENSION/-- \1 EXTENSION/g' | PGPASSWORD="${REMOTE_PASS}" psql -h 34.73.153.49 -U gitlab-license default
  3. Test the application at the production test URL and verify that the data is up to date: http://gitlab-org-license-gitlab-com.license-prd.gitlab.org/
  4. Merge the DNS Change MR
  5. Add the environment variable ADDITIONAL_HOSTS=license.gitlab.com. This must be done after the DNS change so that the certificate can be generated.
  6. Redeploy the production K8s pods from the license app's Environments page to pick up the new name and certs
  7. Test the application to make sure:
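
The test steps in the list above (the production test URL before the DNS change, license.gitlab.com after the redeploy) could be scripted roughly as follows. This is a sketch only: the function names are hypothetical, and treating any 2xx or 3xx response as healthy is an assumption about how the application answers on its root path.

```shell
#!/bin/sh
# Sketch: smoke checks for the cutover. Function names are hypothetical.

# Print the HTTP status code returned for a URL ("000" if no response).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Succeed only if the URL answers with a 2xx or 3xx status (assumed healthy).
url_ok() {
  case "$(http_status "$1")" in
    2??|3??) return 0 ;;
    *) return 1 ;;
  esac
}

# Show the subject and validity dates of the certificate served for a host.
show_cert() {
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null |
    openssl x509 -noout -subject -dates
}
```

Before the DNS change, `url_ok http://gitlab-org-license-gitlab-com.license-prd.gitlab.org/ && echo ok` checks the test URL. After the DNS change and redeploy, `dig +short license.gitlab.com` and `show_cert license.gitlab.com` show where the name resolves and which certificate is served. None of this checks that the data is up to date, which still needs a manual look.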

Rollback steps

If no data has been written to the new instance, the rollback is as simple as reverting the DNS Change MR and starting the front end on the old node:

$ sudo mv /etc/chef.change-2216 /etc/chef
$ sudo service chef-client start
$ sudo service nginx start

If data has been written, the script to copy the data back to the old instance is:

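#!/bin/sh
# Passwords for the old local database and the new Cloud SQL instance.
export LOCAL_PASS='XXXXXXX'
export REMOTE_PASS='XXXXXXX'
# Dump the Cloud SQL database, comment out extension statements, and load the result into the old database.
PGPASSWORD="${REMOTE_PASS}" pg_dump -h 34.73.153.49 -U gitlab-license --clean --format=plain --no-owner --no-acl default | sed -E 's/(DROP|CREATE|COMMENT ON) EXTENSION/-- \1 EXTENSION/g' | PGPASSWORD="${LOCAL_PASS}" psql -h 127.0.0.1 -U gitlab-license license_gitlab_com_production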

If possible, avoid copying the data back: the dump is taken with --clean, so loading it drops and recreates the objects in the old database, which makes this a destructive operation.
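
If the copy-back cannot be avoided, one way to limit the risk is to run it in stages instead of as a single pipeline, so that a failed dump never reaches the old database. The following is a minimal sketch under stated assumptions, not part of the approved change: the function names and the dump file path are hypothetical, while the hosts, user, and database names are taken from the rollback script. It relies on psql's ON_ERROR_STOP variable and --single-transaction option, so the load aborts and rolls back on the first error. That includes a DROP of an object that does not exist on the old instance, which would need to be handled by hand.

```shell
#!/bin/sh
# Sketch: staged copy-back. Function names and the dump path are hypothetical.
# LOCAL_PASS and REMOTE_PASS must be set as in the rollback script.

# Same filter as the one-line script: comment out extension statements.
filter_extensions() {
  sed -E 's/(DROP|CREATE|COMMENT ON) EXTENSION/-- \1 EXTENSION/g'
}

# Stage 1: dump the Cloud SQL database to the file given as $1.
dump_remote() {
  PGPASSWORD="$REMOTE_PASS" pg_dump -h 34.73.153.49 -U gitlab-license \
    --clean --format=plain --no-owner --no-acl default > "$1"
}

# Stage 2: load the filtered dump into the old database in one transaction,
# stopping (and rolling back) on the first error.
restore_local() {
  filter_extensions < "$1" |
    PGPASSWORD="$LOCAL_PASS" psql -v ON_ERROR_STOP=1 --single-transaction \
      -h 127.0.0.1 -U gitlab-license license_gitlab_com_production
}

# Run stage 2 only if stage 1 succeeded and produced a non-empty file.
copy_back() {
  dump_remote "$1" && [ -s "$1" ] && restore_local "$1"
}
```

A call such as `copy_back /tmp/license-copy-back.sql` (path hypothetical) would then replace the one-line pipeline. The dump file also remains on disk as a record of what was loaded.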

If we do need to roll back the changes and switch back to the old instance, we will need to revert the following MR: https://gitlab.com/gitlab-org/license-gitlab-com/-/merge_requests/145. This will allow subsequent deploys to continue going to the old instance until such time as we are ready to try the switch over again. This does not have to happen immediately while we are restoring service, but we don't want to forget this step.

Changes checklist

  • Detailed steps and rollback steps have been filled prior to commencing work
  • SRE on-call has been informed prior to change being rolled out
  • There are currently no open issues labeled as ServiceMonitoring with severities of ~S1 or ~S2
Edited by Omar Fernandez