Commit ff5e956d authored by Chris Childers
Adding details on how to migrate PostgreSQL data, and properly set the workshop to use Amazon ElastiCache for Redis
parent 1c44abe3
1 merge request: !22 Resolve Problem 2 on "Updates to the aws-3k-cloud-managed-services template"
# AWS 3K Reference Architecture with cloud managed services
:exclamation: This is a continuation of the AWS 3K deployment workshop with GET. It assumes you have a working deployment of GitLab that was deployed with GET. If you don't have this, create [an issue](https://gitlab.com/gitlab-org/professional-services-automation/tools/implementation/GET-deployment-workshop/-/issues/new?issuable_template=aws-3k-deployment-workshop.md) and complete the prerequisite steps. There is an optional step to migrate your data from the previous issue's PostgreSQL database.
> This workshop will focus on replacing some of the stateful components that were originally deployed using VMs, now with cloud managed services.
## Replacing Omnibus Components with AWS Managed Services
We are going to replace PostgreSQL/Patroni and PgBouncer with [Amazon RDS](https://aws.amazon.com/rds/), Redis/Sentinel with [Amazon ElastiCache](https://aws.amazon.com/elasticache/), and the internal HAProxy load balancer that sits in front of Praefect with a [Network Load Balancer](https://aws.amazon.com/elasticloadbalancing/network-load-balancer/). This transition will be done incrementally.
First, we'll modify the GET Terraform configuration to provision these components for us.
1. [ ] On the instance terminal, edit `gitlab-environment-toolkit/terraform/environments/3k/variables.tf` to define the ElastiCache and RDS password variables:
```
variable "elasticache_redis_password" {
  type = string
}
variable "rds_postgres_password" {
  type = string
}
```
1. [ ] Create the file `gitlab-environment-toolkit/terraform/environments/3k/outputs.tf` to expose the internal module outputs, so you can retrieve the ElastiCache, RDS and Internal Load Balancer details:
```
output "rds_postgres_connection" {
  value = try(module.gitlab_ref_arch_aws.rds_postgres_connection, [])
}
output "elasticache_redis_persistent_connection" {
  value = try(module.gitlab_ref_arch_aws.elasticache_redis_connection, [])
}
output "gitlab_internal_load_balancer_dns" {
  value = try(module.gitlab_ref_arch_aws.elb_internal.elb_internal_host, [])
}
```
1. [ ] On the instance terminal, edit `gitlab-environment-toolkit/terraform/environments/3k/environment.tf`, adding the entries GET requires to provision [ElastiCache](https://gitlab.com/gitlab-org/gitlab-environment-toolkit/-/blob/main/docs/environment_advanced_services.md#aws-elasticache) and [RDS](https://gitlab.com/gitlab-org/gitlab-environment-toolkit/-/blob/main/docs/environment_advanced_services.md#aws-rds) and to deploy an Internal Load Balancer on ELB:
```
# ILB
elb_internal_create = true
# RDS
rds_postgres_instance_type = "m5.2xlarge"
rds_postgres_password = var.rds_postgres_password
# ElastiCache
elasticache_redis_node_count = 2
elasticache_redis_instance_type = "m5.large"
elasticache_redis_password = var.elasticache_redis_password
```
1. [ ] Remove (or comment out) the Redis node settings in the same file. If these nodes aren't removed, the Ansible playbook will not update the `/etc/gitlab/gitlab.rb` file with the appropriate Redis configuration. More details can be found in [Issue 758](https://gitlab.com/gitlab-org/professional-services-automation/tools/implementation/get-deployment-workshop/-/issues/758#problem-2-redis-to-elasticache-instructions).
```
# Redis
# redis_node_count = 3
# redis_instance_type = "m5.large"
```
1. [ ] Before running Terraform, export the Terraform variables `elasticache_redis_password` and `rds_postgres_password` as `TF_VAR_`-prefixed environment variables. We can reuse the password already set in `$GITLAB_PASSWORD`:
```
export TF_VAR_rds_postgres_password=$GITLAB_PASSWORD
export TF_VAR_elasticache_redis_password=$GITLAB_PASSWORD
```
1. [ ] Run a Toolkit Docker container, adding the new variables. The command should look like the following:
```
docker run -it \
-v /home/ec2-user/gitlab-environment-toolkit/keys:/gitlab-environment-toolkit/keys \
-v /home/ec2-user/gitlab-environment-toolkit/ansible/environments/3k:/gitlab-environment-toolkit/ansible/environments/3k \
-v /home/ec2-user/gitlab-environment-toolkit/terraform/environments/3k:/gitlab-environment-toolkit/terraform/environments/3k \
-e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
-e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
-e GITLAB_PASSWORD=$GITLAB_PASSWORD \
-e TF_VAR_rds_postgres_password=$GITLAB_PASSWORD \
-e TF_VAR_elasticache_redis_password=$GITLAB_PASSWORD \
registry.gitlab.com/gitlab-org/gitlab-environment-toolkit:latest
```
1. [ ] Inside the container run the commands below:
- [ ] Install Terraform if it's not present with `mise install terraform -y`
- [ ] Run `cd /gitlab-environment-toolkit/terraform/environments/3k/`
- [ ] Run `terraform apply`. This should add 20 new resources and usually takes 8-10 minutes
- [ ] Run `terraform output`. Copy the output and post as a comment on this issue
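The password exports in the steps above are easy to miss, and an empty `$GITLAB_PASSWORD` would likely hand Terraform empty password values. The following is a minimal sketch of a guard that could be run on the instance before starting the container. The function name `export_tf_passwords` is hypothetical and is not part of GET.

```shell
# Hypothetical helper (not part of GET): refuse to export empty passwords.
export_tf_passwords() {
  # Stop early when GITLAB_PASSWORD is unset or empty.
  if [ -z "${GITLAB_PASSWORD:-}" ]; then
    echo "GITLAB_PASSWORD is not set; export it first" >&2
    return 1
  fi
  # Same two exports as in the workshop step, reusing GITLAB_PASSWORD.
  export TF_VAR_rds_postgres_password="$GITLAB_PASSWORD"
  export TF_VAR_elasticache_redis_password="$GITLAB_PASSWORD"
}
```

Calling `export_tf_passwords` before `docker run` returns a non-zero status instead of silently exporting empty values.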
> :exclamation: Your GitLab instance is unavailable at this point due to the removed Redis nodes. Do not do this in a production environment.
Now let's configure GitLab to use RDS, ElastiCache, and the ELB. We are going to do that in four steps:
1. Configure GitLab to use the new resources
1. Check that the application is working
1. Optionally migrate the data
1. Remove the remaining unnecessary components
1. [ ] On the instance, edit `/home/ec2-user/gitlab-environment-toolkit/ansible/environments/3k/inventory/vars.yml`, adding the following lines:
- [ ] Replace `<internal_lb_host>`, `<redis_host>` and `<postgres_host>` with the Terraform outputs `gitlab_internal_load_balancer_dns`, `elasticache_redis_address` and `rds_host`, respectively.
```
postgres_external: true
internal_lb_host: "<internal_lb_host>"
redis_host: "<redis_host>"
postgres_host: "<postgres_host>"
```
Something similar to the following:
```
postgres_external: true
internal_lb_host: "gitlab-afonseca-3k-paris-int-4353bac0a6a67bb2.elb.eu-west-3.amazonaws.com"
redis_host: "master.gitlab-afonseca-3k-paris-redis.jrodnd.euw3.cache.amazonaws.com"
postgres_host: "gitlab-afonseca-3k-paris-rds.coy2o6w62emn.eu-west-3.rds.amazonaws.com"
```
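To avoid typos when copying the three hostnames by hand, the four lines can also be generated. This is only a sketch: the function name `render_external_service_vars` is hypothetical, and the hostnames must still be taken from your own `terraform output`.

```shell
# Hypothetical helper: print the four vars.yml lines for the given hosts.
# Arguments: <internal_lb_host> <redis_host> <postgres_host>
render_external_service_vars() {
  if [ "$#" -ne 3 ]; then
    echo "usage: render_external_service_vars <internal_lb_host> <redis_host> <postgres_host>" >&2
    return 1
  fi
  printf 'postgres_external: true\n'
  printf 'internal_lb_host: "%s"\n' "$1"
  printf 'redis_host: "%s"\n' "$2"
  printf 'postgres_host: "%s"\n' "$3"
}
```

Review the printed lines first, then append them to `vars.yml` by redirecting the output with `>>`.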
1. [ ] :warning: Optional - Back up your PostgreSQL data. We'll be replacing the database. Git data will still live on Gitaly, but PostgreSQL holds the metadata required to render it in the web application.
1. Access a rails node and validate S3 is being used as the backup location
- [ ] run `ssh -i /home/ec2-user/gitlab-environment-toolkit/keys/id_rsa ubuntu@your_rails_public_dns`
- [ ] run `sudo vim /etc/gitlab/gitlab.rb`
- [ ] retrieve the name of your S3 bucket from `gitlab_rails['backup_upload_remote_directory']`
1. [ ] Perform a [GitLab Backup](https://docs.gitlab.com/ee/administration/backup_restore/backup_gitlab.html#backup-command) by running `sudo gitlab-backup create`
1. [ ] Validate the backup is in the bucket listed above
1. [ ] Inside the container let's run the Ansible scripts
- [ ] Run `cd /gitlab-environment-toolkit/ansible`
- [ ] List the hosts Ansible will target with `ansible all -m ping -i environments/3k/inventory --list-hosts`. Note that `--list-hosts` only prints the matching hosts; drop the flag to actually ping them
- [ ] Run `ansible-playbook -i environments/3k/inventory/ playbooks/all.yml`
- [ ] If you are running a GET version older than `2.0.1`, manually apply the fix for the task `Get Omnibus Postgres Primary` from https://gitlab.com/gitlab-org/gitlab-environment-toolkit/-/merge_requests/555
- [ ] Once the process is complete exit the container with `exit`
1. [ ] Access a Rails node and confirm that it is using RDS and ElastiCache. In the terminal:
- [ ] run `ssh -i /home/ec2-user/gitlab-environment-toolkit/keys/id_rsa ubuntu@your_rails_public_dns`
- [ ] run `sudo vim /etc/gitlab/gitlab.rb`
- [ ] Check the configurations `gitlab_rails['db_host']`, `gitlab_rails['redis_host']` and `gitaly_address`. They must point to the `postgres_host`, `redis_host` and `internal_lb_host` values you set in `vars.yml` earlier
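As an alternative to opening the file in an editor, the three settings can be printed with `grep`. This is a sketch only: the function name `show_external_service_settings` is hypothetical, and the pattern simply matches the setting names mentioned in the step above.

```shell
# Hypothetical helper: print the settings the step above asks to check.
# Argument: path to gitlab.rb (defaults to /etc/gitlab/gitlab.rb).
show_external_service_settings() {
  config_file="${1:-/etc/gitlab/gitlab.rb}"
  # Match any line that mentions one of the three setting names.
  grep -E "db_host|redis_host|gitaly_address" "$config_file"
}
```

If your user cannot read the file on the Rails node, the equivalent one-liner is `sudo grep -E "db_host|redis_host|gitaly_address" /etc/gitlab/gitlab.rb`.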
1. [ ] Test the GitLab application at `http://<your_elastic_ip>` (the same value provided as `external_url` in `vars.yml`). Log in as user `root` with the password set in the variable `GITLAB_PASSWORD`. If everything works as expected, move to the next step.
1. [ ] :warning: Optional - Restore your PostgreSQL data if you backed it up earlier.
1. [ ] Access a rails node to perform the [GitLab restore](https://docs.gitlab.com/ee/administration/backup_restore/restore_gitlab.html)
- [ ] run `ssh -i /home/ec2-user/gitlab-environment-toolkit/keys/id_rsa ubuntu@your_rails_public_dns`
1. [ ] Install the `awscli` following [AWS instructions](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html)
1. [ ] Use the awscli to pull the backup to the local instance
- [ ] run `sudo aws s3 cp s3://<your_bucket>/<your_backup>.tar /var/opt/gitlab/backups/`
- [ ] set the ownership to the git user with `sudo chown git:git /var/opt/gitlab/backups/<your_backup>.tar`
1. [ ] Stop the GitLab processes that access the database
- [ ] `sudo gitlab-ctl stop puma`
- [ ] `sudo gitlab-ctl stop sidekiq`
1. [ ] Restore your backup (notice the `_gitlab_backup.tar` suffix is dropped)
- [ ] `sudo gitlab-backup restore BACKUP=<your_backup_without_gitlab_backup.tar>`
- Example, if your backup is `11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar`, you would run `sudo gitlab-backup restore BACKUP=11493107454_2018_04_25_10.6.4-ce`
1. [ ] Restart the GitLab service with `sudo gitlab-ctl restart` and check it with `sudo gitlab-rake gitlab:check SANITIZE=true`
1. [ ] Validate your project data is now accessible through the web application
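The `BACKUP=` value used in the restore step above can be derived from the archive name with shell parameter expansion rather than edited by hand. A minimal sketch, using the hypothetical function name `backup_id_from_archive`:

```shell
# Hypothetical helper: derive the BACKUP= value from a backup archive name
# by dropping any directory part and the _gitlab_backup.tar suffix.
backup_id_from_archive() {
  archive_name="$(basename "$1")"
  printf '%s\n' "${archive_name%_gitlab_backup.tar}"
}
```

For the example archive `11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar`, this prints `11493107454_2018_04_25_10.6.4-ce`, which is the value passed as `BACKUP=` in the restore command.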
1. [ ] Remove the PostgreSQL, PgBouncer and HAProxy Internal Load Balancer nodes by editing `/home/ec2-user/gitlab-environment-toolkit/terraform/environments/3k/environment.tf` and commenting out or removing the lines below:
```
# postgres_node_count = 3
# postgres_instance_type = "m5.large"
# pgbouncer_node_count = 3
# pgbouncer_instance_type = "c5.large"
# haproxy_internal_node_count = 1
# haproxy_internal_instance_type = "c5.large"
```
1. [ ] Start the Toolkit container with the command below:
```
docker run -it \
-v /home/ec2-user/gitlab-environment-toolkit/keys:/gitlab-environment-toolkit/keys \
-v /home/ec2-user/gitlab-environment-toolkit/ansible/environments/3k:/gitlab-environment-toolkit/ansible/environments/3k \
-v /home/ec2-user/gitlab-environment-toolkit/terraform/environments/3k:/gitlab-environment-toolkit/terraform/environments/3k \
-e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
-e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
-e GITLAB_PASSWORD=$GITLAB_PASSWORD \
-e TF_VAR_rds_postgres_password=$GITLAB_PASSWORD \
-e TF_VAR_elasticache_redis_password=$GITLAB_PASSWORD \
registry.gitlab.com/gitlab-org/gitlab-environment-toolkit:latest
```
1. [ ] Inside the container run the commands that follow:
- [ ] `cd /gitlab-environment-toolkit/terraform/environments/3k`
- [ ] `terraform apply`. This should destroy 19 resources
This should not affect the application, since the components are no longer in use. Once you are done with the environment, don't forget to remove all the deployed resources on AWS by following the next and final step.
1. [ ] Inside the container, once you are done with your environment, don't forget to tear it down:
- [ ] `cd /gitlab-environment-toolkit/terraform/environments/3k`
- [ ] `terraform destroy`
- [ ] Terminate GET instance manually through AWS console