Skip to content

Patroni and PgBouncer support for decomposed Rails database

Hossein Pursultani requested to merge 7353-decomposed-db-ha-poc into master

What does this MR do?

This MR enables Patroni and PgBouncer to work with the decomposed Rails database configuration.

It introduces the following changes:

  1. Stores the name of the decomposed databases in public attributes.
  2. gitlab-ctl pgb-notify accesses the public attributes to retrieve the name of Rails databases.
  3. gitlab-ctl pgb-notify points all Rails databases to use the new host and port.
  4. Omnibus creates the PgBouncer user with the auth function in all Rails databases.

Limitation

This is a workaround to address a very specific problem. For a complete solution we need to follow the recommendations of Blueprint to support multiple databases (!7172 - merged) and refactor the related parts.

This workaround only supports a scenario in which all databases are hosted on a single database cluster.

Test plan

You can use the Gitlab HA Test Kit to quickly run the following test scenarios using Docker Compose.

The following scenarios show that:

  1. The single-database configuration still works
  2. Upgrading from a prior version does not break
  3. The multi-database configuration works

Scenario 1: Install new instance with single database

  • New installation, with no prior data
  • Rails database configuration points to a single database
  • This is the current behaviour
export TEST_TAG='7353-decomposed-db-ha-poc'

## Or use SHA256 to pin the tag:
# export TEST_TAG='7353-decomposed-db-ha-poc@sha256:cfa90d75f86a9ca4e1b87fdc50625c8743c098e528f5c4f4658554872b0d4e25'

source script.sh

### Install ###
setup

## Both databases are the same
enter app cat /var/opt/gitlab/gitlab-rails/etc/database.yml | yq '.production[] | .database'

# gitlabhq_production
# gitlabhq_production

## Enter PgBouncer password, default is `secret`
configure_pgb

## Migrations must run successfully
run_migrations

## `gitlabhq_production` database must exist
enter patroni gitlab-psql -c '\l'

#        Name         |    Owner    | Encoding | Collate |  Ctype  |        Access privileges
#---------------------+-------------+----------+---------+---------+---------------------------------
# gitlabhq_production | gitlab      | UTF8     | C       | C.UTF-8 |

## `pg_shadow_lookup` function must exist
enter patroni gitlab-psql -c '\df' | grep pg_shadow_lookup

# public | pg_shadow_lookup                               | record                                                  | i_username text, OUT username text, OUT password text                     | func

## GitLab instance is usable
enter app curl -v -H 'Host: gitlab.test' GET 'http://localhost/-/readiness?all=1' | jq '.db_check'

### Cleanup ###
teardown

Scenario 2: Upgrade existing instance with single database

  • Upgrade existing installation, with prior data
  • Rails configuration points to a single database

Step 1: Install a new instance from an older version

export TEST_TAG='master'

source script.sh

### Install ###
setup

## Both databases are the same
enter app cat /var/opt/gitlab/gitlab-rails/etc/database.yml | yq '.production[] | .database'

# gitlabhq_production
# gitlabhq_production

## Enter PgBouncer password, default is `secret`
configure_pgb

## Migrations must run successfully
run_migrations

## GitLab instance is usable
enter app curl -v -H 'Host: gitlab.test' GET 'http://localhost/-/readiness?all=1' | jq '.db_check'

Step 2: Upgrade the previous instance

## Shut down Patroni cluster:
##   1. Find the leader, look for "I am (00000000000) the leader with the lock". For example `patroni/1` is the leader.
##   2. Shutdown replicas first
##   3. Shut down the leader last

logs patroni

gitlab_ctl patroni/2 stop patroni
gitlab_ctl patroni/2 stop patroni
gitlab_ctl patroni/1 stop patroni

### Upgrade ###

export TEST_TAG='7353-decomposed-db-ha-poc'

## Or use SHA256 to pin the tag:
# export TEST_TAG='7353-decomposed-db-ha-poc@sha256:cfa90d75f86a9ca4e1b87fdc50625c8743c098e528f5c4f4658554872b0d4e25'

setup

## Wait for upgrade to settle. Make sure:
##   1. Consul cluster has reached quorum and service check passes (see logs and use `/opt/gitlab/embedded/bin/consul info`)
##   2. Patroni has elected a leader (see `gitlab-ctl patroni members`)

## Migrations must run successfully
run_migrations

## GitLab instance is usable
enter app curl -v -H 'Host: gitlab.test' GET 'http://localhost/-/readiness?all=1' | jq '.db_check'

### Cleanup ###
teardown

Scenario 3: Install new instance with two databases

  • Must be a new installation, with no prior data
  • Rails database configuration points to a different databases
  • This is the new behaviour
export TEST_TAG='7353-decomposed-db-ha-poc'

## Or use SHA256 to pin the tag:
# export TEST_TAG='7353-decomposed-db-ha-poc@sha256:cfa90d75f86a9ca4e1b87fdc50625c8743c098e528f5c4f4658554872b0d4e25'

export TEST_APP_CFG='app-multidb.rb'
export TEST_PATRONI_CFG='patroni-multidb.rb'
export TEST_PGBOUNCER_CFG='pgbouncer-multidb.rb'

source script.sh

### Install ###
setup

## Databases are different
enter app cat /var/opt/gitlab/gitlab-rails/etc/database.yml | yq '.production[] | .database'

# gitlabhq_production
# gitlabhq_production_ci

## Enter PgBouncer password, default is `secret`
configure_pgb

## Check PgBouncer has two databases different databases pointing to the same host
enter pgbouncer cat /var/opt/gitlab/consul/databases.ini

# [databases]
#
# gitlabhq_production_ci = host=320ccb882a4e auth_user=pgbouncer dbname=gitlabhq_production_ci
#
# gitlabhq_production = host=320ccb882a4e auth_user=pgbouncer dbname=gitlabhq_production

## Migrations must run successfully
run_migrations

## Both `gitlabhq_production_ci` and `gitlabhq_production` databases must exist
enter patroni gitlab-psql -c '\l'

#          Name          |    Owner    | Encoding | Collate |  Ctype  |        Access privileges
#------------------------+-------------+----------+---------+---------+---------------------------------
# gitlabhq_production    | gitlab      | UTF8     | C       | C.UTF-8 |
# gitlabhq_production_ci | gitlab      | UTF8     | C       | C.UTF-8 |

## `pg_shadow_lookup` function must exist in both databases
enter patroni gitlab-psql -c '\df' -d gitlabhq_production | grep pg_shadow_lookup

# public | pg_shadow_lookup                               | record                                                  | i_username text, OUT username text, OUT password text                     | func

enter patroni gitlab-psql -c '\df' -d gitlabhq_production_ci | grep pg_shadow_lookup

# public | pg_shadow_lookup                               | record                                                  | i_username text, OUT username text, OUT password text                     | func

## GitLab instance is usable
enter app curl -v -H 'Host: gitlab.test' GET 'http://localhost/-/readiness?all=1' | jq '.db_check'

## Switch Patroni leader
##   1. List members and select a replica
##   2. Switch over to replica
##   3. Wait for replica to take over
gitlab_ctl patroni patroni members

# Current cluster topology
# + Cluster: postgresql-ha (7333047514023577262) --+----+-----------+
# | Member       | Host        | Role    | State   | TL | Lag in MB |
# +--------------+-------------+---------+---------+----+-----------+
# | 04368fbe42dc | 172.28.28.7 | Replica | running |  1 |         0 |
# | 13c4c0b44e89 | 172.28.28.6 | Replica | running |  1 |         0 |
# | 320ccb882a4e | 172.28.28.5 | Leader  | running |  1 |           |
# +--------------+-------------+---------+---------+----+-----------+

gitlab_ctl patroni patroni switchover 13c4c0b44e89 # <= "REPLICA HOST NAME"

# Successfully switched over to "13c4c0b44e89"

# Wait
gitlab_ctl patroni patroni members

# + Cluster: postgresql-ha (7333047514023577262) --+----+-----------+
# | Member       | Host        | Role    | State   | TL | Lag in MB |
# +--------------+-------------+---------+---------+----+-----------+
# | 04368fbe42dc | 172.28.28.7 | Replica | running |  2 |         2 |
# | 13c4c0b44e89 | 172.28.28.6 | Leader  | running |  2 |           |
# | 320ccb882a4e | 172.28.28.5 | Replica | running |  2 |         2 |
# +--------------+-------------+---------+---------+----+-----------+

## PgBouncer is updated and both databases point to the new leader
enter pgbouncer cat /var/opt/gitlab/consul/databases.ini

# [databases]
#
# gitlabhq_production_ci = host=13c4c0b44e89 auth_user=pgbouncer dbname=gitlabhq_production_ci
#
# gitlabhq_production = host=13c4c0b44e89 auth_user=pgbouncer dbname=gitlabhq_production

## GitLab instance is usable
enter app curl -v -H 'Host: gitlab.test' GET 'http://localhost/-/readiness?all=1' | jq '.db_check'

### Cleanup ###
teardown

Related issues

Closes #7353 (closed)

Checklist

See Definition of done.

For anything in this list which will not be completed, please provide a reason in the MR discussion.

Required

  • MR title and description are up to date, accurate, and descriptive.
  • MR targeting the appropriate branch.
  • Latest Merge Result pipeline is green.
  • When ready for review, MR is labeled "~workflow::ready for review" per the Distribution MR workflow.

For GitLab team members

If you don't have access to this, the reviewer should trigger these jobs for you during the review process.

  • The manual Trigger:ee-package jobs have a green pipeline running against latest commit.
  • If config/software or config/patches directories are changed, make sure the build-package-on-all-os job within the Trigger:ee-package downstream pipeline succeeded.
  • If you are changing anything SSL related, then the Trigger:package:fips manual job within the Trigger:ee-package downstream pipeline must succeed.
  • If CI configuration is changed, the branch must be pushed to dev.gitlab.org to confirm regular branch builds aren't broken.

Expected (please provide an explanation if not completing)

  • Test plan indicating conditions for success has been posted and passes.
  • Documentation created/updated.
  • Tests added.
  • Integration tests added to GitLab QA.
  • Equivalent MR/issue for the GitLab Chart opened.
  • Validate potential values for new configuration settings. Formats such as integer 10, duration 10s, URI scheme://user:passwd@host:port may require quotation or other special handling when rendered in a template and written to a configuration file.
Edited by Hossein Pursultani

Merge request reports