GitLab Breaks With a 500 Error When Using Aurora RDS With Database Read Load Balancing Enabled

When GitLab uses Aurora RDS for the GitLab database, there is a known issue where GitLab does not use the Aurora read replicas for reads. Before GitLab allows a read replica to serve database read operations, it performs an extra step to validate that the replica's replication latency is acceptable. Aurora responds to the call GitLab makes for this information with different data than other PostgreSQL implementations return.
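To illustrate the general idea of a replication latency gate, here is a minimal sketch. It is not GitLab's actual implementation: the method name `replica_usable?`, the constant `MAX_LAG_SECONDS`, the 60 second value, and the choice to reject a replica whose response cannot be interpreted are all assumptions made for the example.

```ruby
# Illustrative only: not GitLab's actual implementation.
# MAX_LAG_SECONDS and replica_usable? are hypothetical names and values.
MAX_LAG_SECONDS = 60

# lag_seconds is the replication lag a replica reported,
# or nil when its response could not be interpreted.
def replica_usable?(lag_seconds)
  # Assumption: without a usable answer, the replica is not trusted for reads.
  return false if lag_seconds.nil?

  lag_seconds <= MAX_LAG_SECONDS
end

replica_usable?(2.5)  # a replica reporting low lag is accepted
replica_usable?(nil)  # a replica with an uninterpretable response is skipped
```

Under these assumptions, a database that answers the lag question in an unexpected form would never pass the gate, so no read traffic would reach its replicas.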

Error Condition Root Cause

Conditions Under Which GitLab Generates a 500 Error with Aurora

  • The GitLab database is AWS Aurora
  • WITH at least one Aurora read replica configured
  • WITH GitLab database load balancing enabled (either forced on or on by default)
  • WITH either of the following:
    • A GitLab 13.x EE install with a license installed
    • Any GitLab 14.x EE install (because a license is no longer needed and database load balancing defaults to on)
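
The conditions above combine as "all of the first three, plus one of the two version cases". A minimal sketch of that logic as a predicate follows; the method and parameter names are hypothetical and not part of any GitLab API, and versions other than 13.x and 14.x are treated as not covered because the list does not mention them.

```ruby
# Hypothetical helper: encodes the conditions listed above as a predicate.
# Method and parameter names are illustrative, not part of any GitLab API.
def aurora_load_balancing_issue?(aurora:, read_replicas:, load_balancing_enabled:,
                                 major_version:, enterprise_edition:, licensed:)
  # All three infrastructure conditions must hold.
  return false unless aurora && read_replicas >= 1 && load_balancing_enabled
  # Both version cases in the list are EE installs.
  return false unless enterprise_edition

  case major_version
  when 13 then licensed ? true : false  # 13.x EE: only with a license installed
  when 14 then true                     # 14.x EE: any install
  else false                            # not covered by the list above
  end
end
```

For example, a 14.x EE install on Aurora with one read replica and load balancing enabled matches, whether or not a license is installed.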

Workarounds

  • Configure Aurora with a single instance (no read replicas)
    • Be sure to account for both read and write load when deciding on the Aurora primary instance type
  • Disable GitLab Read Replica Load Balancing
    • Be sure to account for both read and write load when deciding on the Aurora primary instance type
    • For GitLab Omnibus, remove the gitlab_rails['db_load_balancing'] configuration from gitlab.rb
    • For GitLab Cloud Native Hybrid, remove the following load_balancing: section from your Kubernetes configuration and reapply it. Here is an example:
          global:
            psql:
              load_balancing: # omit this section, or remove it when updating existing clusters
                hosts:
                  - your.aurora.endpoint.url
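
For the GitLab Omnibus workaround above, the setting to remove from gitlab.rb usually has the shape sketched below. The host value is a placeholder taken from the Kubernetes example, and the exact keys in a given install may differ, so treat this as an illustration rather than a copy of any real configuration. The setting is shown commented out, which is the state after the workaround is applied.

```ruby
# /etc/gitlab/gitlab.rb
#
# Remove (or comment out) a load balancing setting of this shape.
# The host below is a placeholder, not a real endpoint.
#
# gitlab_rails['db_load_balancing'] = { 'hosts' => ['your.aurora.endpoint.url'] }
#
# Then apply the change with: sudo gitlab-ctl reconfigure
```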

Non-Aurora AWS RDS PostgreSQL does not have this issue, so it is the supported RDS option on AWS at this time, especially for large-scale instances where database load balancing may affect cost engineering.

You can subscribe to this issue to follow progress and resolution status.

If there are additional related issues, they will be added to the Linked issues section below.

Edited by DarwinJS