Skip to content

Geo: adds database load balancing support to work on standby clusters

Gabriel Mazetto requested to merge 283971-fix-geo-database-load-balancer into master

What does this MR do?

Atempt 1

It adds a fallback query for the database load-balancer logic to work correctly with Standby Leaders.

The other alternative solution requires querying GeoNode for information (either as part of Gitlab::Database.read_only?). A third solution would require us to keep a configuration flag up to date in config/gitlab.yml.

I think current approach sounds like a good compromise between the Gitlab::Database.read_only?, which would require an extra query anyways and having to add another configuration flag.

Console example

[1] pry(main)> lb = Gitlab::Database::LoadBalancing::LoadBalancer.new(Gitlab::Database::LoadBalancing.hosts)
=> #<Gitlab::Database::LoadBalancing::LoadBalancer:0x00007fcf05632ed8
 @host_list=
  #<Gitlab::Database::LoadBalancing::HostList:0x00007fced7816b88
   @hosts=[],
   @hosts_gauge=#<Gitlab::Metrics::NullMetric:0x00007fced7815058>,
   @index=0,
   @mutex=#<Thread::Mutex:0x00007fced7816b38>>>
[2] pry(main)> lb.primary_write_location
   (1.2ms)  SELECT pg_current_wal_insert_lsn()::text AS location
   (0.2ms)  SELECT pg_last_wal_replay_lsn()::text AS location
=> "0/C73A0D88"

Attempt 2 (current)

I've removed the downside part of the previous attempt above. It now uses a single query with a case statement, and requires no exception catch and handling.

Console example

[1] pry(main)> lb = Gitlab::Database::LoadBalancing::LoadBalancer.new(Gitlab::Database::LoadBalancing.hosts)

=> #<Gitlab::Database::LoadBalancing::LoadBalancer:0x00007f9cbd125e38
 @host_list=
  #<Gitlab::Database::LoadBalancing::HostList:0x00007f9cbd124d80
   @hosts=
    [#<Gitlab::Database::LoadBalancing::Host:0x00007f9cbd125de8
      @host="localhost",
      @intervals=
       [60.0,
        60.5,...
  
[2] pry(main)> lb.primary_write_location
   (0.7ms)  SELECT CASE WHEN pg_is_in_recovery() = true THEN pg_last_wal_replay_lsn()::text ELSE pg_current_wal_insert_lsn()::text END AS location;
=> "0/C74C1BD0"

The example above should work in both geo primary and secondary (in a standby cluster) with a load balancer.

The query formatted itself is the following:

SELECT CASE
           WHEN PG_IS_IN_RECOVERY() = TRUE 
               THEN pg_last_wal_replay_lsn()::text
           ELSE 
               pg_current_wal_insert_lsn()::text 
    END AS location;

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

To test this you need to follow https://docs.gitlab.com/ee/administration/database_load_balancing.html to enable the load balancing support, try to navigate to a few pages and watch log/database_load_balancing.log.

There is no way to "mock" the database to be in recovery mode, as we are relying exclusively on the SQL query here for performance reasons.

Here is what I used in config/database.yml test on a local machine:

development:
  adapter: postgresql
  encoding: unicode
  database: gitlabhq_development
  host: 127.0.0.1
  port: 5434
  pool: 10
  prepared_statements: false
  variables:
    statement_timeout: 120s
  load_balancing:
    hosts:
      - localhost

And this is what I saw in log/database_load_balancing.log:

{"severity":"INFO","time":"2021-02-02T22:18:25.415Z","correlation_id":"01EXJD6JXKA4XSS5SQVC2V19SH","event":"host_online","message":"Host is online after replica status check","db_host":"localhost","db_port":null}
{"severity":"INFO","time":"2021-02-02T22:18:36.351Z","correlation_id":"01EXJD6XHY8NWGR8P1526SYKSS","event":"host_online","message":"Host is online after replica status check","db_host":"localhost","db_port":null}
{"severity":"INFO","time":"2021-02-02T22:23:17.932Z","correlation_id":"01EXJDFCMH6PVVYY614XB35R2M","event":"host_online","message":"Host is online after replica status check","db_host":"localhost","db_port":null}
{"severity":"INFO","time":"2021-02-02T22:23:24.768Z","correlation_id":"01EXJDFKDNXN4JJJKTCCSP34E8","event":"host_online","message":"Host is online after replica status check","db_host":"localhost","db_port":null}

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team

Closes #283971 (closed)

Edited by Gabriel Mazetto

Merge request reports