Zero downtime database switch PoC
What does this MR do?
This MR demonstrates that a zero-downtime database switch is possible. Note: it's hacky!
Configuration
- CiBase is an abstract AR class, defines the two databases (shards)
  - primary is the current database
  - ci is the new database, data from primary should be moved here.
  - Note: ci shard is not used in the application code at all.
- CiTestTable is a model for the ci_test_table
Database configuration (config/database.yml):
development:
  primary:
    STANDARD_DEV_GDK_DB_CONFIG_COMES_HERE
  ci:
    adapter: postgresql
    encoding: unicode
    database: gitlabhq_ci_development
    user: postgres
    port: 5432
    pool: 20
    prepared_statements: false
The new database needs to be created by hand:
create database gitlabhq_ci_development;
Test script
If the new database is present and the configuration is in place, we can invoke the PoC script:
RAILS_PROFILE=true rails runner table_migration_test.rb
Note: RAILS_PROFILE is important; the script doesn't work with code reloading.
What the script does:
- Create the DB table in both databases (primary, ci)
- Insert 10 rows; these go to primary (the default)
- In an infinite loop, run SELECT queries
- Get an EXCLUSIVE lock. This still allows reads, but writes are blocked
- Copy the table data from primary to the ci DB
- Fix the sequence for the pkey
- Install a trigger on the primary DB to prevent table modification
- Release the lock
- Insert 10 records again. These will fail because of the trigger. Catch the error and reconnect to the ci DB.
- The retry will insert the data into the ci DB
- Verify the data
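The lock, sequence, and trigger steps above could map to PostgreSQL statements along these lines. This is a minimal sketch, not the PoC script itself: CutoverSql is a hypothetical helper that only builds SQL strings, the primary key column is assumed to be named id, and the function and trigger names are made up for illustration. EXECUTE FUNCTION assumes PostgreSQL 11 or later.

```ruby
# Hypothetical helper: builds SQL strings only, it never connects to a database.
module CutoverSql
  TABLE = 'ci_test_table'

  # EXCLUSIVE mode conflicts with every lock mode except ACCESS SHARE,
  # so plain SELECTs keep working while writes wait.
  # Must run inside a transaction; the lock is released at COMMIT or ROLLBACK.
  def self.lock_table
    "LOCK TABLE #{TABLE} IN EXCLUSIVE MODE"
  end

  # Run on the ci database after the copy, so the next generated key
  # is one past the highest copied key (or 1 for an empty table).
  def self.fix_sequence(column = 'id')
    <<~SQL
      SELECT setval(
        pg_get_serial_sequence('#{TABLE}', '#{column}'),
        COALESCE((SELECT MAX(#{column}) FROM #{TABLE}), 0) + 1,
        false
      )
    SQL
  end

  # Run on the primary database: any INSERT, UPDATE, or DELETE on the
  # table raises an error instead of modifying data.
  def self.block_writes
    <<~SQL
      CREATE OR REPLACE FUNCTION block_#{TABLE}_writes() RETURNS trigger AS $$
      BEGIN
        RAISE EXCEPTION 'writes to % are blocked, use the ci database', TG_TABLE_NAME;
      END;
      $$ LANGUAGE plpgsql;

      CREATE TRIGGER #{TABLE}_block_writes
        BEFORE INSERT OR UPDATE OR DELETE ON #{TABLE}
        FOR EACH STATEMENT EXECUTE FUNCTION block_#{TABLE}_writes();
    SQL
  end
end
```

A statement-level trigger is used here because the goal is to reject the whole statement; the error it raises is what the application would catch before reconnecting to ci.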
Problem: the infinite loop that reads rows from the primary DB will never switch to the new DB, because the trigger only prevents writes. We'll probably need to revoke permissions or rename the table so that these statements also fail and can be retried on the new connection.
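One way to picture the retry idea for both reads and writes is the in-memory sketch below. It does not touch PostgreSQL or ActiveRecord: FakeDatabase, TableUnavailable, and SwitchingClient are hypothetical stand-ins, and retire! plays the role of renaming the table (or revoking access) so that every statement against primary fails, not only writes.

```ruby
# Raised by the stand-in database once the table has been retired,
# mimicking the error any statement would hit after a rename.
class TableUnavailable < StandardError; end

class FakeDatabase
  attr_reader :name, :rows

  def initialize(name, rows = [])
    @name = name
    @rows = rows.dup
    @retired = false
  end

  # Stand-in for "rename the table / revoke permissions".
  def retire!
    @retired = true
  end

  def select_all
    check!
    @rows.dup
  end

  def insert(row)
    check!
    @rows << row
  end

  private

  def check!
    raise TableUnavailable, "table is gone on #{@name}" if @retired
  end
end

# Routes statements to the old database until one fails,
# then switches to the new database and retries the statement once.
class SwitchingClient
  attr_reader :current

  def initialize(old_db, new_db)
    @current = old_db
    @new_db = new_db
  end

  def select_all
    with_switch { @current.select_all }
  end

  def insert(row)
    with_switch { @current.insert(row) }
  end

  private

  def with_switch
    yield
  rescue TableUnavailable
    raise if @current.equal?(@new_db) # already switched: give up
    @current = @new_db
    retry
  end
end

primary = FakeDatabase.new('primary', (1..10).to_a)
ci = FakeDatabase.new('ci', primary.rows) # the "copy" step
client = SwitchingClient.new(primary, ci)

client.select_all  # served by primary
primary.retire!
client.select_all  # fails on primary, retried on ci
client.insert(11)  # goes straight to ci
puts client.current.name
```

With only a write-blocking trigger, the first select_all after the switch would keep succeeding on primary; making reads fail as well is what lets the same rescue-and-retry path cover them.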
Does this MR meet the acceptance criteria?
Conformity
- I have included changelog trailers, or none are needed. (Does this MR need a changelog?)
- I have added/updated documentation, or it's not needed. (Is documentation required?)
- I have properly separated EE content from FOSS, or this MR is FOSS only. (Where should EE code go?)
- I have added information for database reviewers in the MR description, or it's not needed. (Does this MR have database related changes?)
- I have self-reviewed this MR per code review guidelines.
- This MR does not harm performance, or I have asked a reviewer to help assess the performance impact. (Merge request performance guidelines)
- I have followed the style guides.
- This change is backwards compatible across updates, or this does not apply.
Availability and Testing
- I have added/updated tests following the Testing Guide, or it's not needed. (Consider all test levels. See the Test Planning Process.)
- I have tested this MR in all supported browsers, or it's not needed.
- I have informed the Infrastructure department of a default or new setting change per definition of done, or it's not needed.
Security
Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.
- Label as security and @ mention @gitlab-com/gl-security/appsec
- The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
- Security reports checked/validated by a reviewer from the AppSec team