Skip to content

MonitorLockedTablesWorker to automatically lock tables

Omar Qunsul requested to merge 413656-automatically-locking-unlocked-tables into master

What does this MR do and why?

As part of the database decomposition project we made a copy of the database on a new cluster. We used the first database cluster, called main for all the database tables that belong to the gitlab_main gitlab schema. And used the new database cluster ci for all the tables that belong to the gitlab_schema gitlab_ci. While the gitlab_shared tables are still used on both clusters

After the decomposition we use gitlab_main and gitlab_shared tables on the database cluster main. And we use the gitlab_ci and gitlab_shared tables on the database cluster ci. But the database schema (not to be confused with gitlab_schema) is still the same on both database clusters. We are keeping it now like this for simplicity. That means we still have some legacy tables, that we have been truncated, that the application in theory still can write/read from, but it should not. These tables should be ignored. We decided to lock_writes to them, to detect with errors early on that the application is writing to the wrong database.

For example, the application should not write to the users table on the ci database, even though it exists. And the same applies to the ci_builds table on the main database. We should get an error instead.

We introduced automatic locking of writes on newly introduced tables, via database migrations, in the MR, but due to some issues, some tables were not locked for writes as they are supposed to be. We solved most of these scenarios in the MR

We introduced monitoring on locked tables in the MR and that's how we knew about these legacy tables that are not locked for writes. In this MR we are extending the scope of this Monitoring functionality to lock these tables that are found not be locked already.

Addressing: #413656 (closed) Changelog: fixed

How to set up and validate locally

  1. Make sure you are running with two databases main and ci. Your config/database.yml should look like this with main and ci pointing to two different databases. That's the default setup of GDK.
development:
  main: &development_main
    adapter: postgresql
    encoding: unicode
    database: gitlabhq_development
    database_tasks: true
  ci:
    adapter: postgresql
    encoding: unicode
    database: gitlabhq_development_ci
    database_tasks: true
  1. Make sure Sidekiq is working and running the newest code on this branch
gdk restart rails-background-jobs
  1. Lock all the legacy tables for writes by running the rake task:
rake gitlab:db:lock_writes
  1. Unlock writes on 2 tables on main and 2 tables on ci by running this script in Rails Console:
%w[users projects].each do |table_name|
  Gitlab::Database::LockWritesManager.new(
    table_name: table_name,
    connection: Ci::ApplicationRecord.connection,
    database_name: 'ci',
    with_retries: true
  ).unlock_writes
end

%w[ci_pipelines ci_builds].each do |table_name|
  Gitlab::Database::LockWritesManager.new(
    table_name: table_name,
    connection: ApplicationRecord.connection,
    database_name: 'main',
    with_retries: true
  ).unlock_writes
end
  1. Make sure we can delete from the unlocked tables. Please double check you are running on two databases, otherwise this might cause deletion of data.
gdk psql -d gitlabhq_development -c 'delete from ci_pipelines'
gdk psql -d gitlabhq_development -c 'delete from ci_builds'
gdk psql -d gitlabhq_development_ci -c 'delete from users'
gdk psql -d gitlabhq_development_ci -c 'delete from projects'
  1. Using the Rails console, enable the two feature flags, and run the `MonitorLockedTablesWorker
Feature.enable(:monitor_database_locked_tables)
Feature.enable(:lock_tables_in_monitoring)
Database::MonitorLockedTablesWorker.perform_async
  1. After waiting for a around 10 seconds, try to run the DELETE FROM statements in step (5). Now you should get errors like this, which is the expected outcome.
ERROR:  Table: "<TABLE_NAME>" is write protected within this Gitlab database.
HINT:  Make sure you are using the right database connection
CONTEXT:  PL/pgSQL function gitlab_schema_prevent_write() line 4 at RAISE

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #413656 (closed)

Edited by Omar Qunsul

Merge request reports