Skip to content

Extending the scope of gitlab_schemas to be locked on multiple dbs

Omar Qunsul requested to merge 413543-locking-tables-other-gitlab-schemas into master

What does this MR do and why?

Extending the scope of gitlab_schemas to be locked on multiple dbs

Addressing: #413543 (closed) Changelog: added

More context about locking tables

As part of the database decomposition project we made a copy of the database on a new cluster. We used the first database cluster, called main for all the database tables that belong to the gitlab_main gitlab schema. And used the new database cluster ci for all the tables that belong to the gitlab_schema gitlab_ci. While the gitlab_shared tables are still used on both clusters

After the decomposition we use gitlab_main and gitlab_shared tables on the database cluster main. And we use the gitlab_ci and gitlab_shared tables on the database cluster ci. But the database schema (not to be confused with gitlab_schema) is still the same on both database clusters. We are keeping it now like this for simplicity. That means we still have some legacy tables, that we have been truncated recently, that the application in theory still can write/read from, but it should not. These tables should be ignored. We decided to lock_writes to them, to detect with errors early on that the application is writing to the wrong database.

For example, the application should not write to the users table on the ci database, even though it exists. And the same applies to the ci_builds table on the main database. We should get an error instead.

As part of the MR we introduced automatic locking on newly created tables. So that they are locked for writes upon creation on the database where they are not supposed to be written to or read from.

This MR extends the scope of locking writes into more gitlab_schemas that have been introduced recently, especially in the context of work on the cells project.

How to set up and validate locally

Testing on Multiple Databases

  • Make sure you are running locally with decomposed database mode (main, ci). Where the ci database is not shared with main. Your config/database.yml should look like the following:
development:
  main: &development_main
    adapter: postgresql
    encoding: unicode
    database: gitlabhq_development
    host: /Users/omar/gdk/postgresql
  ci:
    adapter: postgresql
    encoding: unicode
    database: gitlabhq_development_ci # This has to be a different database
    database_tasks: true # This should not be false. But it can be omitted. The default is true
  • Create a database migration that looks like this one
    create_table(:_test_gitlab_main_table) do |t|
      t.string :name
    end

    create_table(:_test_gitlab_ci_table) do |t|
      t.string :name
    end

    create_table(:_test_gitlab_main_cell_table) do |t|
      t.string :name
    end

    create_table(:_test_gitlab_main_clusterwide_table) do |t|
      t.string :name
    end

    create_table(:_test_gitlab_pm_table) do |t|
      t.string :name
    end
  • Run rails db:migrate. You should see these lines in the output:
I, [2023-07-11T16:10:00.262434 #66599]  INFO -- : Database: 'main', Table: '_test_gitlab_ci_table': Lock Writes
....
I, [2023-07-11T16:10:00.359357 #66599]  INFO -- : Database: 'ci', Table: '_test_gitlab_main_clusterwide_table': Lock Writes
I, [2023-07-11T16:10:00.360364 #66599]  INFO -- : Database: 'ci', Table: '_test_gitlab_pm_table': Lock Writes
I, [2023-07-11T16:10:00.360860 #66599]  INFO -- : Database: 'ci', Table: '_test_gitlab_main_table': Lock Writes
I, [2023-07-11T16:10:00.361421 #66599]  INFO -- : Database: 'ci', Table: '_test_gitlab_main_cell_table': Lock Writes

Which means that these tables have been locked for writes on the corresponding databases where they are not used. You can validate by connecting to these databases and run DELETE FROM <TABLE_NAME>. For example, running this on the ci database you should see:

gitlabhq_development_ci=# delete from _test_gitlab_pm_table;
ERROR:  Table: "_test_gitlab_pm_table" is write protected within this Gitlab database.
HINT:  Make sure you are using the right database connection
CONTEXT:  PL/pgSQL function gitlab_schema_prevent_write() line 4 at RAISE

But this table should not be locked on the gitlabhq_development (main) database:

gitlabhq_development=# delete from _test_gitlab_pm_table;
DELETE 0

Testing on Single Database with two connections

  • Make sure you are running on single database two connections mode (which is the default for self-managed). Your config/database.yml should look like this
development:
  main: &development_main
    adapter: postgresql
    encoding: unicode
    database: gitlabhq_development
    host: /Users/omar/gdk/postgresql
  ci:
    adapter: postgresql
    encoding: unicode
    database: gitlabhq_development # This has to be the same as the main database
    database_tasks: false # This should be false, to indicate that we are sharing the same database with the main, but with a different connection
  • Create a database migration that looks like the one in the previous section.

  • Run the migration. No tables should be locked on the main database. Especially _test_gitlab_ci_table. You can validate by connecting to the database and run the following:

gitlabhq_development=# delete from _test_gitlab_ci_table;
DELETE 0

No error should be raised

Clean up

  • Make sure that you rollback the migrations after you finish testing

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #413543 (closed)

Edited by Omar Qunsul

Merge request reports