Skip to content

Introducing CrossDatabaseIgnoredTables helper

Omar Qunsul requested to merge 409836-cross-database-ignored-tables into master

What does this MR do and why?

Introducing CrossDatabaseIgnoredTables helper.

This helper will allow us to allow and identify some cross database modifications between different gitlab_schema tables, until we resolve them later.

Addressing: #409836 (closed)

More context about the problem

As part of the cells project, we are going to decompose the database further into databases, and divide some gitlab_schemas like gitlab_main into more further gitlab_schemas gitlab_main_clusterwide and gitlab_main_cell to prepare gitlab_main tables to be split into different databases in the future. That would look something like this:

flowchart TB
    subgraph Cell 1
      subgraph Main Database
        A["(gitlab_main_clusterwide) tables. For example: users, application_settings, organizations"]
        B["(gitlab_main_cells) tables. For example: namespaces, projects"]
      end
    end
    subgraph Cell 2
      subgraph Main Database
        B2["(gitlab_main_cells) tables. For example: namespaces, projects"]
      end
    end

    A -- We should remove all cross database modifications --- B
    A -- We should remove all cross database modifications --- B2

The problem with splitting database tables into different databases is that Database Transactions that used to guarantee atomicity on one database, don't work anymore across different database. Which means that we should not modify tables that belong to different gitlab schemas within single transactions. So something like this is not allowed anymore

begin transaction1 that modifies gitlab_main_clusterwide
begin transaction2 that modifies gitlab_main_cell
.....
commit transaction2
commit transaction1

What does the helper that is introduced in this MR do?

While we are splitting our tables from gitlab_main to gitlab_main_cell and gitlab_main_clusterwide we will find some cross database violations caused by associations that are modified upon creating/deleting/updating association records. Something in this format

class Model
  has_one :child_record # that belongs to a another gitlab_schema table

  before_create :build_child_record # it calls some methods to create this association so that it's created along with the parent model
end

Which causes the PreventCrossDatabaseModification Database Query Analyzer to raise an error in dev/testing environment.

The helper introduced in this MR helps us to identify those association based cross database modification, and temporarily allow them while pointing to an issue that will help us to resolve them later. The helper can be used in this way

class Model
  include CrossDatabaseIgnoredTables
  has_one :child_record # that belongs to a another gitlab_schema table

  before_create :build_child_record # it calls some methods to create this association so that it's created along with the parent model
  cross_database_ignore_tables "<child_model_table_name>", on: %I(create), url: "a link to the issue to resolve this cross database violations"
end

This helper makes use of the temporary_ignore_tables_in_transaction method that is provided by the same query analyzer.

How to test this helper locally

  1. Check whether the Query Analyzer is enabled in your local environment
Feature.enabled?(:detect_cross_database_modification, type: :ops)

If it's disabled, please enable it Feature.enable(:detect_cross_database_modification)

  1. Create a Rails script file, for example ~/test_cross_database_modification.rb that contains this script
Click to expand
Ci::ApplicationRecord.connection.execute(
    'CREATE TABLE IF NOT EXISTS _test_gitlab_ci_items(
      id BIGSERIAL PRIMARY KEY, main_model_id INTEGER, updated_at timestamp without time zone
    )'
)
ApplicationRecord.connection.execute(
    'CREATE TABLE IF NOT EXISTS _test_gitlab_main_items(
      id BIGSERIAL PRIMARY KEY, name character varying NOT NULL, updated_at timestamp without time zone
    )'
)

## Rails snippet

class CiModel < Ci::ApplicationRecord
  self.table_name = '_test_gitlab_ci_items'

  belongs_to :main_model_object, class_name: 'MainModel',
    inverse_of: 'ci_model_object', foreign_key: 'main_model_id'

  before_save :save_ci

  def save_ci
    puts "Save CI"
  end
end

class MainModel < ApplicationRecord
  include CrossDatabaseIgnoredTables
  self.table_name = '_test_gitlab_main_items'
      
  has_one :ci_model_object, autosave: true, class_name: 'CiModel',
      inverse_of: 'main_model_object', foreign_key: 'main_model_id',
      dependent: :nullify, touch: true

  before_create :prepare_ci_model_object
  before_commit :commit_main

  def commit_main
    puts "Commit Main"
  end

  def prepare_ci_model_object
      build_ci_model_object
  end

  # cross_database_ignore_tables CiModel.table_name, on: %I(create update), url: "TODO"
end

instance = Gitlab::Database::QueryAnalyzer.instance
Gitlab::Database::QueryAnalyzer.instance.begin!(instance.all_analyzers)
puts Gitlab::Database::QueryAnalyzer.instance.send(:enabled_analyzers).inspect 

puts CiModel.count
puts MainModel.count

# Testing Create
object = MainModel.new(name: 'test')
object.save

puts CiModel.count
puts MainModel.count

# Testing Update
object.update(updated_at: Time.now)
  1. Run this script using the command rails runner ~/test_cross_database_modification.rb. You should get the cross database modification error. Something like this:
/Users/omar/gdk/gitlab/lib/gitlab/database/query_analyzers/prevent_cross_database_modification.rb:124:in `analyze': Cross-database data modification of 'gitlab_main, gitlab_ci' were detected within a transaction modifying the '_test_gitlab_main_items, _test_gitlab_ci_items' tables.  (Gitlab::Database::QueryAnalyzers::PreventCrossDatabaseModification::CrossDatabaseModificationAcrossUnsupportedTablesError)

Please refer to https://docs.gitlab.com/ee/development/database/multiple_databases.html#removing-cross-database-transactions for details on how to resolve this exception.

0: INSERT INTO "_test_gitlab_main_items" ("name", "updated_at") VALUES ($1, $2) RETURNING "id" /*application:web,db_config_name:main,line:/Users/omar/test_cross_database_modification.rb:58:in `<main>'*/

1: INSERT INTO "_test_gitlab_ci_items" ("main_model_id", "updated_at") VALUES ($1, $2) RETURNING "id" /*application:web,db_config_name:ci,line:/Users/omar/test_cross_database_modification.rb:58:in `<main>'*/
  1. Uncomment the line cross_database_ignore_tables CiModel.table_name, on: %I(create update), url: "TODO"

  2. Run the script again. It should work as expected now and give some output that indicates the number of the records in both tables.

  3. Clean up the test tables from the database:

gdk psql -d gitlabhq_development -c 'DROP TABLE _test_gitlab_main_items'
gdk psql -d gitlabhq_development_ci -c 'DROP TABLE _test_gitlab_ci_items'
  1. If the FF was disabled at the beginning (see step 1), please disable it again, or keep it enabled if you want
Feature.disable(:detect_cross_database_modification)

Important Notice

Don't test this with FactoryBot in the Rails Console. Because FactoryBot creates unnecessary transactions that are already ignored in test environment by this PreventCrossDatabaseModification Query Analyzer. See here: https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/database/query_analyzers/prevent_cross_database_modification.rb#L201-207

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #409836 (closed)

Edited by Omar Qunsul

Merge request reports