Skip to content
Snippets Groups Projects
Verified Commit b3c57cd1 authored by Douglas Barbosa Alexandre's avatar Douglas Barbosa Alexandre
Browse files

Introduce `gitlab_internal` for Rails and pg tables

The `gitlab_shared` includes application and internals.
This is problematic to support for other databases that
might be missing some of those features (Geo), but are
valid Rails databases.

Introduce `gitlab_internal` to support such.

Changelog: added
parent e2f4df7b
No related branches found
No related tags found
1 merge request!88704Introduce `gitlab_internal` for Rails and pg tables
Showing
with 72 additions and 38 deletions
......@@ -23,7 +23,8 @@ Each table of GitLab needs to have a `gitlab_schema` assigned:
- `gitlab_main`: describes all tables that are being stored in the `main:` database (for example, like `projects`, `users`).
- `gitlab_ci`: describes all CI tables that are being stored in the `ci:` database (for example, `ci_pipelines`, `ci_builds`).
- `gitlab_shared`: describe all application tables that contain data across all decomposed databases (for example, `loose_foreign_keys_deleted_records`).
- `gitlab_shared`: describe all application tables that contain data across all decomposed databases (for example, `loose_foreign_keys_deleted_records`) for models that inherit from `Gitlab::Database::SharedModel`.
- `gitlab_internal`: describe all internal tables of Rails and PostgreSQL (for example, `ar_internal_metadata`, `schema_migrations`, `pg_*`).
- `...`: more schemas to be introduced with additional decomposed databases
The usage of schema enforces the base class to be used:
......@@ -44,10 +45,8 @@ This is used as a primary source of classification for:
### The special purpose of `gitlab_shared`
`gitlab_shared` is a special case describing tables or views that by design contain data across
all decomposed databases. This does describe application-defined tables (like `loose_foreign_keys_deleted_records`),
Rails-defined tables (like `schema_migrations` or `ar_internal_metadata` as well as internal PostgreSQL tables
(for example, `pg_attribute`).
`gitlab_shared` is a special case that describes tables or views that, by design, contain data across
all decomposed databases. This classification describes application-defined tables (like `loose_foreign_keys_deleted_records`).
**Be careful** to use `gitlab_shared` as it requires special handling while accessing data.
Since `gitlab_shared` shares not only structure but also data, the application needs to be written in a way
......@@ -62,6 +61,11 @@ end
As such, migrations modifying data of `gitlab_shared` tables are expected to run across
all decomposed databases.
### The special purpose of `gitlab_internal`
`gitlab_internal` describes Rails-defined tables (like `schema_migrations` or `ar_internal_metadata`), as well as internal PostgreSQL tables (for example, `pg_attribute`). Its primary purpose is to [support other databases](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/85842#note_943453682), like Geo, that
might be missing some of those application-defined `gitlab_shared` tables (like `loose_foreign_keys_deleted_records`), but are valid Rails databases.
## Migrations
Read [Migrations for Multiple Databases](migrations_for_multiple_databases.md).
......
......@@ -68,7 +68,8 @@ def self.schemas_to_base_models
@schemas_to_base_models ||= {
gitlab_main: [self.database_base_models.fetch(:main)],
gitlab_ci: [self.database_base_models[:ci] || self.database_base_models.fetch(:main)], # use CI or fallback to main
gitlab_shared: self.database_base_models.values # all models
gitlab_shared: self.database_base_models.values, # all models
gitlab_internal: self.database_base_models.values # all models
}.with_indifferent_access.freeze
end
......
......@@ -75,8 +75,8 @@ def self.table_schema(name)
return gitlab_schema
end
# All tables from `information_schema.` are `:gitlab_shared`
return :gitlab_shared if schema_name == 'information_schema'
# All tables from `information_schema.` are marked as `internal`
return :gitlab_internal if schema_name == 'information_schema'
return :gitlab_main if table_name.start_with?('_test_gitlab_main_')
......@@ -85,8 +85,8 @@ def self.table_schema(name)
# All tables that start with `_test_` without a following schema are shared and ignored
return :gitlab_shared if table_name.start_with?('_test_')
# All `pg_` tables are marked as `shared`
return :gitlab_shared if table_name.start_with?('pg_')
# All `pg_` tables are marked as `internal`
return :gitlab_internal if table_name.start_with?('pg_')
# When undefined it's best to return a unique name so that we don't incorrectly assume that 2 undefined schemas belong on the same database
:"undefined_#{table_name}"
......
......@@ -35,7 +35,7 @@ approval_project_rules_users: :gitlab_main
approvals: :gitlab_main
approver_groups: :gitlab_main
approvers: :gitlab_main
ar_internal_metadata: :gitlab_shared
ar_internal_metadata: :gitlab_internal
atlassian_identities: :gitlab_main
audit_events_external_audit_event_destinations: :gitlab_main
audit_events: :gitlab_main
......@@ -467,7 +467,7 @@ routes: :gitlab_main
saml_group_links: :gitlab_main
saml_providers: :gitlab_main
saved_replies: :gitlab_main
schema_migrations: :gitlab_shared
schema_migrations: :gitlab_internal
scim_identities: :gitlab_main
scim_oauth_access_tokens: :gitlab_main
security_findings: :gitlab_main
......
......@@ -9,7 +9,7 @@ class RestrictAllowedSchemas < Base
DMLNotAllowedError = Class.new(UnsupportedSchemaError)
DMLAccessDeniedError = Class.new(UnsupportedSchemaError)
IGNORED_SCHEMAS = %i[gitlab_shared].freeze
IGNORED_SCHEMAS = %i[gitlab_shared gitlab_internal].freeze
class << self
def enabled?
......
......@@ -20,6 +20,15 @@ def using_connection(connection)
"to '#{Gitlab::Database.db_config_name(connection)}'"
end
# connection might not be yet adopted (returning nil, and no gitlab_schemas)
# in such cases it is fine to ignore such connections
gitlab_schemas = Gitlab::Database.gitlab_schemas_for_connection(connection)
unless gitlab_schemas.nil? || gitlab_schemas.include?(:gitlab_shared)
raise "Cannot set `SharedModel` to connection from `#{Gitlab::Database.db_config_name(connection)}` " \
"since this connection does not include `:gitlab_shared` schema."
end
self.overriding_connection = connection
yield
......
......@@ -348,7 +348,13 @@ namespace :gitlab do
Rake::Task['db:drop'].invoke
Rake::Task['db:create'].invoke
ActiveRecord::Base.configurations.configs_for(env_name: ActiveRecord::Tasks::DatabaseTasks.env).each do |db_config|
ActiveRecord::Base.establish_connection(db_config.configuration_hash.merge(username: username)) # rubocop: disable Database/EstablishConnection
config = ActiveRecord::DatabaseConfigurations::HashConfig.new(
db_config.env_name,
db_config.name,
db_config.configuration_hash.merge(username: username)
)
ActiveRecord::Base.establish_connection(config) # rubocop: disable Database/EstablishConnection
Gitlab::Database.check_for_non_superuser
Rake::Task['db:migrate'].invoke
end
......
......@@ -61,7 +61,11 @@
context 'when shared connections are not included' do
it 'only yields the unshared connections' do
expect(Gitlab::Database).to receive(:db_config_share_with).twice.and_return(nil, 'main')
if Gitlab::Database.has_config?(:ci)
expect(Gitlab::Database).to receive(:db_config_share_with).exactly(3).times.and_return(nil, 'main', 'main')
else
expect(Gitlab::Database).to receive(:db_config_share_with).twice.and_return(nil, 'main')
end
expect { |b| described_class.each_database_connection(include_shared: false, &b) }
.to yield_successive_args([ActiveRecord::Base.connection, 'main'])
......
......@@ -7,7 +7,7 @@
it 'all tables have assigned a known gitlab_schema' do
is_expected.to all(
match([be_a(String), be_in([:gitlab_shared, :gitlab_main, :gitlab_ci])])
match([be_a(String), be_in([:gitlab_internal, :gitlab_shared, :gitlab_main, :gitlab_ci])])
)
end
......@@ -42,12 +42,12 @@
where(:name, :classification) do
'ci_builds' | :gitlab_ci
'my_schema.ci_builds' | :gitlab_ci
'information_schema.columns' | :gitlab_shared
'information_schema.columns' | :gitlab_internal
'audit_events_part_5fc467ac26' | :gitlab_main
'_test_gitlab_main_table' | :gitlab_main
'_test_gitlab_ci_table' | :gitlab_ci
'_test_my_table' | :gitlab_shared
'pg_attribute' | :gitlab_shared
'pg_attribute' | :gitlab_internal
'my_other_table' | :undefined_my_other_table
end
......
......@@ -27,6 +27,19 @@
end
end
it 'raises an error if the connection does not include `:gitlab_shared` schema' do
allow(Gitlab::Database)
.to receive(:gitlab_schemas_for_connection)
.with(new_connection)
.and_return([:gitlab_main])
expect_original_connection_around do
expect do
described_class.using_connection(new_connection) {}
end.to raise_error(/Cannot set `SharedModel` to connection/)
end
end
context 'when multiple connection overrides are nested', :aggregate_failures do
let(:second_connection) { double('connection') }
......
......@@ -24,7 +24,7 @@
describe '.quiet' do
let(:database_base_models) do
{
main: ApplicationRecord,
main: ActiveRecord::Base,
ci: Ci::ApplicationRecord
}
end
......
......@@ -40,15 +40,17 @@
describe 'GET #index' do
let(:default_model) { ActiveRecord::Base }
let(:db_config) { instance_double(ActiveRecord::DatabaseConfigurations::HashConfig, name: 'fake_db') }
before do
allow(Gitlab::Database).to receive(:db_config_for_connection).and_return(db_config)
allow(Gitlab::Database).to receive(:database_base_models).and_return(base_models)
end
let!(:main_database_migration) { create(:batched_background_migration, :active) }
context 'when no database is provided' do
let(:base_models) { { 'fake_db' => default_model } }
let(:base_models) { { 'fake_db' => default_model }.with_indifferent_access }
before do
stub_const('Gitlab::Database::MAIN_DATABASE_NAME', 'fake_db')
......@@ -68,7 +70,7 @@
end
context 'when multiple database is enabled', :add_ci_connection do
let(:base_models) { { 'fake_db' => default_model, 'ci' => ci_model } }
let(:base_models) { { 'fake_db' => default_model, 'ci' => ci_model }.with_indifferent_access }
let(:ci_model) { Ci::ApplicationRecord }
context 'when CI database is provided' do
......
......@@ -42,7 +42,7 @@
end
context 'when multiple database is enabled', :add_ci_connection do
let(:base_models) { { 'fake_db' => default_model, 'ci' => ci_model } }
let(:base_models) { { 'main' => default_model, 'ci' => ci_model }.with_indifferent_access }
let(:ci_model) { Ci::ApplicationRecord }
before do
......
......@@ -112,7 +112,7 @@
let(:main_database_name) { Gitlab::Database::MAIN_DATABASE_NAME }
let(:model) { Gitlab::Database.database_base_models[main_database_name] }
let(:connection) { double(:connection) }
let(:base_models) { { 'main' => model } }
let(:base_models) { { 'main' => model }.with_indifferent_access }
around do |example|
Gitlab::Database::SharedModel.using_connection(model.connection) do
......
......@@ -822,18 +822,20 @@ def expect_objects_to_be_dropped(connection)
let(:connection_pool) { instance_double(ActiveRecord::ConnectionAdapters::ConnectionPool ) }
let(:connection) { instance_double(ActiveRecord::ConnectionAdapters::PostgreSQLAdapter) }
let(:configurations) { double(ActiveRecord::DatabaseConfigurations) }
let(:configuration) { instance_double(ActiveRecord::DatabaseConfigurations::HashConfig) }
let(:configuration) { instance_double(ActiveRecord::DatabaseConfigurations::HashConfig, env_name: 'test', name: 'main') }
let(:config_hash) { { username: 'foo' } }
it 'migrate as nonsuperuser check with default username' do
before do
allow(Rake::Task['db:drop']).to receive(:invoke)
allow(Rake::Task['db:create']).to receive(:invoke)
allow(ActiveRecord::Base).to receive(:configurations).and_return(configurations)
allow(configurations).to receive(:configs_for).and_return([configuration])
allow(configuration).to receive(:configuration_hash).and_return(config_hash)
allow(ActiveRecord::Base).to receive(:establish_connection).and_return(connection_pool)
end
expect(config_hash).to receive(:merge).with({ username: 'gitlab' })
it 'migrate as nonsuperuser check with default username' do
expect(config_hash).to receive(:merge).with({ username: 'gitlab' }).and_call_original
expect(Gitlab::Database).to receive(:check_for_non_superuser)
expect(Rake::Task['db:migrate']).to receive(:invoke)
......@@ -841,14 +843,7 @@ def expect_objects_to_be_dropped(connection)
end
it 'migrate as nonsuperuser check with specified username' do
allow(Rake::Task['db:drop']).to receive(:invoke)
allow(Rake::Task['db:create']).to receive(:invoke)
allow(ActiveRecord::Base).to receive(:configurations).and_return(configurations)
allow(configurations).to receive(:configs_for).and_return([configuration])
allow(configuration).to receive(:configuration_hash).and_return(config_hash)
allow(ActiveRecord::Base).to receive(:establish_connection).and_return(connection_pool)
expect(config_hash).to receive(:merge).with({ username: 'foo' })
expect(config_hash).to receive(:merge).with({ username: 'foo' }).and_call_original
expect(Gitlab::Database).to receive(:check_for_non_superuser)
expect(Rake::Task['db:migrate']).to receive(:invoke)
......
......@@ -157,10 +157,10 @@ def count_deletable_rows
describe 'multi-database support' do
where(:current_minute, :configured_base_models, :expected_connection_model) do
2 | { main: 'ApplicationRecord', ci: 'Ci::ApplicationRecord' } | 'ApplicationRecord'
3 | { main: 'ApplicationRecord', ci: 'Ci::ApplicationRecord' } | 'Ci::ApplicationRecord'
2 | { main: 'ApplicationRecord' } | 'ApplicationRecord'
3 | { main: 'ApplicationRecord' } | 'ApplicationRecord'
2 | { main: 'ActiveRecord::Base', ci: 'Ci::ApplicationRecord' } | 'ActiveRecord::Base'
3 | { main: 'ActiveRecord::Base', ci: 'Ci::ApplicationRecord' } | 'Ci::ApplicationRecord'
2 | { main: 'ActiveRecord::Base' } | 'ActiveRecord::Base'
3 | { main: 'ActiveRecord::Base' } | 'ActiveRecord::Base'
end
with_them do
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment