Wires `gitlab-database-data_isolation` into the Rails monolith.
What does this MR do and why?
Wires gitlab-database-data_isolation into the Rails monolith.
- `config/initializers/gitlab_database_data_isolation.rb`: builds `sharding_key_map` from `Gitlab::Database::Dictionary` entries for the `organizations` key, sets `strategy: :arel`, and resolves `current_sharding_key_value` from `Current.organization&.id`. Calls `DataIsolation.install!` to activate.
- `config/initializers/0_load_gitlab_database.rb`: explicitly requires `lib/gitlab/database.rb` before Bundler loads the gem. The gem opens the `Gitlab::Database` namespace during `Bundler.require`; without this, Zeitwerk sees a pre-defined constant and raises a conflict.
The logic is enabled only for the organizations table. In addition, the change is behind a feature flag and is active only for organizations that are marked as isolated, which do not exist yet.
Why these spec_helper changes are needed
The data_isolation feature flag (introduced in milestone 19.0) is evaluated on every
query against the organizations table. This happens because the data isolation AR extension
intercepts arel calls on that table and invokes Gitlab::Organizations::Isolation.enabled?,
which in turn calls Feature.enabled?(:data_isolation, ...).
Why logging triggers
Feature::Definition#for_upcoming_milestone? returns true when a flag's milestone is ahead of
the running GitLab version. Because the flag's milestone is 19.0 and the current version is
18.11, this method returns true for data_isolation. As a result, every call to Feature.enabled?(:data_isolation)
is recorded in RequestStore[:feature_flag_events] via Feature#log_feature_flag_state.
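The version comparison described above can be sketched as follows. This is a minimal illustration only: `upcoming_milestone?` is a hypothetical helper, not the actual `Feature::Definition#for_upcoming_milestone?` implementation, and it assumes milestones and versions compare as dotted version strings.

```ruby
require 'rubygems'

# Hypothetical stand-in for Feature::Definition#for_upcoming_milestone?.
# Assumption: a flag is "upcoming" when its milestone is strictly ahead of
# the running version, compared segment by segment.
def upcoming_milestone?(flag_milestone, current_version)
  return false if flag_milestone.nil?

  Gem::Version.new(flag_milestone) > Gem::Version.new(current_version)
end

# data_isolation has milestone 19.0 while the running version is 18.11,
# so checks of this flag would be logged.
upcoming_milestone?('19.0', '18.11')

# A flag from an already released milestone would not be logged.
# Note that 18.9 sorts before 18.11 because segments compare numerically.
upcoming_milestone?('18.9', '18.11')
```

Once the running version reaches 19.0, the comparison flips and the flag would stop being logged under this assumption.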
How this breaks tests
ExceptionLogFormatter (and ContextPayloadGenerator for Sentry) reads
Feature.logged_states_for_log when formatting an exception. If feature_flag_events is
non-empty, it appends exception.feature_flag_states: ["data_isolation:0"] to the payload.
Because RequestStore uses a plain thread-local (Thread.current[:request_store]), it persists
across the entire test run unless explicitly cleared. Factory setup code, such as let_it_be blocks
(which run in before(:all)) and let! (which runs as a per-example before hook, after
config.before), triggers organization queries, silently populating feature_flag_events
before the test body executes. Any spec that then asserts on an exact exception log payload fails
because of the unexpected exception.feature_flag_states key.
The fix
Two things are done in config.before, before each example:
- `RequestStore.delete(:feature_flag_events)`: clears state accumulated by `let_it_be` (or any other `before(:all)` setup) that ran before this hook.
- Stub `Feature.log_feature_flag_states?(:data_isolation)` to return `false`: prevents the flag from being recorded during the example itself, including inside `let!` hooks that run after `config.before`. The preceding `and_call_original` default ensures all other feature flags continue to log normally.
Together these ensure that data_isolation never appears in exception payloads during specs,
without affecting any other feature flag logging behaviour.
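The leak and the two-step fix can be reproduced in isolation with a small simulation. Everything below is a stand-in written for illustration: `FakeRequestStore`, `FakeFeature`, `exception_payload`, and the `silenced` list are hypothetical and only mimic the behaviour described above; they are not the GitLab classes or the actual spec_helper code.

```ruby
# Stand-in for RequestStore: a plain thread-local hash that is never
# cleared automatically.
module FakeRequestStore
  def self.store
    Thread.current[:fake_request_store] ||= {}
  end

  def self.delete(key)
    store.delete(key)
  end
end

# Stand-in for Feature: records each flag check unless logging has been
# switched off for that flag (the equivalent of the stub in the fix).
module FakeFeature
  @silenced = []

  class << self
    attr_reader :silenced

    def log_feature_flag_states?(flag)
      !silenced.include?(flag)
    end

    def enabled?(flag)
      if log_feature_flag_states?(flag)
        (FakeRequestStore.store[:feature_flag_events] ||= []) << "#{flag}:0"
      end
      false
    end
  end
end

# Stand-in for the exception log formatter: appends logged flag states
# to the payload only when some were recorded.
def exception_payload(message)
  payload = { 'exception.message' => message }
  events = FakeRequestStore.store[:feature_flag_events]
  payload['exception.feature_flag_states'] = events.uniq if events && !events.empty?
  payload
end

# Setup comparable to let_it_be queries organizations and leaks state.
FakeFeature.enabled?(:data_isolation)
leaky_payload = exception_payload('boom')

# Step 1: clear whatever accumulated before the hook ran.
FakeRequestStore.delete(:feature_flag_events)
# Step 2: stop recording this one flag for the rest of the example.
FakeFeature.silenced << :data_isolation

FakeFeature.enabled?(:data_isolation) # for example, a let! hook that runs later
FakeFeature.enabled?(:other_flag)     # other flags are still recorded
clean_payload = exception_payload('boom')
```

In this simulation `leaky_payload` carries the extra `exception.feature_flag_states` key with `data_isolation:0`, while `clean_payload` only lists the other flag. In the real spec_helper, step 2 is done with an RSpec stub rather than a list.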
Query plans
We change the way we retrieve the current organization: the lookup now needs to include the isolation state of the organization.
Production currently has few organizations (128) and no organization_isolations records.
I created some test organizations using these queries on Postgres.ai:
EXEC INSERT INTO organizations (id, created_at, updated_at, name, path, visibility_level)
SELECT
nextval('organizations_id_seq'),
NOW(),
NOW(),
'Seed Organization ' || i,
'seed-org-' || i,
0
FROM generate_series(1, 10000) AS s(i);
EXEC INSERT INTO organization_isolations (id, organization_id, created_at, updated_at, isolated)
SELECT
nextval('organization_isolations_id_seq'),
o.id,
NOW(),
NOW(),
random() < 0.2
FROM (
SELECT id FROM organizations
WHERE name LIKE 'Seed Organization %'
ORDER BY id DESC
LIMIT 10000
) o;
EXEC ANALYZE organizations, organization_isolations
By Organization
::Organizations::Organization.find_by_path_with_isolation('default')
Current query (master)
SELECT "organizations".* FROM "organizations" WHERE "organizations"."path" = 'default' LIMIT 1
New query (query plan)
SELECT "organizations"."id" AS t0_r0,
"organizations"."created_at" AS t0_r1,
"organizations"."updated_at" AS t0_r2,
"organizations"."name" AS t0_r3,
"organizations"."path" AS t0_r4,
"organizations"."visibility_level" AS t0_r5,
"organizations"."state" AS t0_r6,
"organization_isolations"."id" AS t1_r0,
"organization_isolations"."organization_id" AS t1_r1,
"organization_isolations"."created_at" AS t1_r2,
"organization_isolations"."updated_at" AS t1_r3,
"organization_isolations"."isolated" AS t1_r4
FROM "organizations"
LEFT OUTER JOIN "organization_isolations"
ON "organization_isolations"."organization_id" =
"organizations"."id"
WHERE "organizations"."path" = 'default'
LIMIT 1
By User
Current query (master)
SELECT "organizations".* FROM "organizations" WHERE "organizations"."id" = 1 LIMIT 1
New query (query plan)
SELECT "organizations"."id" AS t0_r0,
"organizations"."created_at" AS t0_r1,
"organizations"."updated_at" AS t0_r2,
"organizations"."name" AS t0_r3,
"organizations"."path" AS t0_r4,
"organizations"."visibility_level" AS t0_r5,
"organizations"."state" AS t0_r6,
"organization_isolations"."id" AS t1_r0,
"organization_isolations"."organization_id" AS t1_r1,
"organization_isolations"."created_at" AS t1_r2,
"organization_isolations"."updated_at" AS t1_r3,
"organization_isolations"."isolated" AS t1_r4
FROM "organizations"
LEFT OUTER JOIN "organization_isolations"
ON "organization_isolations"."organization_id" =
"organizations"."id"
WHERE "organizations"."id" = 1
LIMIT 1
Query plan:
Limit (cost=0.57..6.61 rows=1 width=96)
-> Nested Loop Left Join (cost=0.57..6.61 rows=1 width=96)
-> Index Scan using organizations_pkey on organizations (cost=0.29..3.30 rows=1 width=63)
Index Cond: (id = 1)
-> Index Scan using index_organization_isolations_on_organization_id on organization_isolations (cost=0.29..3.30 rows=1 width=33)
Index Cond: (organization_id = 1)
By Group
Current query (master)
SELECT "organizations".*
FROM "organizations"
INNER JOIN "namespaces"
ON "namespaces"."organization_id" = "organizations"."id"
INNER JOIN "routes" "route"
ON "route"."source_type" = 'Namespace'
AND "route"."source_id" = "namespaces"."id"
WHERE "route"."path" = 'twitter'
ORDER BY "organizations"."id" ASC
LIMIT 1
New query (query plan)
SELECT "organizations"."id" AS t0_r0,
"organizations"."created_at" AS t0_r1,
"organizations"."updated_at" AS t0_r2,
"organizations"."name" AS t0_r3,
"organizations"."path" AS t0_r4,
"organizations"."visibility_level" AS t0_r5,
"organizations"."state" AS t0_r6,
"organization_isolations"."id" AS t1_r0,
"organization_isolations"."organization_id" AS t1_r1,
"organization_isolations"."created_at" AS t1_r2,
"organization_isolations"."updated_at" AS t1_r3,
"organization_isolations"."isolated" AS t1_r4
FROM "organizations"
LEFT OUTER JOIN "organization_isolations"
ON "organization_isolations"."organization_id" =
"organizations"."id"
WHERE "organizations"."id" IN (SELECT "organizations"."id"
FROM "organizations"
INNER JOIN "namespaces"
ON "namespaces"."organization_id"
=
"organizations"."id"
INNER JOIN "routes" "route"
ON "route"."source_type" =
'Namespace'
AND "route"."source_id" =
"namespaces"."id"
WHERE "route"."path" = 'gitlab-org')
ORDER BY "organizations"."id" ASC
LIMIT 1
Query plan (new query):
Limit (cost=0.57..6.61 rows=1 width=96)
-> Nested Loop Left Join (cost=0.57..6.61 rows=1 width=96)
-> Index Scan using organizations_pkey on organizations (cost=0.29..3.30 rows=1 width=63)
Index Cond: (id = 1)
-> Index Scan using index_organization_isolations_on_organization_id on organization_isolations (cost=0.29..3.30 rows=1 width=33)
Index Cond: (organization_id = 1)
References
- Related issue: #582272
- Replaces: Application-Level Database Isolation (!225550 - closed)
- Builds on: Introduces the `gitlab-database-data_isolation`... (!226036 - merged)
How to set up and validate locally
We can test this by querying for an Organization in a Rails console:
# Create an additional organization
my_org = ::Organizations::Organization.find_or_create_by(path: 'my-org-isolated') { |o| o.name = 'My Isolated Organization' }
# Mark my_org as isolated
my_org.mark_as_isolated!
# Enable feature flag
Feature.enable(:data_isolation)
# Assign Current.organization
Current.organization = my_org
# The generated SQL will have an additional where clause
puts ::Organizations::Organization.where(path: 'test').to_sql
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.