Wires `gitlab-database-data_isolation` into the Rails monolith.

What does this MR do and why?

Wires gitlab-database-data_isolation into the Rails monolith.

  • config/initializers/gitlab_database_data_isolation.rb — builds sharding_key_map from Gitlab::Database::Dictionary entries for the organizations key, sets strategy: :arel, and resolves current_sharding_key_value from Current.organization&.id. Calls DataIsolation.install! to activate.
  • config/initializers/0_load_gitlab_database.rb — explicitly requires lib/gitlab/database.rb before Bundler loads the gem. The gem opens the Gitlab::Database namespace during Bundler.require; without this, Zeitwerk sees a pre-defined constant and raises a conflict.

The logic is enabled only for the organizations table. The change is also behind a feature flag, and it is only active for Organizations that are marked as isolated; none of these exist yet.

Why these spec_helper changes are needed

The data_isolation feature flag (introduced in milestone 19.0) is evaluated on every query against the organizations table. This happens because the data isolation AR extension intercepts arel calls on that table and invokes Gitlab::Organizations::Isolation.enabled?, which in turn calls Feature.enabled?(:data_isolation, ...).

Why logging triggers

Feature::Definition#for_upcoming_milestone? returns true when a flag's milestone is ahead of the running GitLab version. Because the flag's milestone is 19.0 and the current version is 18.11, this method returns true for data_isolation. As a result, every call to Feature.enabled?(:data_isolation) is recorded in RequestStore[:feature_flag_events] via Feature#log_feature_flag_state.
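
The milestone check above boils down to a version comparison. A minimal sketch, assuming plain major.minor strings; `upcoming_milestone?` is a hypothetical helper for illustration, not the actual Feature::Definition implementation:

```ruby
require 'rubygems' # Gem::Version ships with standard Ruby

# Hypothetical stand-in for Feature::Definition#for_upcoming_milestone?.
# Returns true when the flag's milestone is ahead of the running version.
def upcoming_milestone?(flag_milestone, current_version)
  Gem::Version.new(flag_milestone) > Gem::Version.new(current_version)
end

puts upcoming_milestone?('19.0', '18.11')  # true: the data_isolation case
puts upcoming_milestone?('18.10', '18.11') # false: milestone already reached
```

Comparing as versions rather than strings matters here: a lexical comparison would order '18.9' after '18.11'.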

How this breaks tests

ExceptionLogFormatter (and ContextPayloadGenerator for Sentry) reads Feature.logged_states_for_log when formatting an exception. If feature_flag_events is non-empty, it appends exception.feature_flag_states: ["data_isolation:0"] to the payload.
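
To illustrate why an exact-match assertion breaks, here is a small sketch of a formatter that adds the extra key only when states were logged. `format_exception` and the two base keys are illustrative assumptions, not the real ExceptionLogFormatter code:

```ruby
# Illustrative stand-in for the formatting step; not GitLab's implementation.
def format_exception(exception, logged_states)
  payload = {
    'exception.class' => exception.class.name,
    'exception.message' => exception.message
  }
  # The extra key appears only when at least one flag state was logged.
  payload['exception.feature_flag_states'] = logged_states unless logged_states.empty?
  payload
end

error = RuntimeError.new('boom')
expected = { 'exception.class' => 'RuntimeError', 'exception.message' => 'boom' }

puts format_exception(error, []) == expected                   # true
puts format_exception(error, ['data_isolation:0']) == expected # false
```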

Because RequestStore uses a plain thread-local (Thread.current[:request_store]), it persists across the entire test run unless explicitly cleared. Factory setup code triggers organization queries in both let_it_be blocks (which run in before(:all)) and let! (which runs as a per-example before hook, after config.before). These queries silently populate feature_flag_events before the test body executes. Any spec that then asserts on an exact exception log payload fails because of the unexpected exception.feature_flag_states key.
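
The leak can be reproduced in isolation. The sketch below uses `TinyStore` and `log_flag_state` as hypothetical stand-ins for RequestStore and Feature#log_feature_flag_state, assuming only that the store is a hash held in a thread-local:

```ruby
# Minimal stand-in for RequestStore: a hash kept in a thread-local.
module TinyStore
  def self.store
    Thread.current[:tiny_store] ||= {}
  end

  def self.delete(key)
    store.delete(key)
  end
end

# Stand-in for the logging step: records "<flag>:<0|1>" in the store.
def log_flag_state(name, enabled)
  (TinyStore.store[:feature_flag_events] ||= []) << "#{name}:#{enabled ? 1 : 0}"
end

# "Setup" phase (for example a let_it_be block) queries organizations.
log_flag_state(:data_isolation, false)

# "Example" phase on the same thread: the setup event is still present.
leaked = TinyStore.store[:feature_flag_events].dup

# Only an explicit delete resets the state.
TinyStore.delete(:feature_flag_events)
after_clear = TinyStore.store[:feature_flag_events]

puts leaked.inspect      # ["data_isolation:0"]
puts after_clear.inspect # nil
```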

The fix

Two things are done in config.before, before each example:

  1. RequestStore.delete(:feature_flag_events) — clears state accumulated by let_it_be (or any other before(:all) setup) that ran before this hook.

  2. Stub Feature.log_feature_flag_states?(:data_isolation) to return false — prevents the flag from being recorded during the example itself, including inside let! hooks that run after config.before. The preceding and_call_original default ensures all other feature flags continue to log normally.

Together these ensure that data_isolation never appears in exception payloads during specs, without affecting any other feature flag logging behaviour.
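
The two steps can be sketched as the following hook. This is an approximation of the spec_helper change rather than a copy of the diff; the `defined?` guard exists only so the snippet loads outside the GitLab test suite, and matching on the `:data_isolation` symbol is an assumption about how the method is called:

```ruby
# Sketch of the per-example hook. In spec_helper itself RSpec, Feature and
# RequestStore are always defined, so the guard would not be needed there.
if defined?(RSpec) && defined?(Feature) && defined?(RequestStore)
  RSpec.configure do |config|
    config.before do
      # 1. Drop events accumulated by let_it_be or other before(:all) setup.
      RequestStore.delete(:feature_flag_events)

      # 2. Keep normal logging for every other flag...
      allow(Feature).to receive(:log_feature_flag_states?).and_call_original
      # ...but never record data_isolation, including inside let! hooks.
      allow(Feature).to receive(:log_feature_flag_states?)
        .with(:data_isolation).and_return(false)
    end
  end
end
```

The order of the two `allow` calls matters: the general `and_call_original` stub is declared first so that the more specific `with(:data_isolation)` stub takes precedence for that one flag.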

Query plans

This MR changes how we retrieve the current organization: the lookup now also loads the organization's isolation state.

Production has few organizations (128) and no organization_isolations records.

I created some test organizations using these queries on Postgres.ai:

EXEC INSERT INTO organizations (id, created_at, updated_at, name, path, visibility_level)
  SELECT
    nextval('organizations_id_seq'),
    NOW(),
    NOW(),
    'Seed Organization ' || i,
    'seed-org-' || i,
    0
  FROM generate_series(1, 10000) AS s(i);

EXEC INSERT INTO organization_isolations (id, organization_id, created_at, updated_at, isolated)
  SELECT
    nextval('organization_isolations_id_seq'),
    o.id,
    NOW(),
    NOW(),
    random() < 0.2
  FROM (
    SELECT id FROM organizations
    WHERE name LIKE 'Seed Organization %'
    ORDER BY id DESC
    LIMIT 10000
  ) o;

EXEC ANALYZE organizations, organization_isolations;

By Organization

::Organizations::Organization.find_by_path_with_isolation('default')

Current query (master)

SELECT "organizations".* FROM "organizations" WHERE "organizations"."path" = 'default' LIMIT 1

New query

SELECT "organizations"."id"                        AS t0_r0,
       "organizations"."created_at"                AS t0_r1,
       "organizations"."updated_at"                AS t0_r2,
       "organizations"."name"                      AS t0_r3,
       "organizations"."path"                      AS t0_r4,
       "organizations"."visibility_level"          AS t0_r5,
       "organizations"."state"                     AS t0_r6,
       "organization_isolations"."id"              AS t1_r0,
       "organization_isolations"."organization_id" AS t1_r1,
       "organization_isolations"."created_at"      AS t1_r2,
       "organization_isolations"."updated_at"      AS t1_r3,
       "organization_isolations"."isolated"        AS t1_r4
FROM   "organizations"
       LEFT OUTER JOIN "organization_isolations"
                    ON "organization_isolations"."organization_id" =
                       "organizations"."id"
WHERE  "organizations"."path" = 'default'
LIMIT  1 

By User

Current query (master)

SELECT "organizations".* FROM "organizations" WHERE "organizations"."id" = 1 LIMIT 1

New query

SELECT "organizations"."id"                        AS t0_r0,
       "organizations"."created_at"                AS t0_r1,
       "organizations"."updated_at"                AS t0_r2,
       "organizations"."name"                      AS t0_r3,
       "organizations"."path"                      AS t0_r4,
       "organizations"."visibility_level"          AS t0_r5,
       "organizations"."state"                     AS t0_r6,
       "organization_isolations"."id"              AS t1_r0,
       "organization_isolations"."organization_id" AS t1_r1,
       "organization_isolations"."created_at"      AS t1_r2,
       "organization_isolations"."updated_at"      AS t1_r3,
       "organization_isolations"."isolated"        AS t1_r4
FROM   "organizations"
       LEFT OUTER JOIN "organization_isolations"
                    ON "organization_isolations"."organization_id" =
                       "organizations"."id"
WHERE  "organizations"."id" = 1
LIMIT  1 

Query plan:

 Limit  (cost=0.57..6.61 rows=1 width=96)
   ->  Nested Loop Left Join  (cost=0.57..6.61 rows=1 width=96)
         ->  Index Scan using organizations_pkey on organizations  (cost=0.29..3.30 rows=1 width=63)
               Index Cond: (id = 1)
         ->  Index Scan using index_organization_isolations_on_organization_id on organization_isolations  (cost=0.29..3.30 rows=1 width=33)
               Index Cond: (organization_id = 1)

By Group

Current query (master)

SELECT "organizations".*
FROM   "organizations"
       INNER JOIN "namespaces"
               ON "namespaces"."organization_id" = "organizations"."id"
       INNER JOIN "routes" "route"
               ON "route"."source_type" = 'Namespace'
                  AND "route"."source_id" = "namespaces"."id"
WHERE  "route"."path" = 'twitter'
ORDER  BY "organizations"."id" ASC
LIMIT  1 

New query

SELECT "organizations"."id"                        AS t0_r0,
       "organizations"."created_at"                AS t0_r1,
       "organizations"."updated_at"                AS t0_r2,
       "organizations"."name"                      AS t0_r3,
       "organizations"."path"                      AS t0_r4,
       "organizations"."visibility_level"          AS t0_r5,
       "organizations"."state"                     AS t0_r6,
       "organization_isolations"."id"              AS t1_r0,
       "organization_isolations"."organization_id" AS t1_r1,
       "organization_isolations"."created_at"      AS t1_r2,
       "organization_isolations"."updated_at"      AS t1_r3,
       "organization_isolations"."isolated"        AS t1_r4
FROM   "organizations"
       LEFT OUTER JOIN "organization_isolations"
                    ON "organization_isolations"."organization_id" =
                       "organizations"."id"
WHERE  "organizations"."id" IN (SELECT "organizations"."id"
                                FROM   "organizations"
                                       INNER JOIN "namespaces"
                                               ON "namespaces"."organization_id"
                                                  =
                                                  "organizations"."id"
                                       INNER JOIN "routes" "route"
                                               ON "route"."source_type" =
                                                  'Namespace'
                                                  AND "route"."source_id" =
                                                      "namespaces"."id"
                                WHERE  "route"."path" = 'gitlab-org')
ORDER  BY "organizations"."id" ASC
LIMIT  1 

Query plan:

 Limit  (cost=0.57..6.61 rows=1 width=96)
   ->  Nested Loop Left Join  (cost=0.57..6.61 rows=1 width=96)
         ->  Index Scan using organizations_pkey on organizations  (cost=0.29..3.30 rows=1 width=63)
               Index Cond: (id = 1)
         ->  Index Scan using index_organization_isolations_on_organization_id on organization_isolations  (cost=0.29..3.30 rows=1 width=33)
               Index Cond: (organization_id = 1)

References

How to set up and validate locally

We can test this by querying for an Organization in a Rails console:

# Additional organizations
my_org = ::Organizations::Organization.find_or_create_by(path: 'my-org-isolated') { |o| o.name = 'My Isolated Organization' }

# Mark my_org as isolated
my_org.mark_as_isolated!

# Enable feature flag
Feature.enable(:data_isolation)

# Assign Current.organization
Current.organization = my_org

# This will have an additional where clause
puts ::Organizations::Organization.where(path: 'test').to_sql

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Rutger Wessels
