Introduces the gitlab-database-data_isolation gem

What does this MR do and why?

Introduces the gitlab-database-data_isolation gem — a standalone, configurable query transformer that adds transparent row-level data isolation for the organization model.

The gem provides:

  • Gitlab::Database::DataIsolation — configuration entry point (strategy, sharding_key_map, current_sharding_key_value, on_stats, on_error).
  • Context — thread-local flag to enable/disable isolation per request or test, with without_data_isolation { } escape hatch.
  • Arel strategy — patches ActiveRecord::Relation#arel globally; injects WHERE <table>.<column> = <value> nodes for every table listed in sharding_key_map. Works without any per-model change. Overhead: ~5 µs above the AR baseline.
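To make the injection rule concrete, here is a minimal plain-Ruby sketch of how the extra conditions could be derived from sharding_key_map and current_sharding_key_value. It is not the gem's implementation: the real strategy builds Arel nodes rather than strings, and isolation_conditions, SKETCH_MAP, assigned and unassigned are hypothetical names used only for illustration. The map shape (table => { column => sharding key type }) follows the example script in this description.

```ruby
# Hypothetical helper, not part of the gem: returns the equality conditions
# that would be added for the tables referenced by a query.
def isolation_conditions(tables, sharding_key_map, current_value)
  tables.flat_map do |table|
    # Tables missing from the map get no conditions at all.
    sharding_key_map.fetch(table, {}).filter_map do |column, type|
      value = current_value.call(type)
      # A nil value means no sharding key is assigned, so no filter is added.
      format('"%s"."%s" = %d', table, column, value) unless value.nil?
    end
  end
end

# Assumed map shape: table => { column => sharding key type }.
SKETCH_MAP = { "namespaces" => { "organization_id" => :organizations } }.freeze

assigned   = ->(type) { type == :organizations ? 1000 : nil }
unassigned = ->(_type) { nil }

puts isolation_conditions(%w[namespaces users], SKETCH_MAP, assigned)
puts isolation_conditions(%w[namespaces], SKETCH_MAP, unassigned).inspect
```

In this sketch a table that is not listed in the map, or a sharding key that resolves to nil, contributes no condition, which matches the unfiltered first query in the example script.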

The strategy is selected via config.strategy; this MR ships only :arel.

A shared sql_spec.rb is included to document expected SQL output across common query patterns (SELECT, JOIN, subquery, CTE). Later MRs extend it for additional strategies.

References

How to set up and validate locally

Create a test.rb file in the GitLab root directory using the example script below, then run gdk rails runner test.rb.

The first call to Namespace.where(path: 'root').first&.path returns the path of the matching namespace. The second call returns nil, because Current.organization is now assigned and a filter on that organization's organization_id is applied automatically.

Example script:
require_relative 'gems/gitlab-database-data_isolation/lib/gitlab-database-data_isolation'

# Setup / configure the gem
sharding_key_map = Gitlab::Database::Dictionary.entries.each_with_object({}) do |entry, map|
  sharding_key = entry.sharding_key
  next unless sharding_key.is_a?(Hash) && sharding_key.any?

  map[entry.key_name] = sharding_key.transform_values(&:to_sym)
end

Gitlab::Database::DataIsolation.configure do |config|
  config.strategy = :arel
  config.sharding_key_map = sharding_key_map
  config.current_sharding_key_value = ->(type) {
    case type
    when :organizations
      Current.organization_assigned ? Current.organization&.id : nil
    end
  }
end


Gitlab::Database::DataIsolation.install!

# Query: SELECT "namespaces"."id", <snip> FROM "namespaces" WHERE "namespaces"."path" = 'root' ORDER BY "namespaces"."id" ASC LIMIT 1
puts "This will have results:"
pp Namespace.where(path: 'root').first&.path

# Assign a Current.organization
Current.organization = Organizations::Organization.find_or_create_by!(name: 'my org', path: 'my-org') unless Current.organization_assigned

# Query: SELECT "namespaces"."id", <snip> FROM "namespaces" WHERE "namespaces"."path" = 'root' AND "namespaces"."organization_id" = 1000 ORDER BY "namespaces"."id" ASC LIMIT 1
puts "This will not have results because the isolation is applied"
pp Namespace.where(path: 'root').first&.path

# Query: SELECT "namespaces"."id", <snip> FROM "namespaces" WHERE "namespaces"."path" = 'root' ORDER BY "namespaces"."id" ASC LIMIT 1
Gitlab::Database::DataIsolation::ScopeHelper.without_data_isolation do
  puts "This will have results because we explicitly disable the isolation"
  pp Namespace.where(path: 'root').first&.path
end
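The escape hatch used at the end of the script can be pictured as a per-thread flag that is flipped for the duration of a block. The following is a rough sketch under that assumption only; IsolationContextSketch and its storage key are hypothetical and do not reflect the gem's actual Context or ScopeHelper code.

```ruby
# Hypothetical stand-in for the gem's context handling; names are illustrative.
module IsolationContextSketch
  KEY = :isolation_context_sketch_disabled

  # Isolation counts as enabled unless the flag has been set.
  def self.enabled?
    !Thread.current[KEY]
  end

  # Disables isolation for the duration of the block and restores the
  # previous state afterwards, even if the block raises.
  # Note: Thread.current[] storage is scoped to the current fiber in Ruby.
  def self.without_data_isolation
    previous = Thread.current[KEY]
    Thread.current[KEY] = true
    yield
  ensure
    Thread.current[KEY] = previous
  end
end

puts IsolationContextSketch.enabled?
IsolationContextSketch.without_data_isolation do
  puts IsolationContextSketch.enabled?
end
puts IsolationContextSketch.enabled?
```

Restoring the previous value in ensure, rather than resetting to a fixed default, keeps nested without_data_isolation blocks well behaved in this sketch.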

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Rutger Wessels
