POC: Organization-scoped read-only mode (controller-layer enforcement)

Context

This POC replaces the namespace-scoped middleware approach explored in #590009 (closed). Based on review feedback on !226983 (closed), the team agreed on two key changes:

  1. Scope: Read-only mode should be scoped to Organizations, not individual TLGs. By the time read-only mode is needed (isolated Organizations / severance / Cell transfers), every TLG will already belong to an Organization.
  2. Architecture: Enforcement should happen at the controller layer (Rails before_action / Grape helpers), not in Rack middleware. This aligns with how Organization resolution already works (CurrentOrganization concern, Gitlab::Current::Organization) and avoids brittle path-prefix parsing.

Why the pivot?

  • @abdwdd flagged that the fixed PATH_PREFIXES route set is fragile and should follow the Gitlab::Current::Organization pattern
  • @rutgerwessels confirmed that Organization resolution uses the controller layer, not middleware
  • @mandrewsgl clarified that initial TLG-to-Org transfers are atomic and don't need read-only mode
  • @alexpooley confirmed read-only is needed for isolation enablement and Cell-to-Cell Organization transfers
  • @dblessing noted only single-transaction scenarios (isolated org transfers) require read-only

Use cases

  1. Isolation enablement (severance): The Organization enters read-only mode while its data is being modified
  2. Cell-to-Cell Organization transfer: The transactional transfer requires read-only mode to prevent data inconsistency

Relationship to previous work


Existing Infrastructure

Organization resolution pattern (the model to follow)

  • CurrentOrganization concern (app/controllers/concerns/current_organization.rb): included in BaseActionController, calls set_current_organization as a before_action after Rails routing resolves params
  • Gitlab::Current::Organization (lib/gitlab/current/organization.rb): resolves Organization from params[:namespace_id], params[:group_id], headers, user, or fallback
  • API::Helpers#set_current_organization: Grape equivalent

Enforcement pattern (the pattern to follow)

  • EnforcesStepUpAuthenticationForNamespace (app/controllers/concerns/enforces_step_up_authentication_for_namespace.rb): already included in both Groups::ApplicationController and Projects::ApplicationController as a before_action after group/project is loaded. This is the exact injection pattern to follow.

Grape API helpers

  • API::Helpers#find_group! and API::Helpers#find_project! (lib/api/helpers.rb): central lookup methods every Grape endpoint uses. Natural injection points for read-only checks.

Proposed Implementation Plan

Scope note: This is a POC. Maintenance mode is toggled via the Rails console; the goal is to demonstrate feasibility.

Dependency Graph

Step 1: Organization state machine
Step 2: Rails controller enforcement (before_action concern)  ┐
Step 3: Grape API enforcement (find_group!/find_project!)      ├── depend on Step 1; parallel
Step 4: Web UI error handling                                  │
Step 5: GraphQL mutation enforcement                           ┘

Step 1: Organization State Machine

Add a state column to the organizations table and include a Stateful-like concern on Organizations::Organization with a :maintenance state and transitions (start_maintenance!, complete_maintenance!).
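A minimal plain-Ruby sketch of the intended transitions follows. It is not the real model: OrganizationSketch stands in for Organizations::Organization, the :active state name and the InvalidTransition error are assumptions (only :maintenance is named in this plan), and a real implementation would persist the state in the new column through the Stateful-like concern.

```ruby
# Plain-Ruby stand-in for Organizations::Organization; not the real model.
class OrganizationSketch
  # Hypothetical error raised on an illegal transition.
  InvalidTransition = Class.new(StandardError)

  # :active is an assumed name for the default state; this plan only
  # names :maintenance.
  STATES = %i[active maintenance].freeze

  attr_reader :state

  def initialize(state: :active)
    raise ArgumentError, "unknown state: #{state}" unless STATES.include?(state)

    @state = state
  end

  def maintenance?
    state == :maintenance
  end

  # active -> maintenance
  def start_maintenance!
    transition!(from: :active, to: :maintenance)
  end

  # maintenance -> active
  def complete_maintenance!
    transition!(from: :maintenance, to: :active)
  end

  private

  def transition!(from:, to:)
    raise InvalidTransition, "cannot go from #{state} to #{to}" unless state == from

    @state = to
  end
end

org = OrganizationSketch.new
org.start_maintenance!    # Organization is now read-only
org.complete_maintenance! # writes are allowed again
```

Rejecting a repeated start_maintenance! is a design choice of this sketch; an idempotent transition may be preferable if several processes can request maintenance.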

Alternative: Derive maintenance state from the root namespaces' effective_state. This is simpler, but it couples Organization read-only mode to namespace state.

Effort: Small (1-2 days)

Step 2: Rails Controller Enforcement

Create EnforcesReadOnlyOrganization concern following the EnforcesStepUpAuthenticationForNamespace pattern:

  • Include in Groups::ApplicationController with before_action after @group is loaded
  • Include in Projects::ApplicationController with before_action after @project is loaded
  • Resolve Organization from loaded group/project via namespace.organization
  • Check if Organization is in :maintenance state
  • Write requests: return 503 (JSON) or redirect with flash (HTML)
  • Read requests: allow through
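The decision logic in the list above can be sketched in framework-free Ruby. Everything here is illustrative: the module and stub names, the read_only_subject hook, and the assumption that every method other than GET, HEAD, and OPTIONS counts as a write are not taken from the codebase. In the real concern the method would be registered as a before_action and would render or redirect instead of returning a symbol.

```ruby
# Illustrative stand-ins for the loaded records; not GitLab's real classes.
OrganizationStub = Struct.new(:state) do
  def maintenance?
    state == :maintenance
  end
end
NamespaceStub = Struct.new(:organization)

module EnforcesReadOnlyOrganizationSketch
  # Assumption: any method outside this list is treated as a write.
  READ_METHODS = %w[GET HEAD OPTIONS].freeze

  # Returns :allowed or :blocked so the sketch runs without Rails.
  def enforce_read_only_organization(http_method)
    return :allowed if READ_METHODS.include?(http_method.to_s.upcase)

    # Resolve the Organization from the already-loaded group/project.
    organization = read_only_subject&.organization
    return :allowed unless organization&.maintenance?

    :blocked
  end
end

class GroupsControllerSketch
  include EnforcesReadOnlyOrganizationSketch

  def initialize(group)
    @group = group
  end

  private

  # Hypothetical hook: the record whose Organization is checked.
  def read_only_subject
    @group
  end
end

org = OrganizationStub.new(:maintenance)
controller = GroupsControllerSketch.new(NamespaceStub.new(org))
controller.enforce_read_only_organization("POST") # write: blocked
controller.enforce_read_only_organization("GET")  # read: allowed
```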

Key files:

  • app/controllers/concerns/enforces_read_only_organization.rb (new)
  • app/controllers/groups/application_controller.rb
  • app/controllers/projects/application_controller.rb

Effort: Medium (2-3 days) | Depends on: Step 1, completed in !228743 (closed)

Step 3: Grape API Enforcement

Add maintenance checks in API::Helpers at central lookup methods:

  • In find_group!: after check_group_access, check organization maintenance state for write requests
  • In find_project!: same pattern
  • Return 503 Service Unavailable with Retry-After header
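As a rough sketch of the check that find_group! and find_project! could call, the snippet below runs without Grape. The helper name check_organization_maintenance!, the error class, the write-method list, and the 300-second Retry-After value are all assumptions for illustration; the real helper would use whatever error-rendering mechanism API::Helpers already provides.

```ruby
# Stand-in error carrying a 503 status; Grape is not loaded in this sketch.
class ServiceUnavailableSketch < StandardError
  attr_reader :status, :headers

  def initialize(retry_after:)
    super("Organization is in maintenance mode")
    @status = 503
    @headers = { "Retry-After" => retry_after.to_s }
  end
end

# Assumed value; this plan does not specify a retry interval.
RETRY_AFTER_SECONDS = 300
# Assumed set of HTTP methods treated as writes.
WRITE_METHODS = %w[POST PUT PATCH DELETE].freeze

# Illustrative stand-in for an Organization record.
ApiOrganizationStub = Struct.new(:state) do
  def maintenance?
    state == :maintenance
  end
end

# Hypothetical helper, intended to run after the existing access check.
def check_organization_maintenance!(organization, http_method)
  return unless WRITE_METHODS.include?(http_method.to_s.upcase)
  return unless organization&.maintenance?

  raise ServiceUnavailableSketch.new(retry_after: RETRY_AFTER_SECONDS)
end

begin
  check_organization_maintenance!(ApiOrganizationStub.new(:maintenance), "POST")
rescue ServiceUnavailableSketch => e
  puts "#{e.status} Retry-After=#{e.headers['Retry-After']}"
end
```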

Effort: Small (1-2 days) | Depends on: Step 1 | Parallel with: Step 2

Step 4: Web UI Error Handling

  • HTML requests: redirect back with flash alert
  • JSON/XHR requests: structured 503 JSON with Retry-After header
  • Consider dedicated maintenance error page
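The two response shapes can be sketched as plain data, again without Rails. The message text, the 300-second default, the ResponseSketch struct, and the "/" fallback location are assumptions; in a controller the HTML branch would more likely go through redirect_back(fallback_location: ...) with a flash alert.

```ruby
require "json"

# Assumed wording and retry interval; neither is specified in this plan.
MAINTENANCE_MESSAGE = "This Organization is in maintenance mode. Please retry later."
DEFAULT_RETRY_AFTER = 300

# Illustrative container for what the controller would render.
ResponseSketch = Struct.new(:status, :headers, :body, :flash, keyword_init: true)

def maintenance_response(format:, referer: nil, retry_after: DEFAULT_RETRY_AFTER)
  if format == :html
    # HTML: send the user back where they came from with a flash alert.
    ResponseSketch.new(
      status: 302,
      headers: { "Location" => referer || "/" },
      body: nil,
      flash: { alert: MAINTENANCE_MESSAGE }
    )
  else
    # JSON/XHR: structured 503 with a Retry-After header.
    ResponseSketch.new(
      status: 503,
      headers: { "Retry-After" => retry_after.to_s, "Content-Type" => "application/json" },
      body: JSON.generate({ message: MAINTENANCE_MESSAGE, retry_after: retry_after }),
      flash: nil
    )
  end
end

html = maintenance_response(format: :html, referer: "/groups/example")
json = maintenance_response(format: :json)
puts html.status, json.status
```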

Effort: Small (1 day) | Depends on: Step 2

Step 5: GraphQL Mutation Enforcement

All GraphQL requests are POSTed to /api/graphql, so HTTP method filtering won't work. Enforce at the mutation resolver level instead:

  • Shared concern for mutations that resolves target Organization and checks maintenance state
  • Return GraphQL::ExecutionError for blocked mutations
  • Queries (reads) remain unaffected
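A shared mutation concern along these lines might look like the sketch below. It runs without the graphql gem, so ExecutionErrorSketch stands in for GraphQL::ExecutionError; the module, the method name ensure_organization_writable!, and the example mutation are hypothetical.

```ruby
# Stand-in for GraphQL::ExecutionError so the sketch runs without the gem.
ExecutionErrorSketch = Class.new(StandardError)

# Illustrative stand-in for an Organization record.
GraphqlOrganizationStub = Struct.new(:state) do
  def maintenance?
    state == :maintenance
  end
end

# Hypothetical shared concern for mutations.
module ChecksOrganizationMaintenanceSketch
  def ensure_organization_writable!(organization)
    return unless organization&.maintenance?

    raise ExecutionErrorSketch, "Organization is in maintenance mode; mutations are blocked"
  end
end

# Hypothetical mutation; a real one would resolve the Organization
# from its target group or project.
class CreateIssueMutationSketch
  include ChecksOrganizationMaintenanceSketch

  def resolve(organization:, title:)
    ensure_organization_writable!(organization)

    { title: title, errors: [] }
  end
end

mutation = CreateIssueMutationSketch.new
mutation.resolve(organization: GraphqlOrganizationStub.new(:active), title: "Example")
```

Because the check lives in the mutation rather than in the HTTP layer, query resolvers never call it and reads stay unaffected.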

Effort: Medium (2-3 days) | Depends on: Step 1, completed in !228743 (closed) | Parallel with: Steps 2-4

Deferred to Follow-Up

  • Admin UI/API toggle for maintenance mode
  • Background write path audit (Sidekiq workers, cron jobs)
  • Write-on-GET guards (per @abdwdd's feedback)
  • Git operation enforcement (git-receive-pack blocking)

Open Questions

  1. Organization state model: Own state column on organizations, or derive from root namespaces?
  2. TLG-level granularity: Is Organization-level sufficient, or do we need per-TLG granularity within an Organization? (@alexpooley, @dblessing)
  3. Multi-TLG Organizations: If only one TLG needs maintenance, Organization-level blocks all TLGs. Acceptable?
  4. Cell transfer scope: Is the entire Organization locked during Cell transfer?
  5. Current.organization: Can we leverage it directly in the enforcement concern?

How to Validate Locally

  1. Apply state machine changes (Step 1)
  2. Rails console: org = Organizations::Organization.find(id); org.start_maintenance!
  3. Write request to a group in that org (e.g., create issue) -> blocked with 503
  4. Read request (e.g., view group page) -> works normally
  5. org.complete_maintenance! -> writes work again
Edited by Chen Zhang