description: Why we use the state_machine gem backed by organizations.state and organization_details.state_metadata for the Organization lifecycle.
toc_hide: true
---
## Context
Organizations sit at the top of the resource hierarchy and own groups, projects, users, and settings. Their lifecycle needs explicit, machine-enforced control:
- An Organization must not be usable before confirmation.
- Deletion is two-tiered: reversible soft-delete for owners, irreversible hard-delete for admins.
- Every transition must be auditable (who, when, why — including the error on failures).
- Failed transitions must leave the row in a consistent, recoverable state.
The deletion workflow is tracked in [Add ability to delete an Organization](https://gitlab.com/groups/gitlab-org/-/work_items/21433).
## Decision
We manage the Organization lifecycle with the [`state_machine` gem](https://github.com/state-machines/state_machines), backed by:
- `organizations.state` (SMALLINT): the authoritative state value.
- `organization_details.state_metadata` (JSONB): the audit trail, validated against a strict JSON Schema on every save.
Low-level infrastructure (metadata writes, logging, transition-user validation) is shared with `Namespaces::Stateful` through four `Gitlab::TenantContainerLifecycle::Stateful` modules.
This ADR records the *mechanism* only. The state catalog, transitions, and conventions for adding new states live in the [Organization Lifecycle](../lifecycle.md) blueprint, which is the single source of truth.
## Consequences
- All state changes go through the state machine — direct assignment to `organizations.state` is invalid.
- `state_metadata` uses `additionalProperties: false`: any MR adding a metadata field must update `organization_detail_state_metadata.json` in the same MR, or saves will fail validation.
- Transition services must pass `transition_user:`; the machine enforces this through `ensure_transition_user`.
- The shared `TenantContainerLifecycle::Stateful` modules must stay backward-compatible with both `Organizations::Stateful` and `Namespaces::Stateful`.
- New states and transitions do not require new ADRs — they ship in the blueprint, the schema, and the state machine. A new ADR is only needed when the mechanism itself changes.
## Alternatives
### Single boolean flag (`active` / `deleted`)
Rejected: a boolean cannot represent intermediate states (confirmation, in-flight hard deletion). No audit trail, no guards.
### Separate columns per concern (`is_confirmed`, `confirmed_at`, `soft_deleted_at`, …)
Rejected: nothing enforces mutual exclusivity, so an Organization could appear simultaneously `confirmed` and mid-hard-deletion. Guards and audit become ad-hoc per-feature code. This is the approach the legacy namespace deletion used (`group_deletion_schedules`, `marked_for_deletion_at`) and that we are moving away from — see the [Group and Project Operations blueprint](../../group_and_project_operations_and_state_management/_index.md).
### Renamed intermediate state (`confirmation_in_progress` / `activation_in_progress`)
Discussed in the [intermediate-state naming thread](https://gitlab.com/gitlab-com/content-sites/handbook/-/merge_requests/19655/diffs#note_3313088904).
Rejected: the `_in_progress` convention in the namespace lifecycle names the background process performing the operation (user says "delete" → `deletion_in_progress`). Here the user is confirming the Organization's structure, not kicking off a "confirmation" process; `confirmation_in_progress` would imply the user is mid-action. `confirmed` + `active` keep the user's completed action and the system's completed activation as two distinct, durable states.
### Reuse `Namespaces::Stateful` directly
Rejected: Organizations are not namespaces — no parent, no inheritance, no archival, no transfer. Sharing the full namespace machine would mean conditional branching for org-specific behavior throughout. The current design shares only the low-level infrastructure modules.
This blueprint details requirements for Organizations to be isolated.
Read more about what an Organization is in [Organization](_index.md).
Isolation flags are orthogonal to the Organization lifecycle (`unconfirmed`, `confirmed`, `active`, etc.) described in [Organization Lifecycle](lifecycle.md), with one dependency: the first isolation step (`isolation_desired`) requires the organization to be `active`.
## What?
All Organization data and functionality in GitLab will be isolated.
description: How Organizations move from creation to soft- and hard-deletion, and how every transition is audited.
status: ongoing
creation-date: "2026-05-05"
authors: ["@rymai"]
dris: ["@rymai"]
owning-stage: "~devops::tenantscale"
participating-stages: []
toc_hide: true
---
<!-- Design Documents often contain forward-looking statements -->
<!-- vale gitlab.FutureTense = NO -->
## Summary
An Organization moves through five states: `unconfirmed` → `confirmed` → `active` → `soft_deleted` → `deletion_in_progress`. Owners can soft-delete an `active` Organization (which hides it from the UI and public API) and restore it. Only instance admins can escalate a `soft_deleted` Organization to hard deletion, which is irreversible. Every transition is audited in a JSONB column on `organization_details`.
We use the [`state_machine` gem](https://github.com/state-machines/state_machines) and share low-level infrastructure with `Namespaces::Stateful` through `Gitlab::TenantContainerLifecycle::Stateful` modules. See [ADR 009](decisions/009_state_machine.md) for the rationale.
## Goals and non-goals
Goals:
- A machine-enforced lifecycle with explicit allowed transitions.
- An immutable audit trail for every transition, stored alongside the Organization.
- Reversible soft-deletion for owners; admin-gated hard-deletion for legal/GDPR follow-through.
- Shared infrastructure with the namespace state machine to avoid duplication.
Non-goals:
- Archival (a namespace concept).
- Cross-cell transfer.
- State inheritance — Organizations are roots.
## State diagram
```mermaid
stateDiagram-v2
direction LR
unc: unconfirmed
con: confirmed
act: active
sd: soft_deleted
dip: deletion_in_progress
[*] --> unc : (organization created)
unc --> con : confirm
con --> act : activate
act --> sd : soft_delete
sd --> act : restore
sd --> dip : hard_delete
dip --> [*]
```
There is no `deleted` state — a successful hard deletion destroys the row. `unconfirmed` and `confirmed` have no path to `soft_deleted`: an Organization that has not yet completed activation cannot be deleted.
Every transition records who triggered it through `update_state_metadata`. Failures call `update_state_metadata_on_failure`, which writes `last_error` and emits a structured log without changing state.
Authorization for `soft_delete`, `restore`, and `hard_delete` is enforced at the [service layer](#service-entry-points). The state machine only checks that `transition_user` is supplied.
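
The diagram can also be read as a lookup table. The following is a minimal plain-Ruby sketch of that table, not the `state_machine` gem DSL and not the real `Organizations::Stateful` code. The `TRANSITIONS` constant and the `next_state` helper are hypothetical names introduced for illustration; only the states, events, and the `transition_user` requirement come from the text above.

```ruby
# Allowed transitions, copied from the state diagram: event => { from => to }.
TRANSITIONS = {
  confirm:     { unconfirmed: :confirmed },
  activate:    { confirmed: :active },
  soft_delete: { active: :soft_deleted },
  restore:     { soft_deleted: :active },
  hard_delete: { soft_deleted: :deletion_in_progress }
}.freeze

# Hypothetical helper. Returns the target state, or nil when `event`
# is not allowed from `current`. Mirrors the rule that the machine only
# checks that a transition user is supplied, not who that user is.
def next_state(current, event, transition_user:)
  raise ArgumentError, 'transition_user is required' if transition_user.nil?

  TRANSITIONS.fetch(event, {})[current]
end

next_state(:active, :soft_delete, transition_user: 42)      # => :soft_deleted
next_state(:unconfirmed, :soft_delete, transition_user: 42) # => nil
```

The second call returns `nil` because `unconfirmed` has no path to `soft_deleted`.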
## Data model
```sql
organizations
  state SMALLINT NOT NULL DEFAULT 0
organization_details
  soft_deleted_at TIMESTAMP WITH TIME ZONE
  state_metadata JSONB NOT NULL DEFAULT '{}'
```
`state_metadata` is validated against a strict JSON Schema (`organization_detail_state_metadata.json`, `additionalProperties: false`):
```json
{
  "last_updated_at": "<datetime>",
  "last_changed_by_user_id": <integer | null>,
  "last_error": "<string | null>",
  "correlation_id": "<string | null>",
  "soft_deleted_by_user_id": <integer | null>,
  "restored_at": "<datetime | null>",
  "restored_by_user_id": <integer | null>,
  "confirmed_at": "<datetime | null>",
  "confirmed_by_user_id": <integer>
}
```
Fields are exposed as typed accessors on `OrganizationDetail` through `jsonb_accessor`.
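
To illustrate what `additionalProperties: false` means in practice, here is a minimal plain-Ruby sketch that rejects any key not listed in the schema above. It is not the real JSON Schema validator; `ALLOWED_KEYS` and `unknown_metadata_keys` are hypothetical names, and the sketch checks key names only, not value types.

```ruby
require 'set'

# Keys copied from the schema above. Anything else is rejected.
ALLOWED_KEYS = %w[
  last_updated_at last_changed_by_user_id last_error correlation_id
  soft_deleted_by_user_id restored_at restored_by_user_id
  confirmed_at confirmed_by_user_id
].to_set.freeze

# Hypothetical helper. Returns the keys a strict schema would reject.
def unknown_metadata_keys(metadata)
  metadata.keys.map(&:to_s).reject { |key| ALLOWED_KEYS.include?(key) }
end

unknown_metadata_keys('last_error' => nil, 'archived_at' => '2026-05-05')
# => ["archived_at"]
```

A non-empty result corresponds to a failed save, which is why a new metadata field and its schema entry must ship in the same MR.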
## Adding a new state or transition
A state-machine change spans two repositories:
1. In `gitlab-org/gitlab`, in a single MR: `Organizations::Stateful` (state enum, `state_machine` block, guards, callbacks) **and** `organization_detail_state_metadata.json` if the new state adds metadata fields. The schema and the code must land together; otherwise `additionalProperties: false` causes saves to fail in production.
2. In `gitlab-com/content-sites/handbook` (this repository): this blueprint — states table, transitions table, future-work table.
Cross-link the two MRs and merge them together.
Integer values are append-only — assign the next free integer, regardless of lifecycle position.
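
The append-only rule can be sketched as follows. Every integer in this example is an assumption: the source only states that the column defaults to `0`. The `STATES` constant, the `append_state` helper, and the `suspended` state are hypothetical and exist only for illustration.

```ruby
# Hypothetical integer values, assumed for this sketch only.
STATES = {
  unconfirmed: 0,
  confirmed: 1,
  active: 2,
  soft_deleted: 3,
  deletion_in_progress: 4
}.freeze

# Returns a new mapping with `name` assigned the next free integer.
# Existing values are never renumbered, whatever the new state's
# position in the lifecycle.
def append_state(states, name)
  raise ArgumentError, "#{name} already exists" if states.key?(name)

  states.merge(name => states.values.max + 1).freeze
end

append_state(STATES, :suspended)[:suspended] # => 5
```

Renumbering instead of appending would silently change the meaning of rows already stored in `organizations.state`.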
## Service entry points
Every user-driven transition has a dedicated service that wraps the state-machine event with authorization, idempotency, and audit logging. Each one follows the same shape:
1. Check authorization through `OrganizationPolicy`.
2. Verify the current state is a valid source for the event.
3. Invoke the event with `transition_user: current_user`.
4. Surface state-machine errors as the service response if the transition did not happen.
5. Emit an audit-log event and return a successful `ServiceResponse`.
- `SoftDeleteService` requires the Organization to be empty (no groups or projects). Soft deletion only hides the Organization and is reversible.
- `HardDeleteService` enqueues the background hard-deletion worker on success; the worker performs the row destruction. Hard deletion is for legal/GDPR follow-through and is not exposed in the standard UI.
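
The five-step shape can be sketched in plain Ruby as below. This is an illustration under stated assumptions, not the real service: `ServiceResponse` here is a local `Struct` rather than GitLab's class, `OrgStub` stands in for the model, and the `can_delete` flag replaces the `OrganizationPolicy` check. The emptiness check required by `SoftDeleteService` is omitted for brevity.

```ruby
# Local stand-in for a service response; not GitLab's ServiceResponse class.
ServiceResponse = Struct.new(:status, :message, keyword_init: true) do
  def success?
    status == :success
  end
end

# Hypothetical model stub. The real event comes from the state machine.
OrgStub = Struct.new(:state, :errors, keyword_init: true) do
  def soft_delete(transition_user:)
    if transition_user.nil?
      errors << 'transition_user is required'
      return false
    end
    self.state = :soft_deleted
    true
  end
end

class SoftDeleteSketch
  def initialize(organization, current_user:, can_delete:, audit_log:)
    @organization = organization
    @current_user = current_user
    @can_delete = can_delete # assumption: result of the policy check
    @audit_log = audit_log
  end

  def execute
    return failure('not authorized') unless @can_delete                                # step 1
    return failure('organization is not active') unless @organization.state == :active # step 2
    unless @organization.soft_delete(transition_user: @current_user)                   # step 3
      return failure(@organization.errors.join(', '))                                  # step 4
    end

    @audit_log << { event: :soft_delete, user_id: @current_user }                      # step 5
    ServiceResponse.new(status: :success)
  end

  private

  def failure(message)
    ServiceResponse.new(status: :error, message: message)
  end
end
```

Step 2 is what makes a repeated call safe: running the sketch a second time on an already soft-deleted Organization returns an error response without writing a second audit entry.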
## Error handling
When a transition fails (a guard returns `false`):
- `update_state_metadata_on_failure` writes the error to `state_metadata['last_error']` and saves the detail record.
- `log_transition_failure` emits a structured error log.
- `organizations.state` is **never** modified on failure.
If a hard-deletion worker fails partway, the Organization stays in `deletion_in_progress` with `last_error` populated. Recovery is by re-running an idempotent worker, not a state-machine backward transition. A dedicated recovery transition can be added later if we need it.
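
The failure path can be sketched as a single function that records the error and logs it without touching the state. This is a plain-Ruby illustration, not the real `update_state_metadata_on_failure` or `log_transition_failure` implementations. The `record_transition_failure` name, the hash-based record, and the log fields are assumptions.

```ruby
require 'json'
require 'time'

# Hypothetical failure handler. Writes last_error and emits one JSON log
# line. It reads record[:state] for the log but never assigns it.
def record_transition_failure(record, event:, error:, log: $stderr)
  record[:state_metadata]['last_error'] = error
  record[:state_metadata]['last_updated_at'] = Time.now.utc.iso8601
  log.puts(JSON.generate({ event: event.to_s, state: record[:state].to_s, error: error }))
  record
end
```

Because the state stays where it was, a later retry sees the same source state and can fire the same event again.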
## Future work
The state machine is in place; the service and API surface still need work:
"Rename pending" rows are issues originally framed around `schedule_deletion` / `cancel_deletion` / `start_deletion` that need re-scoping to the soft-delete / restore / hard-delete naming. Finder changes to hide `soft_deleted` Organizations from non-owners are tracked in [#594312](https://gitlab.com/gitlab-org/gitlab/-/work_items/594312).
## Relationship with Organization Isolation
Lifecycle and [Isolation](isolation.md) are orthogonal. Lifecycle answers *"Is this Organization operational?"*; isolation answers *"How strictly are its data boundaries enforced?"*. They do not share a state machine, and isolation flags can be set independently of soft-deletion.
One dependency: the first isolation step (`isolation_desired`) requires the Organization to be `active`. Triggering isolation in `unconfirmed` or `confirmed` would be premature.
## Open Questions
### Concurrency and locking
Two actors could try to transition the same Organization at once — for example, an owner restores while an admin hard-deletes. Current lean: optimistic locking on `lock_version` is enough. All transitions are human-driven, so contention should be rare. If real-world conflict rates are higher than expected, we can either add a custom pessimistic-lock helper or migrate to [AASM](https://github.com/aasm/aasm#pessimistic-locking), which supports pessimistic locking natively. Decide before the first user-facing surface ships.
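
As a rough illustration of the optimistic-locking option, the sketch below models a compare-and-swap on `lock_version` with an in-memory store. In Rails, Active Record provides this behavior for models with a `lock_version` column and raises `ActiveRecord::StaleObjectError` on conflict. The `RowStore` class and `StaleRecordError` here are hypothetical stand-ins so that the example runs without Rails.

```ruby
class StaleRecordError < StandardError; end

# Hypothetical in-memory stand-in for one row of the organizations table.
class RowStore
  def initialize(row)
    @row = row.dup
    @mutex = Mutex.new
  end

  def read
    @mutex.synchronize { @row.dup }
  end

  # Compare-and-swap: the write succeeds only if lock_version is
  # unchanged since the caller read the row.
  def update(expected_version, new_state)
    @mutex.synchronize do
      unless @row[:lock_version] == expected_version
        raise StaleRecordError, 'row was changed by another actor'
      end

      @row = { state: new_state, lock_version: expected_version + 1 }
    end
  end
end

store = RowStore.new({ state: :soft_deleted, lock_version: 0 })
owner_view = store.read
admin_view = store.read

store.update(owner_view[:lock_version], :active) # owner restores first
begin
  store.update(admin_view[:lock_version], :deletion_in_progress)
rescue StaleRecordError
  # The admin's hard delete is rejected and must re-read the row.
end
```

After the rescue, the admin's retry would re-read an `active` row, from which `hard_delete` is not a legal event.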
### Recovery from `confirmed`-state failures
If background provisioning fails after `confirm`, the Organization stays in `confirmed` indefinitely — there is no path back to `unconfirmed` or forward to a `failed` state. Are we relying on idempotent retries, or do we need a recovery transition? To be decided.
### Initial state for user-created Organizations
`unconfirmed` fits the case where GitLab provisions an Organization for a customer. Once end users create Organizations themselves (post-GA), there is no provisioning step to confirm. Two options:
- Run `confirm` + `activate` synchronously inside the creation service, so `ConfirmationService` side effects still execute.
- Allow `unconfirmed → active` directly (or default user-created rows to `active`) when no side effects are needed.
The choice depends on what side effects, if any, are bound to confirmation by the time self-service ships. See [MR thread](https://gitlab.com/gitlab-com/content-sites/handbook/-/merge_requests/19693#note_3328588386).
### Retention window for `soft_deleted`
Should `restore` be available indefinitely, or expire after a retention window (after which only `hard_delete` is legal)? Indefinite is simplest; a fixed window (for example, 30 days) would match the prior delayed-deletion behavior and GDPR expectations. Decide before `restore` ships behind a UI.
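
For comparison, the two options differ by a single check. The sketch below is hypothetical and does not reflect a decision: `restorable?` is an invented helper, the 30-day default is the example figure from the paragraph above, and passing `window_days: nil` models the indefinite option.

```ruby
SECONDS_PER_DAY = 86_400

# Hypothetical check. `soft_deleted_at` corresponds to the column on
# organization_details. A nil window means restore never expires.
def restorable?(soft_deleted_at, now: Time.now, window_days: 30)
  return true if window_days.nil?

  now - soft_deleted_at <= window_days * SECONDS_PER_DAY
end

deleted_at = Time.utc(2026, 1, 1)
restorable?(deleted_at, now: deleted_at + 29 * SECONDS_PER_DAY)                    # => true
restorable?(deleted_at, now: deleted_at + 31 * SECONDS_PER_DAY)                    # => false
restorable?(deleted_at, now: deleted_at + 31 * SECONDS_PER_DAY, window_days: nil)  # => true
```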
## Alternative Solutions
See [ADR 009](decisions/009_state_machine.md) for the rationale for using a state machine over simpler data models.