Org Migration: Design how to handle the shards table and its dependent tables

Background

See also https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/cells/impacted_features/git-access

The shards table holds the name of each shard (also known as a Gitaly storage).

In #498991 (closed), the table was declared gitlab_main_cell.

This shards table is referenced by 4 other tables, all of which are gitlab_main_cell:

  • group_wiki_repositories
  • pool_repositories
  • project_repositories
  • snippet_repositories

Entity relationship diagram of these tables:

erDiagram
    shards {
        int id PK
        string name "Name of Gitaly storage/shard"
    }
    
    group_wiki_repositories {
        int group_id PK,FK
        int shard_id FK
        string disk_path
    }
    
    "project_wiki_repositories delegates repository_storage to projects" {
        int id PK
        int project_id FK
    }
    
    pool_repositories {
        int id PK
        int shard_id FK
        string disk_path
        int source_project_id
    }
    
    project_repositories {
        int id PK
        int project_id FK
        int shard_id FK
        string disk_path
    }
    
    snippet_repositories {
        int snippet_id PK,FK
        int shard_id FK
        string disk_path
    }
    
    projects {
        int id PK
        string repository_storage "Name of Gitaly storage/shard"
        int pool_repository_id FK "Optional"
    }

    snippets {
        int id PK
    }

    namespaces {
        int id PK
    }
    
    group_wiki_repository_states {
        int id PK
        int group_wiki_repository_id FK
        string verification_state
    }
    
    project_states {
        int id PK
        int project_id FK
        string verification_state
    }
    
    wiki_repository_states {
        int id PK
        int project_wiki_repository_id FK
        int project_id FK
        string verification_state
    }
    
    snippet_repository_states {
        int id PK
        int snippet_repository_id FK
        string verification_state
    }
    
    shards ||--o{ group_wiki_repositories : "has many"
    shards ||--o{ pool_repositories : "has many"
    shards ||--o{ project_repositories : "has many"
    shards ||--o{ snippet_repositories : "has many"
    
    pool_repositories ||--o{ projects : "used by"
    
    group_wiki_repositories ||--o| group_wiki_repository_states : "has one"
    project_repositories ||--o| project_states : "has one"
    "project_wiki_repositories delegates repository_storage to projects" ||--o| wiki_repository_states : "has one"
    snippet_repositories ||--o| snippet_repository_states : "has one"
    
    projects ||--o| project_repositories : "has one"
    projects ||--o| "project_wiki_repositories delegates repository_storage to projects" : "has one"
    snippets ||--o| snippet_repositories : "has one"
    namespaces ||--o| group_wiki_repositories : "has one"

Problem

When the dependent tables are moved by Org Mover, how should we handle the shards table?

Each Cell will have its own unique Gitaly storages. By definition, the shards table will need to reflect these unique Gitaly storages.

After Org Mover moves a particular Gitaly repository from Cell A to Cell B, it will need to update the shard_id column of the moved row so that it references a row in Cell B's shards table rather than Cell A's.
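
As an illustration of the shard_id update described above, here is a minimal sketch in Python. It is not Org Mover's actual design, which this issue has yet to decide. It assumes that Org Mover knows which Cell B storage each Cell A storage's repositories were moved to. `RepositoryRow`, `remap_shard_ids`, `storage_moves`, and all storage names and ids are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RepositoryRow:
    """Hypothetical stand-in for a row in e.g. project_repositories."""
    id: int
    shard_id: int
    disk_path: str


def remap_shard_ids(rows, source_shards, storage_moves, dest_shards):
    """Return copies of rows whose shard_id points at Cell B's shards table.

    source_shards: {shard id on Cell A: storage name}
    storage_moves: {storage name on Cell A: storage name on Cell B}
                   (assumption: Org Mover records where each repository landed)
    dest_shards:   {storage name: shard id on Cell B}
    """
    remapped = []
    for row in rows:
        source_name = source_shards[row.shard_id]   # resolve id -> name on Cell A
        dest_name = storage_moves[source_name]      # where the repository now lives
        remapped.append(replace(row, shard_id=dest_shards[dest_name]))
    return remapped


# Illustrative data only; storage names, ids, and paths are made up.
cell_a_shards = {1: "cell-a-storage-1", 2: "cell-a-storage-2"}
cell_b_shards = {"cell-b-storage-1": 7}
moves = {
    "cell-a-storage-1": "cell-b-storage-1",
    "cell-a-storage-2": "cell-b-storage-1",
}

rows = [
    RepositoryRow(id=10, shard_id=1, disk_path="@hashed/aa/bb"),
    RepositoryRow(id=11, shard_id=2, disk_path="@hashed/cc/dd"),
]
moved_rows = remap_shard_ids(rows, cell_a_shards, moves, cell_b_shards)
```

The sketch resolves shards by name rather than by id because shard ids are local to each Cell's shards table. A lookup that raises on an unknown storage name is deliberate here: a repository row should not be written with a shard_id that has no matching row on the destination.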

Proposal

  • Investigate and decide how Org Mover will operate during replication and during cutover
  • Investigate and decide whether any application changes are needed in the Rails monolith

Decision

Acceptance Criteria

  • Decisions are documented in https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/cells/migration/
  • Org Mover MVC epic is aligned with these decisions. For example, if an implementation issue is needed, then open it.
Edited Sep 11, 2025 by Michael Kozono