POC of Geo Protocell Mode

Problem/Proposal

We had a couple of ideas for using Geo, mostly as-is, to migrate data to a Protocell. (See Meeting notes below.)

In this issue, we should do a POC of the Protocells mode idea, timeboxed to 2 days.

Meeting notes

From Notes - Org Data Migration - Sync/Office Hours:

Sep 18, 2025 | [REC] Org Data Migration Sync

  1. Michael Kozono: Geo will probably need some modifications to configure a Protocell as a secondary site while the Protocell is a live, writable site at the same time. I haven’t looked specifically at what modifications will be needed or how much work they will be. I think I need to do a POC locally.
    1. Michael Kozono: There was the idea to run Rails processes on the side with a different configuration (configured as a Geo secondary site).
    2. Douglas Alexandre: What about a Protocells mode, distinct from primary and secondary sites, that runs all Geo jobs?
      1. Michael Kozono: What about just normal jobs + Geo secondary site jobs?
      2. Douglas Alexandre: I think it’s fine to run the primary checksum jobs.
    3. Douglas Alexandre: We need to modify CronManager.
    4. Douglas Alexandre: Need to deploy the tracking DB.
    5. Douglas Alexandre: Geo health status. Not streaming replication, logical replication.
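
To make the job-selection discussion above concrete, here is a minimal Ruby sketch of how a Protocells mode might pick cron jobs: normal jobs plus both primary and secondary Geo jobs. This is a toy model, not GitLab's CronManager. The job names, the `enabled_jobs` helper, and the `:protocell` mode symbol are all hypothetical, and the primary/secondary rows reflect an assumption about current behavior that the POC would need to verify.

```ruby
# Toy model only: job names and the :protocell mode are hypothetical,
# not GitLab's actual CronManager constants.
NORMAL_JOBS        = %w[normal_cron_job].freeze
PRIMARY_GEO_JOBS   = %w[primary_checksum_job].freeze
SECONDARY_GEO_JOBS = %w[secondary_sync_job secondary_verification_job].freeze

# Returns the list of cron jobs that would be enabled for a given site mode.
def enabled_jobs(mode)
  case mode
  when :primary
    # Assumption: a primary runs normal jobs plus primary-only Geo jobs.
    NORMAL_JOBS + PRIMARY_GEO_JOBS
  when :secondary
    # Assumption: a secondary runs only its Geo jobs.
    SECONDARY_GEO_JOBS
  when :protocell
    # Idea from the notes: normal jobs + secondary jobs, and primary
    # checksum jobs are fine to run as well.
    NORMAL_JOBS + PRIMARY_GEO_JOBS + SECONDARY_GEO_JOBS
  else
    raise ArgumentError, "unknown mode: #{mode.inspect}"
  end
end
```

Keeping the mode-to-jobs mapping in one place like this would make it easy to see what a third site type changes, but where that mapping actually lives in CronManager is something the POC has to find out.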

Plan

Partially configure Geo so that the Protocell acts as a secondary Geo site of the Legacy Cell, without any PG replication and without breaking anything. Fix or band-aid things along the way, potentially introducing a third type of Geo site: Protocells mode. Stop at the 2-day timebox.

  • Legacy Cell: Set geo_node_name
  • Legacy Cell: rake geo:set_primary_node
  • Legacy Cell: In UI, add a secondary site with the Protocell attributes
  • Legacy Cell: Dump geo_nodes table
  • Protocell: Set geo_node_name
  • Protocell: Insert geo_nodes rows

Replicate org 1. Band-aid things along the way. Stop at the 2-day timebox.

  • Legacy Cell: Block org 1 users
  • Legacy Cell: Set selective sync by org 1 (starts checksumming non-PG data)
  • Legacy Cell: Dump PG data for org 1
  • Protocell: Insert PG data for org 1
  • Protocell: Set selective sync by org 1
  • Wait for replication of non-PG data

Ideas

Protocells mode would be relevant wherever Gitlab::Geo.primary? or Gitlab::Geo.secondary? is called. We could add Gitlab::Geo.protocell?.
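
As a rough illustration of the predicate idea, the sketch below uses a stand-in module, `GeoModeSketch`, rather than the real `Gitlab::Geo`. Everything in it is hypothetical: the `mode` setting, the `protocell?` predicate, and the two example call sites. It only shows the shape of the change, where a call site that branches on `secondary?` would need to decide how a Protocell should behave.

```ruby
# Hypothetical stand-in for Gitlab::Geo; not the real module.
module GeoModeSketch
  MODES = %i[primary secondary protocell].freeze

  class << self
    attr_reader :mode

    # Sets the site mode, rejecting anything outside the three known modes.
    def mode=(value)
      raise ArgumentError, "unknown mode: #{value.inspect}" unless MODES.include?(value)

      @mode = value
    end

    def primary?
      mode == :primary
    end

    def secondary?
      mode == :secondary
    end

    # Proposed third predicate.
    def protocell?
      mode == :protocell
    end
  end
end

# Example call site: a Protocell pulls replicated data like a secondary does.
def sync_from_source?
  GeoModeSketch.secondary? || GeoModeSketch.protocell?
end

# Example call site: unlike a secondary, a Protocell stays live and writable.
def read_only?
  GeoModeSketch.secondary?
end
```

With `GeoModeSketch.mode = :protocell`, `sync_from_source?` is true while `read_only?` is false, which matches the goal of a site that receives replicated data yet remains writable. Each real call site would need the same kind of decision made explicitly.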
