Docs: Reorganize "Setting up Geo" for clarity
Problem
The current structure has multiple points of confusion.
Current docs landing page: https://docs.gitlab.com/ee/administration/geo/setup/
Proposal
We need to look at the overall organization and move things around. Reorganizing will expose gaps, but at least those gaps will be explicit rather than hidden. They can be filled afterward, and it will be clear what belongs in each one.
Outline of the **current** docs:
- Setting up Geo
- Prerequisites: You have two working GitLab sites on the same version, and the primary has a Premium license.
- Using Omnibus GitLab
- Confirm the requirements for running Geo are met.
- Operating system and dependency version requirements
- Firewall rules
- Geo Tracking Database (noting it as a new service)
- Geo Log Cursor (noting it as a new service)
- Set up database replication
- Single instance database replication
- PostgreSQL replication
- Configure primary
- Configure secondary
- Initiate PG replication
- Multi-node database replication
- Migrating from repmgr to Patroni
- Migrating a single PostgreSQL node to Patroni
- Patroni support
- Configuring Patroni cluster for a Geo secondary site
- Step 1. Configure Patroni permanent replication slot on the primary site
- Step 2. Configure the internal load balancer on the primary site
- Step 3. Configure PgBouncer nodes on the secondary site
- Step 4. Configure a Standby cluster on the secondary site
- Migrating a single tracking database node to Patroni
- Configuring Patroni cluster for the tracking PostgreSQL database
- Step 1. Configure PgBouncer nodes on the secondary site
- Step 2. Configure a Patroni cluster
- Step 3. Configure the tracking database on the secondary sites
- Configure fast lookup of authorized SSH keys in the database. This step is required and needs to be done on both the primary and secondary sites.
- Configure GitLab to set the primary and secondary sites
- Configuring a new secondary site
- Step 1. Manually replicate secret GitLab values
- Step 2. Manually replicate the primary site’s SSH host keys
- Step 3. Add the secondary site
- Step 4. (Optional) Using custom certificates
- Custom or self-signed certificate for inbound connections
- Connecting to external services that use custom certificates
- Step 5. Enable Git access over HTTP/HTTPS
- Step 6. Verify proper functioning of the secondary site
- Selective synchronization
- Git operations on unreplicated repositories
- Upgrading Geo
- Optional: Configure Object storage
- Optional: Configure a secondary LDAP server for the secondary sites
- Optional: Configure Geo secondary proxying for a unified URL
- Follow the Using a Geo site guide
- Using GitLab Charts (link to Charts Geo doc for now)
- Post-installation documentation
Note that we also have a "Geo for multiple servers" doc.
**Proposed** outline:
- Setting up Geo
- Prerequisites: You have two working GitLab sites on the same version, and the primary has a Premium license.
- Using Omnibus GitLab
- Confirm the requirements for running Geo are met.
- Operating system and dependency version requirements
- Firewall rules
- Geo Tracking Database (noting it as a new service)
- Geo Log Cursor (noting it as a new service)
- **Add this:** Configure Geo on both sites. The documentation only covers symmetrical architectures at the moment, but Geo can be configured with asymmetrical sites.
- **Add this. Move some of "Single instance database replication" here:** Configure Geo 1k sites
- **Add this. Move most of "Geo for multiple servers" into this doc:** Configure Geo 2k sites
- **Add this. Copy from "Geo for multiple servers", but reference "Multi-node database replication":** Configure Geo 3k and up sites
- **Remove this link from "Setting up Geo":** Set up database replication
- Configure GitLab to set the primary and secondary sites
- Configuring a new secondary site
- Step 1. Manually replicate secret GitLab values
- Step 2. Manually replicate the primary site’s SSH host keys
- Step 3. Add the secondary site
- Step 4. (Optional) Using custom certificates
- Custom or self-signed certificate for inbound connections
- Connecting to external services that use custom certificates
- Step 5. Enable Git access over HTTP/HTTPS
- Step 6. Verify proper functioning of the secondary site
- **Moved here. It is not a hard requirement; you could choose to disable Git over SSH entirely:** Configure fast lookup of authorized SSH keys in the database. This step is required if using Git over SSH and needs to be done on both the primary and secondary sites.
- **Add "(Optional)":** Selective synchronization
- Git operations on unreplicated repositories
- **Moved here:** (Optional) Configure Object storage
- **Moved here:** (Optional) Configure a secondary LDAP server for the secondary sites
- **Moved here:** (Optional) Configure Geo secondary proxying for a unified URL
- Upgrading Geo
- Follow the Using a Geo site guide
- Using GitLab Charts (link to Charts Geo doc for now)
- Post-installation documentation
**New doc:** Configure Geo tracking database
- Configure Geo tracking database on 1k (single-server) site
- You get this for free with `geo_primary_role` and `geo_secondary_role`
- **Moved from "Geo for multiple servers":** Configure standalone Geo tracking database
- **Moved from "Set up database replication":** Migrating a single tracking database node to Patroni
- **Moved from "Set up database replication":** Configuring Patroni cluster for the tracking PostgreSQL database
- **Moved from "Set up database replication":** Step 1. Configure PgBouncer nodes on the secondary site
- **Moved from "Set up database replication":** Step 2. Configure a Patroni cluster
- **Moved from "Set up database replication":** Step 3. Configure the tracking database on the secondary sites
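The single-server item above relies on `geo_primary_role` and `geo_secondary_role`. As a minimal sketch, assuming a Linux package (Omnibus) installation configured through `/etc/gitlab/gitlab.rb`, the roles are set as shown here. This is not a complete Geo configuration, and which services each role enables by default should be confirmed against the Geo docs for the version in use.

```ruby
# Sketch only. Each site has its own /etc/gitlab/gitlab.rb; set one role per site.

# On the primary site:
roles(['geo_primary_role'])

# On the secondary site (use this line instead of the one above):
# roles(['geo_secondary_role'])
```

After editing the file, run `sudo gitlab-ctl reconfigure` on that site for the change to take effect.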
**Existing doc:** Set up database replication
- **Extract some of this into "Configure Geo 1k sites":** Single instance database replication
- PostgreSQL replication
- Configure primary
- Configure secondary
- Initiate PG replication
- **Add this. Don't use `geo_secondary_role` etc. This is a notable gap in current docs:** Standalone PostgreSQL database replication
- Multi-node database replication
- Migrating from repmgr to Patroni
- Migrating a single PostgreSQL node to Patroni
- Patroni support
- Configuring Patroni cluster for a Geo secondary site
- Step 1. Configure Patroni permanent replication slot on the primary site
- Step 2. Configure the internal load balancer on the primary site
- Step 3. Configure PgBouncer nodes on the secondary site
- Step 4. Configure a Standby cluster on the secondary site
Also see another proposed outline (we might go with this) #389514 (comment 1629664177).
Edited by Achilleas Pipinellis