[meta] GitLab Geo (Read-Only secondary servers)
# Geo Decisions
* We support only PostgreSQL
* Avatars, LFS objects, build artifacts, and attachments will be handled either by CephFS or by an open-source S3 alternative *(this will be done after the GA release)*
* Until the above is solved, we use a simple hack for attachments: they are displayed from the primary
* We moved to use SystemHooks for repository sync coordination *(from buffered updates notification)*
* Use SystemHooks for any coordination that database replication does not cover
* Anything that doesn't have a SystemHook yet should be implemented as one if it makes sense
* Advantages: minimal code difference between CE and EE, and more people use SystemHooks than would use a custom mechanism
* Disadvantage: the communication layer costs more *(one sidekiq job on every push, multiplied by the number of secondary servers)*
* We use SafeWebhooks implementation to validate Hooks from primary
* Authentication on a secondary uses the OAuth protocol, authenticating against the primary server *(for web)*
* For git you can use either username and password (https://) or an SSH key (ssh://)
* When you log out of a secondary you are logged out of the primary as well *(Single Sign Out)*
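
To make the cost and validation decisions above concrete, here is a minimal Python sketch (not the Rails implementation). It assumes each secondary shares a secret token with the primary and that the token travels in an `X-Gitlab-Token` header; how SafeWebhooks actually validates hooks is not described in this issue, so `SecondaryNode`, `fan_out_push_event`, and `hook_is_from_primary` are hypothetical names used only for illustration.

```python
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class SecondaryNode:
    # Hypothetical record; the real Geo node model is not described in this issue.
    url: str
    token: str  # assumed shared secret between the primary and this secondary


def fan_out_push_event(project_path, secondaries):
    """Build one hook delivery per secondary for a single push.

    This mirrors the stated disadvantage: deliveries per push grow
    linearly with the number of secondary servers.
    """
    payload = {"event_name": "push", "project": project_path}
    return [
        {"url": node.url, "headers": {"X-Gitlab-Token": node.token}, "payload": payload}
        for node in secondaries
    ]


def hook_is_from_primary(headers, expected_token):
    """Accept a hook only if its token matches (constant-time comparison)."""
    received = headers.get("X-Gitlab-Token", "")
    return hmac.compare_digest(received.encode(), expected_token.encode())


# Example with two made-up secondaries.
nodes = [
    SecondaryNode("https://geo-eu.example.com", "secret-eu"),
    SecondaryNode("https://geo-ap.example.com", "secret-ap"),
]
jobs = fan_out_push_event("group/project", nodes)
```

With two secondaries, a single push yields two deliveries, and a secondary would reject any hook whose token is missing or does not match its own.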
## Proposal
<details><summary>Complete</summary>
- [x] Geo: Cannot delete secondary node if it's the only node present (gitlab-org/gitlab-ee#374)
- [x] Geo: Improvements and fixes after QA (gitlab-org/gitlab-ee!354)
- [x] Geo: Merge requests on Secondary should not check mergeable status (gitlab-org/gitlab-ee!366)
- [x] Geo: Benchmark (#560)
- [x] Wiki page events webhook should include Wiki attributes (gitlab-org/gitlab-ce#17507)
- [x] Omnibus tries to create Postgres extension on read only DB: (gitlab-org/gitlab-ee#628) (gitlab-org/omnibus-gitlab!829)
- [x] Geo: The redirect URI included is not valid - OAuth (gitlab-org/gitlab-ee#650) (gitlab-org/gitlab-ee!444)
- [x] Omnibus: manage custom SSL certificate (gitlab-org/omnibus-gitlab#712)
- [x] Improve UI for users in a Geo node (gitlab-org/gitlab-ee#640)
- [x] Improve `gitlab:env:info` (gitlab-org/gitlab-ee!459)
- [x] Geo: Move Wiki Sync to use SystemHooks (#1482)
- [x] Geo: Documentation improvements for 8.9 (gitlab-org/gitlab-ee!431) (Can wait)
- [x] Improve required SSH Keys documentation for Geo (!431)
- [x] Fix error in admin dashboard when Geo is enabled and current node is nil (#785)
- [x] Geo: when license doesn't include Geo you can't disable it anymore (#788)
- [x] Geo: improve project view UI to guide users how to clone/push from Geo secondary node (#789)
- [x] Geo: Replicate repository creation (#1071)
- [x] Geo: more documentation improvements for 8.13 (!766)
- [x] Geo: Display Custom Avatars (user, project and group) in secondary nodes (#1128)
- [x] Geo: repository is updated but displays old cached data in Web UI (#1129)
- [x] Geo: Backfill repositories from primary node without using rsync (#1190)
- [x] Omnibus - Geo: Generate SSH keys for gitlab user (gitlab-org/omnibus-gitlab#1680)
- [x] Database Cache doesn't work as expected for Geo (gitlab-org/gitlab-ee#1217)
- [x] Geo will not let you clone from Secondary on 8.13 (gitlab-org/gitlab-ee#1243)
- [x] Geo: Improve Repository Sync (gitlab-org/gitlab-ee#1493)
- [x] Geo: Backfill stopped working after 8.15.3 (gitlab-org/gitlab-ee#1645)
- [x] Geo: Support v4 API for GitLab Geo endpoints (gitlab-org/gitlab-ee!1256)
</details>
- [x] %"10.2" **GENERAL AVAILABILITY**
- [x] Improve GitLab Import rake task to work with Hashed Storage and Subgroups gitlab-org/gitlab-ce#36509
- [x] Geo repository sync worker attempts to sync repos on unhealthy shards in non-backfill conditions #3690
- [X] Make Geo::RepositorySyncWorker and Geo::FileDownloadDispatchWorker max_capacity configurable #3532
- [x] Use HTTPS cloning for Geo #3341
- [X] Geo: backfill and log cursor attempt to sync wikis unconditionally #3569
- [X] Fix file descriptor leak #3664
- [x] Geo: restarting sidekiq doesn't cause BaseSchedulerWorker leases to be returned #3568
- [x] Fix geo route whitelisting #3274
- [X] Secondaries forget they are #3074
- [x] Geo queue not drained #3373
- [x] Trimming the Geo event log #3577
- [x] API support for retrieving Geo status #3740
- [x] Geo secondary help users not waste time on impossible operations #2524 ~usability
- [x] Geo secondaries do not handle upload or pages transfers when a project is renamed #3674
- [x] Build integration test framework to spin up GitLab Geo on two nodes #3765
- [x] Import old attachments into Uploads table gitlab-org/gitlab-ce#29240
- [x] Improve/revise documentation for GA #3831
- [x] Improve error recovery of failed repository/download sync #3119
- [x] Review Security Architecture #3865
- [x] Provide instructions for SSL with PostgreSQL #1745
- [x] Document non-standard SSL #2857
- [x] Doc to add secondary node to db before starting #3400
- [x] Doc what omnibus Geo roles do #2825
- [x] Doc: order of installation #3497
- [x] Documentation improvements #3831
- [X] Workhorse to support Geo over https gitlab-org/gitlab-workhorse#149
- [X] Allow sync retry on secondaries to be disabled #3810
- [X] Sidekiq db pool size should match thread count in Geo #3809
- [X] FileDownloadDispatchWorker only enqueued hourly #3771
- [ ] %10.3 **PERFORMANCE AND MONITORING FOR GITLAB.COM SCALE**
- [ ] Improve Geo Nodes admin screen #3195
- [x] Track rate of download failures with Prometheus metrics #3244
- [ ] Support for CI build logs and artifacts #2388 ~artifacts
- [ ] Manual failover #1921
- [ ] Geo: Make it easier to find out why a repository failed to clone #2968
- [ ] postgres_fdw support for Geo secondary node gitlab-org/omnibus-gitlab#2760
- [ ] postgres_fdw support for Geo secondary node #3382
- [ ] Increase parallelism of repo sync for cloud migration #3147
- [ ] GeoNodeStatus calculates numbers inefficiently (requires postgres_fdw) #3699
- [ ] Support for container registry #2870 ~artifacts
- [ ] Send GitLab version in status page and verify that all versions are the same #2115 ~"feature proposal"
- [x] Detect and warn about broken replication slots on the Geo primary #3617 ~"feature proposal"
- [ ] Remove SSH cloning support from Geo #3891
- [ ] Notify administrators when a node fails to sync #1816 ~"feature proposal"
- [x] Geo Monitoring #727
- [x] Track replication status based on DR tables #2815
- [x] Geo repository sync worker attempts to sync repos on unhealthy shards in non-backfill conditions #3690 ~regression
- [ ] Support different object storage zone in secondary (external object storage replication)
- [ ] Enable slow query logs on Geo secondary (Geo testbed)
- [x] Warn when Geo replication is proceeding over HTTP, rather than HTTPS #3904
- [ ] Document Geo HA #3646
- [ ] Build testbed with GitLab HA enabled gitlab-com/infrastructure#3082
- [ ] Message when pushing to Geo secondary should be more descriptive #3945 ~usability
- [ ] Backlog
- [ ] Geo: Support clustered deployments with chained replication #3448 ~"feature proposal"
- [ ] GitLab CI should be able to use specific Geo secondary to clone from #3294 ~"feature proposal"
- [ ] Provide configuration to override Geo SSH sync URL #2744 ~"feature proposal"
- [ ] Investigate frontend changes for the "Auditor" user, to reuse in Geo #1709 ~"feature proposal"
- [ ] Add a blank state for the GitLab Geo feature in the Administration panel #1363 ~"feature proposal"
- [ ] Better Elasticsearch support for GitLab Geo #1186 ~"feature proposal"
- [ ] Geo: Hybrid Synchronization #623 ~"feature proposal"
- [ ] Support for Git LFS with object storage #415 ~lfs
- [ ] Allow Geo selective replication to include personal namespaces #3659 ~"feature proposal"