[meta] GitLab Geo (Read-Only secondary servers)
# Geo Decisions

* We support only PostgreSQL
* Avatar, LFS, builds artifacts, attachments will be solved either by CephFS or any open-source S3 alternative *(this will be done after GA release)*
* We are doing a simple hack with attachments and displaying them from primary until the above is solved
* We moved to use SystemHooks for repository sync coordination *(from buffered updates notification)*
* Use SystemHooks for any missing coordination despite database replication
* What doesn't have SystemHooks should be implemented as a SystemHook if it makes sense
* Advantages: minimal code difference between CE and EE, more people are using SystemHooks than a custom mechanism
* Disadvantage: communication layer costs more *(sidekiq job on every push multiplied by the number of secondary servers)*
* We use the SafeWebhooks implementation to validate Hooks from primary
* Authentication in secondary is done by OAuth protocol, authenticating against primary server *(for web)*
* For git you can use either username and password (https://) or SSH key (ssh://)
* When logging out of a secondary you will be logged out of primary as well *(Single Sign Out)*

## Proposal

<details><summary>Complete</summary>

- [x] Geo: Cannot delete secondary node if it's the only node present (gitlab-org/gitlab-ee#374)
- [x] Geo: Improvements and fixes after QA (gitlab-org/gitlab-ee!354)
- [x] Geo: Merge requests on Secondary should not check mergeable status (gitlab-org/gitlab-ee!366)
- [x] Geo: Benchmark (#560)
- [x] Wiki page events webhook should include Wiki attributes (gitlab-org/gitlab-ce#17507)
- [x] Omnibus tries to create Postgres extension on read only DB: (gitlab-org/gitlab-ee#628) (gitlab-org/omnibus-gitlab!829)
- [x] Geo: The redirect URI included is not valid - OAuth (gitlab-org/gitlab-ee#650) (gitlab-org/gitlab-ee!444)
- [x] Omnibus: manage custom SSL certificate (gitlab-org/omnibus-gitlab#712)
- [x] Improve UI for users in a Geo node (gitlab-org/gitlab-ee#640)
- [x] Improve `gitlab:env:info` (gitlab-org/gitlab-ee!459)
- [x] Geo: Move Wiki Sync to use SystemHooks (#1482)
- [x] Geo: Documentation improvements for 8.9 (gitlab-org/gitlab-ee!431) (Can wait)
- [x] Improve required SSH Keys documentation for Geo (!431)
- [x] Fix error in admin dashboard when Geo is enabled and current node is nil (#785)
- [x] Geo: when license doesn't include Geo you can't disable it anymore (#788)
- [x] Geo: improve project view UI to guide users how to clone/push from Geo secondary node (#789)
- [x] Geo: Replicate repository creation (#1071)
- [x] Geo: more documentation improvements for 8.13 (!766)
- [x] Geo: Display Custom Avatars (user, project and group) in secondary nodes (#1128)
- [x] Geo: repository is updated but displays old cached data in Web UI (#1129)
- [x] Geo: Backfill repositories from primary node without using rsync (#1190)
- [x] Omnibus - Geo: Generate SSH keys for gitlab user (gitlab-org/omnibus-gitlab#1680)
- [x] Database Cache doesn't work as expected for Geo (gitlab-org/gitlab-ee#1217)
- [x] Geo will not let you clone from Secondary on 8.13 (gitlab-org/gitlab-ee#1243)
- [x] Geo: Improve Repository Sync (gitlab-org/gitlab-ee#1493)
- [x] Geo: Backfill stopped working after 8.15.3 (gitlab-org/gitlab-ee#1645)
- [x] Geo: Support v4 API for GitLab Geo endpoints (gitlab-org/gitlab-ee!1256)

</details>

- [x] %"10.2" **GENERAL AVAILABILITY**
    - [x] Improve GitLab Import rake task to work with Hashed Storage and Subgroups gitlab-org/gitlab-ce#36509
    - [x] Geo repository sync worker attempts to sync repos on unhealthy shards in non-backfill conditions #3690
    - [X] Make Geo::RepositorySyncWorker and Geo::FileDownloadDispatchWorker max_capacity configurable #3532
    - [x] Use HTTPS cloning for Geo #3341
    - [X] Geo: backfill and log cursor attempt to sync wikis unconditionally #3569
    - [X] Fix file descriptor leak #3664
    - [x] Geo: restarting sidekiq doesn't cause BaseSchedulerWorker leases to be returned #3568
    - [x] Fix geo route whitelisting #3274
    - [X] Secondaries forget they are #3074
    - [x] Geo queue not drained #3373
    - [x] Trimming the Geo event log #3577
    - [x] API support for retrieving Geo status #3740
    - [x] Geo secondary help users not waste time on impossible operations #2524 ~usability
    - [x] Geo secondaries do not handle upload or pages transfers when a project is renamed #3674
    - [x] Build integration test framework to spin up GitLab Geo on two nodes #3765
    - [x] Import old attachments into Uploads table gitlab-org/gitlab-ce#29240
    - [x] Improve/revise documentation for GA #3831
    - [x] Improve error recovery of failed repository/download sync #3119
    - [x] Review Security Architecture #3865
    - [x] Provide instructions for SSL with PostgreSQL #1745
    - [x] Document non-standard SSL #2857
    - [x] Doc to add secondary node to db before starting #3400
    - [x] Doc what omnibus Geo roles do #2825
    - [x] Doc: order of installation #3497
    - [x] Documentation improvements #3831
    - [X] Workhorse to support Geo over https gitlab-org/gitlab-workhorse#149
    - [X] Allow sync retry on secondaries to be disabled #3810
    - [X] Sidekiq db pool size should match thread count in Geo #3809
    - [X] FileDownloadDispatchWorker only enqueued hourly #3771
- [ ] %10.3 **PERFORMANCE AND MONITORING FOR GITLAB.COM SCALE**
    - [ ] Improve Geo Nodes admin screen #3195
    - [x] Track rate of download failures with Prometheus metrics #3244
    - [ ] Support for CI build logs and artifacts #2388 ~artifacts
    - [ ] Manual failover #1921
    - [ ] Geo: Make it easier to find out why a repository failed to clone #2968
    - [ ] postgres_fdw support for Geo secondary node gitlab-org/omnibus-gitlab#2760
    - [ ] postgres_fdw support for Geo secondary node #3382
    - [ ] Increase parallelism of repo sync for cloud migration #3147
    - [ ] GeoNodeStatus calculates numbers inefficiently (requires postgres_fdw) #3699
    - [ ] Support for container registry #2870 ~artifacts
    - [ ] Send GitLab version in status page and verify that all versions are the same #2115 ~"feature proposal"
    - [x] Detect and warn about broken replication slots on the Geo primary #3617 ~"feature proposal"
    - [ ] Remove SSH cloning support from Geo #3891
    - [ ] Notify administrators when a node fails to sync #1816 ~"feature proposal"
    - [x] Geo Monitoring #727
    - [x] Track replication status based on DR tables #2815
    - [x] Geo repository sync workers attempts to sync repos on unhealthy shards in non-backfill conditions #3690 ~regression
    - [ ] Support different object storage zone in secondary (external object storage replication)
    - [ ] Enable slow query logs on Geo secondary (Geo testbed)
    - [x] Warn when Geo replication is proceeding over HTTP, rather than HTTPS #3904
    - [ ] Document Geo HA #3646
    - [ ] Build testbed with GitLab HA enabled gitlab-com/infrastructure#3082
    - [ ] Message when pushing to Geo secondary should be more descriptive #3945 ~usability
- [ ] Backlog
    - [ ] Geo: Support clustered deployments with chained replication #3448 ~"feature proposal"
    - [ ] GitLab CI should be able to use specific Geo secondary to clone from #3294 ~"feature proposal"
    - [ ] Provide configuration to override Geo SSH sync URL #2744 ~"feature proposal"
    - [ ] Investigate frontend changes for the "Auditor" user, to reuse in Geo #1709 ~"feature proposal"
    - [ ] Add a blank state for the GitLab Geo feature in the Administration panel #1363 ~"feature proposal"
    - [ ] Better Elasticsearch support for GitLab Geo #1186 ~"feature proposal"
    - [ ] Geo: Hybrid Synchronization #623 ~"feature proposal"
    - [ ] Support for Git LFS with object storage #415 ~lfs
    - [ ] Allow Geo selective replication to include personal namespaces #3659 ~"feature proposal"
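
As a side note on the SystemHooks disadvantage listed under Geo Decisions (one sidekiq job per push, multiplied by the number of secondaries), the fan-out can be illustrated with a small toy model. This is only a sketch in Python, not GitLab code: the `Primary` class, the node names, and the project paths are all hypothetical, and a plain list stands in for the job queue.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    """Toy model (hypothetical) of a primary notifying each secondary on push."""

    secondaries: list
    queue: list = field(default_factory=list)  # stand-in for the job queue

    def push(self, project: str) -> None:
        # Every push enqueues one notification job per secondary node.
        for node in self.secondaries:
            self.queue.append((node, project))


# Assumed example values: three secondaries, two pushes.
primary = Primary(secondaries=["geo-eu", "geo-asia", "geo-us"])
for project in ["group/app", "group/docs"]:
    primary.push(project)

jobs_enqueued = len(primary.queue)  # pushes multiplied by secondaries
print(jobs_enqueued)
```

In this model the number of jobs grows linearly with both the push rate and the number of secondary nodes, which is the cost the decision list trades for a smaller CE/EE code difference.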