[meta] GitLab Geo (Read-Only secondary servers)
Geo Decisions
- We support only PostgreSQL
- Avatars, LFS objects, build artifacts, and attachments will be handled either by CephFS or by an open source S3 alternative (this will be done after the GA release)
- Until the above is solved, we use a simple hack for attachments: they are displayed from the primary
- We moved to SystemHooks for repository sync coordination (replacing buffered update notifications)
- Use SystemHooks for any coordination that database replication does not cover
- Anything that doesn't have a SystemHook should be implemented as one if it makes sense
- Advantages: minimal code difference between CE and EE, more people are using SystemHooks than custom mechanism
- Disadvantage: the communication layer costs more (one Sidekiq job on every push, multiplied by the number of secondary servers)
- We use SafeWebhooks implementation to validate Hooks from primary
- Authentication on a secondary is done with the OAuth protocol, authenticating against the primary server (for web)
- For Git you can use either a username and password (https://) or an SSH key (ssh://)
- When you log out of a secondary, you are logged out of the primary as well (Single Sign Out)
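The SystemHook decisions above can be illustrated with a minimal sketch. It is not GitLab's actual implementation: `SecondaryNode`, `receive_hook`, and `notify_secondaries` are hypothetical names, the token check is only a stand-in for whatever SafeWebhooks does, and the payload fields (`event_name`, `project_id`) are assumed to follow the usual system hook shape. It shows a secondary rejecting hooks that lack the shared token, and the primary delivering one hook per secondary, which is where the "one job per push per secondary" cost comes from.

```python
import hmac
from dataclasses import dataclass, field


@dataclass
class SecondaryNode:
    """Hypothetical stand-in for a Geo secondary; not GitLab's actual class."""
    name: str
    shared_token: str
    sync_queue: list = field(default_factory=list)

    def receive_hook(self, headers: dict, payload: dict) -> bool:
        # Reject hooks that do not carry the token shared with the primary.
        token = headers.get("X-Gitlab-Token", "")
        if not hmac.compare_digest(token.encode(), self.shared_token.encode()):
            return False
        # In this sketch only repository-changing events schedule a sync.
        if payload.get("event_name") in {"push", "project_create"}:
            self.sync_queue.append(payload["project_id"])
        return True


def notify_secondaries(secondaries, token, payload):
    """Deliver one hook per secondary and return the number of deliveries."""
    headers = {"X-Gitlab-Token": token}
    delivered = 0
    for node in secondaries:
        node.receive_hook(headers, payload)
        delivered += 1
    return delivered
```

With two secondaries, a single push results in two deliveries, so the cost of the communication layer grows linearly with the number of secondary servers.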
Proposal
Complete
- [x] Geo: Cannot delete secondary node if it's the only node present (gitlab-org/gitlab-ee#374)
- [x] Geo: Improvements and fixes after QA (gitlab-org/gitlab-ee!354)
- [x] Geo: Merge requests on Secondary should not check mergeable status (gitlab-org/gitlab-ee!366)
- [x] Geo: Benchmark (#560 (closed))
- [x] Wiki page events webhook should include Wiki attributes (gitlab-org/gitlab-ce#17507)
- [x] Omnibus tries to create Postgres extension on read only DB: (gitlab-org/gitlab-ee#628) (omnibus-gitlab!829 (merged))
- [x] Geo: The redirect URI included is not valid - OAuth (gitlab-org/gitlab-ee#650) (gitlab-org/gitlab-ee!444)
- [x] Omnibus: manage custom SSL certificate (omnibus-gitlab#712 (closed))
- [x] Improve UI for users in a Geo node (gitlab-org/gitlab-ee#640)
- [x] Improve `gitlab:env:info` (gitlab-org/gitlab-ee!459)
- [x] Geo: Move Wiki Sync to use SystemHooks (#1482 (closed))
- [x] Geo: Documentation improvements for 8.9 (gitlab-org/gitlab-ee!431) (Can wait)
- [x] Improve required SSH Keys documentation for Geo (!431 (merged))
- [x] Fix error in admin dashboard when Geo is enabled and current node is nil (#785 (closed))
- [x] Geo: when license doesn't include Geo you can't disable it anymore (#788 (closed))
- [x] Geo: improve project view UI to guide users how to clone/push from Geo secondary node (#789 (closed))
- [x] Geo: Replicate repository creation (#1071 (closed))
- [x] Geo: more documentation improvements for 8.13 (!766 (merged))
- [x] Geo: Display Custom Avatars (user, project and group) in secondary nodes (#1128 (closed))
- [x] Geo: repository is updated but displays old cached data in Web UI (#1129 (closed))
- [x] Geo: Backfill repositories from primary node without using rsync (#1190 (closed))
- [x] Omnibus - Geo: Generate SSH keys for gitlab user (omnibus-gitlab#1680 (closed))
- [x] Database Cache doesn't work as expected for Geo (gitlab-org/gitlab-ee#1217)
- [x] Geo will not let you clone from Secondary on 8.13 (gitlab-org/gitlab-ee#1243)
- [x] Geo: Improve Repository Sync (gitlab-org/gitlab-ee#1493)
- [x] Geo: Backfill stopped working after 8.15.3 (gitlab-org/gitlab-ee#1645)
- [x] Geo: Support v4 API for GitLab Geo endpoints (gitlab-org/gitlab-ee!1256)
- %10.2 GENERAL AVAILABILITY
  - Improve GitLab Import rake task to work with Hashed Storage and Subgroups gitlab-org/gitlab-ce#36509
  - Geo repository sync worker attempts to sync repos on unhealthy shards in non-backfill conditions #3690 (closed)
  - Make Geo::RepositorySyncWorker and Geo::FileDownloadDispatchWorker max_capacity configurable #3532 (closed)
  - Use HTTPS cloning for Geo #3341 (closed)
  - Geo: backfill and log cursor attempt to sync wikis unconditionally #3569 (closed)
  - Fix file descriptor leak #3664 (closed)
  - Geo: restarting sidekiq doesn't cause BaseSchedulerWorker leases to be returned #3568 (closed)
  - Fix geo route whitelisting #3274 (closed)
  - Secondaries forget they are #3074 (closed)
  - Geo queue not drained #3373 (closed)
  - Trimming the Geo event log #3577 (closed)
  - API support for retrieving Geo status #3740 (closed)
  - Geo secondary help users not waste time on impossible operations #2524 (closed) usability
  - Geo secondaries do not handle upload or pages transfers when a project is renamed #3674 (closed)
  - Build integration test framework to spin up GitLab Geo on two nodes #3765 (closed)
  - Import old attachments into Uploads table gitlab-org/gitlab-ce#29240
  - Improve/revise documentation for GA #3831 (closed)
  - Improve error recovery of failed repository/download sync #3119 (closed)
  - Review Security Architecture #3865 (closed)
  - Provide instructions for SSL with PostgreSQL #1745 (closed)
  - Document non-standard SSL #2857 (closed)
  - Doc to add secondary node to db before starting #3400 (closed)
  - Doc what omnibus Geo roles do #2825 (closed)
  - Doc: order of installation #3497 (closed)
  - Documentation improvements #3831 (closed)
  - Workhorse to support Geo over https gitlab-workhorse#149 (closed)
  - Allow sync retry on secondaries to be disabled #3810 (closed)
  - Sidekiq db pool size should match thread count in Geo #3809 (closed)
  - FileDownloadDispatchWorker only enqueued hourly #3771 (closed)
- %10.3 PERFORMANCE AND MONITORING FOR GITLAB.COM SCALE
  - Improve Geo Nodes admin screen #3195 (closed)
  - Track rate of download failures with Prometheus metrics #3244 (closed)
  - Support for CI build logs and artifacts #2388 (closed) ~artifacts
  - Manual failover #1921 (closed)
  - Geo: Make it easier to find out why a repository failed to clone #2968 (closed)
  - postgres_fdw support for Geo secondary node omnibus-gitlab#2760 (closed)
  - postgres_fdw support for Geo secondary node #3382 (closed)
  - Increase parallelism of repo sync for cloud migration #3147 (closed)
  - GeoNodeStatus calculates numbers inefficiently (requires postgres_fdw) #3699 (closed)
  - Support for container registry #2870 (closed) ~artifacts
  - Send GitLab version in status page and verify that all versions are the same #2115 (closed) ~"feature proposal"
  - Detect and warn about broken replication slots on the Geo primary #3617 (closed) ~"feature proposal"
  - Remove SSH cloning support from Geo #3891 (closed)
  - Notify administrators when a node fails to sync #1816 ~"feature proposal"
  - Geo Monitoring #727 (closed)
  - Track replication status based on DR tables #2815 (closed)
  - Geo repository sync worker attempts to sync repos on unhealthy shards in non-backfill conditions #3690 (closed) regression
  - Support different object storage zone in secondary (external object storage replication)
  - Enable slow query logs on Geo secondary (Geo testbed)
  - Warn when Geo replication is proceeding over HTTP, rather than HTTPS #3904 (closed)
  - Document Geo HA #3646 (closed)
  - Build testbed with GitLab HA enabled gitlab-com/infrastructure#3082
  - Message when pushing to Geo secondary should be more descriptive #3945 (closed) usability
- Backlog
  - Geo: Support clustered deployments with chained replication #3448 ~"feature proposal"
  - GitLab CI should be able to use specific Geo secondary to clone from #3294 ~"feature proposal"
  - Provide configuration to override Geo SSH sync URL #2744 (closed) ~"feature proposal"
  - Investigate frontend changes for the "Auditor" user, to reuse in Geo #1709 (closed) ~"feature proposal"
  - Add a blank state for the GitLab Geo feature in the Administration panel #1363 (closed) ~"feature proposal"
  - Better Elasticsearch support for GitLab Geo #1186 ~"feature proposal"
  - Geo: Hybrid Synchronization #623 ~"feature proposal"
  - Support for Git LFS with object storage #415 (closed) ~lfs
  - Allow Geo selective replication to include personal namespaces #3659 ~"feature proposal"
Edited by James Ramsay (ex-GitLab)