Geo: Add Unreplicated Data Types
Engineering Owner: @dbalexandre

### Summary

(for the Release Post; describes potential future state)

GitLab uses a number of data types other than Git data itself (e.g. Docker registries, GitLab Pages). GitLab Geo now replicates all data types. This ensures that in case of a fail-over, all data is available on a **secondary** and operations can be restored quickly. We have also updated the documentation so that it is easy to understand which data types exist and how they are replicated.

### Problem to solve

<!-- What problem are we solving for them? Tight problem description that everyone can rally around. -->

GitLab is more than Git data. As a single tool for the entire DevOps lifecycle, a GitLab instance contains many different data types, currently 19. Some of these data are large, and geo-replication helps maintain a consistent user experience. Other data types are also important in disaster recovery situations.

Currently, Geo does not replicate all of these data types from a **primary** to its **secondaries**. This means that some data won't be initially available after a fail-over in a disaster recovery situation. Customers using Geo expect the features and functionality they care about to be replicated.

### Proposal

* ~~Enumerate all current GitLab produced data types~~ [Documentation](https://docs.gitlab.com/ee/administration/geo/replication/#limitations-on-replicationverification)
* Implement Geo support for all non-replicated data types
* Ensure that new data types can be added within a single iteration

### Current GitLab produced data types

The current list can be found [here](https://docs.gitlab.com/ee/administration/geo/replication/#limitations-on-replicationverification).

### Higher intent

A major use case for Geo is Disaster Recovery, and ensuring that all *relevant* data is replicated is crucial to moving this product category forward. Our customers trust GitLab with their data, and they need to trust us even more to recover their data.
Replicating data such as GitLab Pages is crucial to moving towards a complete DR solution.

### Intended users

<!-- Who's the target user? Target user description. -->

* [Systems administrators](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sidney-systems-administrator)
* [Software developers](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sasha-software-developer)

### Further details

* https://about.gitlab.com/handbook/engineering/security/data-classification-policy.html
* https://docs.gitlab.com/ee/administration/geo/replication/#limitations-on-replicationverification
* Other product managers need to be aware of Geo's importance

### What does success look like, and how can we measure that?

There are currently **19** different data types. Success is when more than 90% of data types are replicated and verified via Geo. Success is also when new data types can be added in an understood and testable manner within one release.

Please keep in mind that some data types have significantly higher priority than others, and that we already sync almost all major data types.

#### Current success metrics:

* 9 out of 19 (~47%) data types are replicated
* 4 out of 19 (~21%) data types are automatically verified

### What is the type of buyer?

* Premium
* Ultimate

### Links / references
epic