Omnibus Adjacent Kubernetes (OAK) - Operate Implementation: Next Steps
## Summary

As part of the Segmentation proposal, an implementation is needed to service Early Self-Managed Advanced. "[Omnibus-Adjacent Kubernetes](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/selfmanaged_segmentation/#omnibus-adjacent-kubernetes-oak)" is meant to service that need, as a transition point between strict Omnibus and Cloud Native GitLab.

Now that the [proof of concept](https://gitlab.com/groups/gitlab-org/distribution/-/epics/126) for OAK has been completed, the next step is to investigate how to implement this through the product. This epic outlines the body of work required for members of the Operate group in the Delivery stage to more fully implement OAK.

## Planning Next Steps

The work of this Epic should result in the production of two key items:

1. Omnibus is automated to support OAK.
2. An epic is populated with a concrete plan for the eventual GA of OAK, following the Beta implementation.

The work will be delivered in 4 phases, represented as sub-epics. Phases 3 and 4 are happening in parallel:

- **Phase 1 - [The Discovery epic](https://gitlab.com/groups/gitlab-com/gl-infra/software-delivery/-/work_items/30):** ✅ Complete
- **Phase 2 - [The Beta Design Doc epic](https://gitlab.com/groups/gitlab-com/gl-infra/software-delivery/-/work_items/33):** ✅ Complete
- **Phase 3 - [The Beta Implementation epic](https://gitlab.com/groups/gitlab-com/gl-infra/software-delivery/-/work_items/36):** 🔄 In Progress
  - The output of this phase is a consumable version of Omnibus with automation that facilitates the OAK deployment environment for users to experiment with. Beta is a meaningful, standalone deliverable, covering the most common deployment pattern: Single Node deployments.
- **Phase 4 - [GA Design Doc epic](https://gitlab.com/groups/gitlab-com/gl-infra/software-delivery/-/work_items/85):** 🔄 In Progress
  - This will outline a concrete plan for the eventual GA of OAK, following the Beta implementation.
### Delivery Target - `2026-04-30`

The target date for conclusion of the 4 phases is represented by GitLab's Due date feature: `2026-04-30`.

![Screenshot_2026-03-10_at_10.31.17](/uploads/6df2c7f6bc21a949a7115279cc255edb/Screenshot_2026-03-10_at_10.31.17.png){width=900 height=326}

## Outcomes

By the end of this epic, we should provide the following outcomes:

- [ ] Documented Discovery work
  - [x] List of vetted Kubernetes distributions, and any recommendations against.
    - We vetted k3s, k0s, and microk8s. All are valuable solutions.
  - [x] Define integration patterns.
    - [x] Service Discovery & Interconnection
    - [x] Omnibus automation recommendation
  - [ ] Impact on AirGapped solutions for GA
  - [ ] Impact on Zero Downtime for GA
  - [ ] AppSec review for GA
  - [ ] K8s on Multi-node evaluation for GA
  - [ ] Omnibus on Multi-node VMs evaluation for GA
  - [x] Configuration recipes to add to Omnibus
- [ ] Feature deliverables
  - [ ] Omnibus package with the Beta version of OAK support.
    - Should be consumable by our users interested in installing advanced components.
    - Should provide automation value to facilitate their integration of these components.
    - Projects should include test automation.
  - [x] Ensure that teams have a reasonable way to develop advanced components for OAK
    - [x] Validated and documented integration with Caproni.
- [ ] Documentation
  - [x] Architectural Design Document describing the Beta version.
  - [ ] Runbook on how to deploy the whole environment.
  - [ ] Troubleshooting section.
- [ ] Planning
  - [ ] Creation and population of an Epic with a concrete plan for eventual GA of OAK as a subsequent effort

### Investigation Items

This section guides engineers working on the project through the technical aspects to consider when working through the epics. The details will provide valuable insights for technical discussions and product decisions. It is collapsed due to its size, and for better general visibility of the epic.

<details>
<summary>Expand Investigation Items</summary>

This section is to guide engineers on how to achieve the outcomes above.

### :one: Stream 1 - Discovery of criteria ([EPIC](https://gitlab.com/groups/gitlab-com/gl-infra/software-delivery/-/work_items/30))

Define the Kubernetes options we will vet, drawn from the common k8s options available for providing OAK. This is not a list of "supported" or "blessed" options. Our determination here will inform the documentation of requirements.

- Small:
  - k3s: https://k3s.io/
  - k0s: https://k0sproject.io/
  - microk8s: https://microk8s.io/
  - minikube: Not intended for this case.
- Containerized: (Perhaps not a great idea?)
  - k3d: https://k3d.io/stable/
  - kind: https://kind.sigs.k8s.io/
- External platforms: (largely already addressed)
  - Self-hosted: (non-comprehensive list of providers on-premise, largely similar to PaaS)
    - Rancher RKE: https://docs.rke2.io/
    - OpenShift (ocp): https://www.openshift.com/products/openshift-platform
  - Cloud provider (PaaS):
    - EKS: https://aws.amazon.com/eks/
    - GKE: https://cloud.google.com/kubernetes-engine/
    - AKS: https://azure.microsoft.com/en-us/services/kubernetes-service/

### :two: Stream 2 - Familiar Components

We will need to determine implementation requirements for tools which we already know will be particularly useful and which we are already familiar with, but do not yet provide through existing tooling. We will need to list these tools, and work through various key tasks as part of integrating them into the tooling we provide.

For each, we will need to determine the following:

- The methods for inclusion. Will we build it ourselves, or ingest external resources with verifiable provenance?
- The requirements for configuration, and how much of this should be automated entirely via Omnibus.
- How to deliver these artifacts to the consumer.
  - Will these be included directly?
  - Will they be available via OCI bundles?


A non-definitive list of tools to start with: Helm, kubectl, skopeo, cosign

### :three: Stream 3 - Delivering Cloud Native Artifacts

We will need to determine a viable path for delivering Cloud Native GitLab components to Omnibus GitLab instances. Those instances may _or may not_ be air-gapped, but some customers would also prefer to have all materials readily available before deploying new versions.

Solving for this specifically: We will need to determine methods of delivering Helm charts, and possibly containers, to these instances, as well as the methods of consuming the artifacts we provide.

An option for this would be to deliver OCI bundles via a supplemental package. That package could easily contain Helm charts as OCI artifacts, and potentially containers as well. Those artifacts could then be populated into a local OCI registry or, in the case of Helm, consumed directly.

### :four: Stream 4 - Discovery for Needed Components

A key component to enabling and securing the application as a whole is the [interconnection of mixed environments](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/selfmanaged_segmentation/#interconnection-of-mixed-environments).

We will need to work cross-functionally through ~"department::infrastructure platforms" and various groups in ~"Department::Development" to enable appropriate additions to the application architecture and facilitate the implementation of necessary components. We will need to be aware of [Key Abstractions](https://handbook.gitlab.com/handbook/engineering/architecture/abstractions/) and the architecture board as we work to incorporate these into the [architecture](https://docs.gitlab.com/development/architecture/), per our processes to [add components](https://docs.gitlab.com/development/adding_service_component/) to GitLab.

Service discovery and inter-connection:

- What to use for each need (or for both?).
- How to include (origin, provenance, ...) into all platforms.
- How to automate through Omnibus GitLab _and_ Cloud Native.

### :five: Stream 5 - Runbook Creation

Here, we create a set of documented runbooks for the K8s platform options we investigated within [Stream 1](https://gitlab.com/groups/gitlab-com/gl-infra/software-delivery/-/work_items/22#one-stream-1---discovery-of-criteria).

Each of these runbooks will describe how to obtain, install, and maintain an option we have vetted. Further, they will describe how to "attach" the Omnibus GitLab to these "OAK" resources, so that they can be configured as one complete instance. These runbooks can, and should, refer to upstream documentation wherever possible.

As a reminder, these runbooks are not intended to be a comprehensive and complete list of "supported" or "blessed" options. They will support engineers performing development and testing, and give those outside our immediate teams the ability to test and provide feedback during early stages.

</details>

## DRI

- `@Alexand`

## Participants

* `@Alexand`
* `@grigoristh`
* `@niskhakova`
* `@sbordei`
* `@WarheadsSE`

## Exit Criteria

~"section::gitlab delivery" is able to provide a replicable demo and significant documentation of what Segmentation will look like with [Omnibus-Adjacent Kubernetes](https://handbook.gitlab.com/handbook/engineering/architecture/design-documents/selfmanaged_segmentation/#omnibus-adjacent-kubernetes-oak) as a means of implementation for Self-Managed customers.

- We can describe the criteria for the Kubernetes platform(s) a customer may choose, with an acknowledgement of any tradeoffs.
- We can describe where and how tooling will provide for a transitional experience, on the path to pure Cloud Native.
- We provide runbooks for options we have vetted, demonstrating the functionality via a first iteration implementation.
- GitLab members outside of ~"group::operate" have a script and runbook they can follow to demonstrate how OAK works.