Operational validation for Cloud Native Hybrid with OpenBao

Summary

Validate OpenBao functionality in Cloud Native Hybrid (CNH) environments to identify edge cases and ensure proper integration. This testing will use a 3k CNH setup with OpenBao deployed manually or via test automation.

Problem Statement

OpenBao needs validation in multi-node, production-like CNH environments to:

  • Validate deployment scenarios
  • Ensure proper functionality with external services and security configurations
  • Test operational procedures (install/upgrades, backup/restore, monitoring)

High-Level Plan

  1. Environment Setup

    • Deploy a 3k CNH environment using GET (GitLab Environment Toolkit) with an external database (AWS via GitLab Sandbox Cloud; the example 10k setup serves as a reference)
    • Increase Support node pool size and resources for OpenBao pods
    • Configure OpenBao manually or via custom automation (similar to existing database prep)
  2. Operational Scenarios

    • Install/upgrade scenarios with different GitLab versions
    • Backup/restore procedures (leveraging PG backup strategy)
    • Monitoring setup and metrics collection
    • Autoscaling behavior (vertical scaling validation)

Note: Geo deployment scenarios are covered by #583442.

Test Scenarios

Installation & Configuration

  • Install N-1 GitLab version with gitlab_version specified
  • Upgrade GitLab with OpenBao enabled
  • Custom user-provided TLS certificates

The GitLab upgrade scenario can't be tested right now because the latest GitLab version (18.6) is the only one where OpenBao can be enabled successfully. In addition, recent changes would have required a data migration, but while working towards beta we didn't invest in migrating OpenBao data and simply did a reset instead.

Operations & Monitoring

  • Custom monitoring setup picks up OpenBao metrics
  • Autoscaling behavior validation

Autoscaling doesn't apply at the moment because OpenBao doesn't support horizontal scaling yet.

Monitoring has been tested on GitLab CN (chart deployment on GKE) but couldn't be tested on GitLab CNH (using GET) because it depends on gitlab-org/cloud-native/charts/openbao#31 (closed), which will be released as part of GitLab 18.7.
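
Once monitoring can be exercised on CNH, the "picks up OpenBao metrics" check could be scripted roughly as follows. This is only a sketch: it parses Prometheus text exposition output (however it was scraped) and checks whether any metric name starts with an expected prefix. The `vault_` prefix is an assumption based on OpenBao being a Vault fork and should be confirmed against what the deployed version actually exports. The function names are hypothetical.

```python
from typing import Iterable


def metric_names(exposition_text: str) -> set[str]:
    """Return the metric names found in Prometheus text exposition output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like `name{labels} value` or `name value`.
        first_token = line.split()[0]
        names.add(first_token.split("{", 1)[0])
    return names


def has_metrics_with_prefix(exposition_text: str, prefixes: Iterable[str]) -> bool:
    """True if at least one scraped metric name starts with one of the prefixes."""
    prefixes = tuple(prefixes)
    return any(name.startswith(prefixes) for name in metric_names(exposition_text))


# Assumed prefix; verify against the real metrics output before relying on it.
EXPECTED_PREFIXES = ("vault_",)
```

A validation run would feed the text scraped from the monitoring stack into `has_metrics_with_prefix(text, EXPECTED_PREFIXES)` and fail the scenario if it returns `False`.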

DR & High Availability

  • Failover procedures
  • Recovery validation
  • Database backup/restore validation
  • OpenBao HA validation

Failover and recovery scenarios are covered by Operational validation of Geo support for OpenB... (#583442).
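
For the OpenBao HA validation item, a pass/fail rule could look like the sketch below. It assumes each pod's health payload exposes the boolean fields `initialized`, `sealed`, and `standby` (as in the Vault-compatible health API that OpenBao inherits; confirm this for the deployed version). The pod names and the `summarize_ha` helper are hypothetical, and how the payloads are collected is left out.

```python
from typing import Mapping


def summarize_ha(health_by_pod: Mapping[str, Mapping[str, bool]]) -> dict:
    """Classify pods as active, standby, or unhealthy from their health payloads."""
    active, standby, unhealthy = [], [], []
    for pod, health in health_by_pod.items():
        # Missing fields are treated conservatively (uninitialized / sealed).
        usable = health.get("initialized", False) and not health.get("sealed", True)
        if not usable:
            unhealthy.append(pod)
        elif health.get("standby", False):
            standby.append(pod)
        else:
            active.append(pod)
    return {
        "active": sorted(active),
        "standby": sorted(standby),
        "unhealthy": sorted(unhealthy),
        # Assumed rule: exactly one active node and no unhealthy pods.
        "ok": len(active) == 1 and not unhealthy,
    }
```

The "exactly one active node" rule is an assumption about what a healthy HA cluster should look like; adjust it if the validation defines HA health differently.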

Backup and restore can't be validated now because OpenBao uses a separate database that isn't managed by GitLab backup tools.
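
If the separate database is PostgreSQL, a manual dump and restore could be scripted along these lines once the scenario is picked up. This is a sketch, not the project's backup procedure: it only builds `pg_dump` and `pg_restore` command lines using standard flags, and the host, database, and user values passed in are placeholders. With a managed AWS database, provider snapshots may be the more appropriate mechanism.

```python
def pg_dump_command(host: str, dbname: str, user: str, outfile: str, port: int = 5432) -> list[str]:
    """Build a pg_dump invocation that writes a custom-format archive."""
    return [
        "pg_dump",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--dbname", dbname,
        "--format=custom",  # custom format is what pg_restore expects
        "--file", outfile,
    ]


def pg_restore_command(host: str, dbname: str, user: str, infile: str, port: int = 5432) -> list[str]:
    """Build a pg_restore invocation that replaces existing objects."""
    return [
        "pg_restore",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--dbname", dbname,
        "--clean",      # drop objects before recreating them
        "--if-exists",  # avoid errors when an object is already absent
        infile,
    ]
```

The returned lists can be passed to `subprocess.run(cmd, check=True)`, with credentials supplied through the environment (for example `PGPASSWORD`) rather than on the command line.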

Known Limitations

Exit Criteria

Limitations should be covered by admin docs as part of #573065.

Note: Geo support (multi-region deployment and failover working) is verified as part of #583442.

Edited by Fabien Catteau