
Discussion: Support Running Pipelines in Ops.GitLab.net

Overview

Runway will be used to deploy a new global service that sits in the critical path for GitLab.com. If a bad deployment introduces a bug in that service, GitLab.com could become unavailable. Runway deploy pipelines currently run on GitLab.com, which creates a circular dependency: if GitLab.com is down, we cannot deploy a fix.

We have a known way to fix this: running pipelines on https://ops.gitlab.net/. Runway needs to support running its deployment pipelines on ops.gitlab.net.

Scope

This issue addresses the Runway service deployment pipelines, specifically the pipeline in the deployment project of a Runway service, e.g. https://gitlab.com/gitlab-com/gl-infra/platform/runway/deployments/schin1-ai-assist-1579xx/-/pipelines/1281209714

Provisioner's CI pipeline issues are addressed in #255 (closed).

Approach

.com deployments will remain the default option, which the majority of Runway services should use, while ops deployment is an option for critical services (those that affect .com availability).

image

As shown above, the ops deployment flow will be:

  1. A commit lands on the .com service project.
  2. The commit is mirrored to the ops service project.
  3. The ops mirror (service project) runs a pipeline and triggers a downstream pipeline in the ops deployment project.
  4. The .com service project also runs a pipeline, but it should not trigger any downstream pipelines.
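
One way steps 3 and 4 could be expressed is a single `.gitlab-ci.yml` that is mirrored unchanged to both instances and only adds the downstream trigger job when the pipeline runs on ops. This is a minimal sketch, not Runway's actual CI template: the job names, the placeholder test command, and the downstream project path are hypothetical. It assumes the predefined `CI_SERVER_HOST` variable is used to tell the two instances apart.

```yaml
# Hypothetical .gitlab-ci.yml for a service project mirrored from GitLab.com
# to ops.gitlab.net. Illustrative only; not Runway's actual configuration.

stages:
  - test
  - deploy

# Runs on both instances (steps 3 and 4 both "run a pipeline").
unit-tests:
  stage: test
  script:
    - echo "run the service's tests here"  # placeholder command

# Only added to the pipeline on the ops mirror (step 3). On GitLab.com the
# rule does not match, so no downstream pipeline is triggered (step 4).
trigger-ops-deployment:
  stage: deploy
  rules:
    # CI_SERVER_HOST is a predefined variable: the host of the GitLab
    # instance that is running this pipeline.
    - if: '$CI_SERVER_HOST == "ops.gitlab.net" && $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
  trigger:
    # Hypothetical path to the deployment project on ops.gitlab.net.
    project: example-group/example-service-deployment
    # Make this job's status reflect the downstream pipeline's status.
    strategy: depend
```

Keeping the decision in a `rules:` clause means the same file can live in both projects, so the mirror does not need any ops-only commits.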

The ops mirror needs to exist so that we can still deploy critical services in the event that GitLab.com is unavailable.

The tradeoff of using ops deployment is reduced visibility into downstream pipelines. Developers/operators of critical services could get ops access to work around this. We could resolve this with #264 (closed), but it is not a high-priority feature.

Doc link for discussion: https://docs.google.com/document/d/1lrJ0HhpfX1niGN7pV5RWATmoPUnFyNy6M6Y3XVwtIGs

Closing summary

This is mainly a discussion issue. The main implementation work is tracked in #323 (closed).

Edited by Sylvester Chin