Proposal: Replace service inventory with service catalog

Context

Right now, the first step of onboarding is opening an MR to add a service to the provisioner project, e.g.:

{
  "inventory": [
    {
      "name": "my-service",
      "project": "https://gitlab.com/my-org/my-project"
    }
  ]
}

After approval and merge, a pipeline runs to provision the service. While some optional fields are specific to Runway, the inventory is in essence a service catalog.

Proposal

Replace the Runway service inventory with the GitLab SaaS Platforms service catalog. The first step of onboarding would instead be opening an MR to add a service to the runbooks project, e.g.:

services:
  - name: my-service
    owner: my-stage-group
    runway:
      project: https://gitlab.com/my-org/my-project

After approval and merge, a cross-project pipeline would run to provision the service instead. Nothing else about Runway changes.
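To make the mapping between the two formats concrete, here is a minimal Python sketch that uses only the fields shown in the two examples above. The function names `inventory_to_catalog` and `catalog_to_inventory` and the `default_owner` parameter are hypothetical; they are not part of Runway or runbooks. The sketch assumes the cross-project pipeline only needs `name` and `project`, and it ignores the optional Runway-specific fields.

```python
def inventory_to_catalog(inventory_doc, default_owner):
    """One-off migration: turn inventory entries into service catalog entries."""
    services = []
    for item in inventory_doc["inventory"]:
        services.append({
            "name": item["name"],
            # Assumption: the inventory has no owner field, so one is supplied.
            "owner": default_owner,
            "runway": {"project": item["project"]},
        })
    return {"services": services}


def catalog_to_inventory(catalog_doc):
    """Derive the provisioner input from catalog entries that have a runway key."""
    inventory = []
    for service in catalog_doc["services"]:
        runway = service.get("runway")
        if runway is None:
            continue  # non-Runway services stay in the catalog but are not provisioned
        inventory.append({"name": service["name"], "project": runway["project"]})
    return {"inventory": inventory}
```

Under these assumptions the provisioner could keep consuming the same inventory shape it does today, generated from the catalog instead of edited by hand.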

Benefits

  • Services would automatically have service and metrics catalog entries, which are a requirement for the production readiness process
  • Services would automatically be evaluated against the service maturity model, which could be expanded with WASF
  • Services would automatically have an owning team in teams.yml, which includes context Runway can use for alerts, error budgets, cost attribution, and more
  • Services would automatically include service overview dashboards
  • Services would automatically include SLIs and SLOs
  • Services would automatically include saturation monitoring and capacity planning
  • Services would automatically include docs for on-call SREs
  • Services would automatically have just the right amount of friction, so it is not a little too easy to provision infra with a bunch of test services
  • Services are only onboarded once; we can check all these boxes and more by simply changing the single source of truth

Trade-offs

  • CI build times in runbooks are slow; this is an opportunity for us to speed up builds for everyone
  • Adding a new service in runbooks can be trial and error; this is an opportunity for us to add scaffold tooling for everyone (i.e. non-Runway services too)
  • Runway would be further coupled to runbooks, in addition to the existing coupling with the GitLab SaaS Metrics Platform (i.e. Thanos)
  • Runbooks can seem a little overwhelming to non-infra folks; this is an opportunity for us to make infra best practices clearer to stage group owners
  • Experiment and production services would exist in the same service catalog; this is an opportunity for us to formally make this distinction

Long-term

It is becoming clearer that a descriptor format living in source projects is the direction that many tools, like Backstage or its alternatives, are heading. You could even imagine a custom solution with runway.yml, Renovate bot, etc.

This proposal is a simple change that could serve as an interim step toward that direction, improving both Runway and the GitLab SaaS Platforms service catalog along the way.

Edited by Chance Feick