Discussion: Runway team's insight into and permissions on service projects

Objective

Reach consensus on the expectations we (the Runway team) have of service projects, then document and communicate these expectations clearly, e.g. as part of an "engagement model".

Background

Runway currently manages three production services: AI Gateway, glgo (aka auth.gcp.gitlab.com), and PVS. The Runway team has varying levels of access to these projects:

  • AI Gateway: The Runway team has the "Maintainer" role.
  • glgo: Only the devs have the "Maintainer" role.
  • PVS: Only individuals, e.g. @cfeick, have the "Maintainer" role.

We recently merged the ci-tasks and runwayctl repositories. This was a relatively large refactoring that required an update to the service projects (#117 (closed)). For the repositories where we had merge access, we could inform our users of the upcoming change. The "glgo" repository, by contrast, required close collaboration with the devs. That worked well for this migration, but it is not a scalable strategy.

Discussion

We should establish clear expectations for our interaction with production services. Here are some proposed points to start the discussion:

  • The Runway team (synonymous with the Scalability: Practices team) needs to have the "Maintainer" role on all service projects of production services. Unless there is a more specific agreement, this role is used only for Runway-related maintenance.

  • Runway becomes a dependency of the service projects, despite not being directly in the serving path. To strike a balance between Runway's desire to deliver more features and the service owners' desire for a stable platform, we publish an SLO. For example, deployments with Runway should fail for no more than one work day per quarter (approximately 98.5% availability, given roughly 65 work days in a quarter).

    Unfortunately, it is not yet clear how we could measure such an SLO.

  • Service projects are expected to use a Runway version that is no more than six weeks old. In other words, upgrade to the newest Runway version at least twice per quarter.

  • No guarantees are given for private projects, e.g. personal test projects living outside of gitlab-com. In other words, if the Runway team cannot even inspect the repository, it is free to behave as if the repository did not exist.
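
As a rough check of the SLO figure proposed above, the following sketch converts a budget of failed work days into an availability percentage. It assumes a quarter of 13 weeks with 5 work days each, which the proposal does not state explicitly. The function names are hypothetical and not part of Runway, and the sketch does not answer the open question of how failed work days would actually be measured.

```python
# Assumption (not stated in the proposal): a quarter has 13 weeks of 5 work days.
WORK_DAYS_PER_QUARTER = 13 * 5  # 65


def availability(failed_work_days: float,
                 total_work_days: int = WORK_DAYS_PER_QUARTER) -> float:
    """Return the fraction of work days on which deployments were possible."""
    if total_work_days <= 0:
        raise ValueError("total_work_days must be positive")
    if not 0 <= failed_work_days <= total_work_days:
        raise ValueError("failed_work_days must be between 0 and total_work_days")
    return 1 - failed_work_days / total_work_days


def meets_slo(failed_work_days: float, budget_days: float = 1.0) -> bool:
    """Return True if the failed work days stay within the quarterly budget."""
    return failed_work_days <= budget_days


if __name__ == "__main__":
    # One failed work day out of 65 is about 98.46% availability.
    print(f"{availability(1):.2%}")
```

With these assumptions, one failed work day per quarter corresponds to about 98.46%, which matches the "approximately 98.5%" in the proposal.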
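
The six-week rule for Runway versions could be expressed as a simple age check, sketched below. This is only an illustration: it assumes the release date of the version in use is known, and the function and constant names are hypothetical rather than anything Runway provides.

```python
from datetime import date, timedelta

# From the proposal: service projects should use a Runway version
# that is no more than six weeks old.
MAX_VERSION_AGE = timedelta(weeks=6)


def is_version_current(released_on: date, today: date,
                       max_age: timedelta = MAX_VERSION_AGE) -> bool:
    """Return True if the version in use was released within max_age of today."""
    return today - released_on <= max_age


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(is_version_current(date(2024, 1, 1), date(2024, 2, 12)))  # exactly 42 days
    print(is_version_current(date(2024, 1, 1), date(2024, 2, 13)))  # 43 days
```

A version released exactly six weeks ago still counts as current in this sketch; whether the boundary should be inclusive is a detail the proposal leaves open.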