Runner Fleet Reference Architectures

Description

We publish reference architectures for self-managed GitLab installations, based on prototypical activity for a given number of users. The testing our quality team performs against these reference architectures helps assure prospects and users that they are planning for a "right-sized" starting point.

Proposal

We should consider publishing validated reference architectures for medium to large-sized runner fleets. This will help us share objective information about the resource demands of the runner-manager in common contexts, and aid our users in planning for an appropriately sized cluster, VM, or machine to manage their runner fleet.

Because organizations have limited "GitLab-native" visibility into job wait times and overall "CI quality of experience", it's important that they appropriately size and scale their runner infrastructure at the outset, so that developer QoE remains high.
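As a rough illustration of the "job wait time" signal mentioned above, here is a minimal sketch that summarizes wait times from exported job records. The record shape (`queued_at` and `started_at` as ISO 8601 strings) and the function names are hypothetical, chosen only for this example; they are not a GitLab API contract.

```python
import math
from datetime import datetime
from statistics import median


def wait_seconds(queued_at: str, started_at: str) -> float:
    """Seconds a job spent waiting between being queued and being started."""
    queued = datetime.fromisoformat(queued_at)
    started = datetime.fromisoformat(started_at)
    return (started - queued).total_seconds()


def summarize_waits(jobs):
    """Return median, nearest-rank p95, and max wait for a list of job records.

    Each record is assumed (hypothetically) to be a dict with
    'queued_at' and 'started_at' ISO 8601 timestamps.
    """
    waits = sorted(wait_seconds(j["queued_at"], j["started_at"]) for j in jobs)
    if not waits:
        raise ValueError("no job records supplied")
    # Nearest-rank percentile: the smallest value with at least 95% of samples at or below it.
    rank = max(1, math.ceil(0.95 * len(waits)))
    return {"median": median(waits), "p95": waits[rank - 1], "max": waits[-1]}


if __name__ == "__main__":
    sample = [
        {"queued_at": "2021-12-21T10:00:00", "started_at": "2021-12-21T10:00:10"},
        {"queued_at": "2021-12-21T10:01:00", "started_at": "2021-12-21T10:01:20"},
        {"queued_at": "2021-12-21T10:02:00", "started_at": "2021-12-21T10:02:30"},
        {"queued_at": "2021-12-21T10:03:00", "started_at": "2021-12-21T10:03:40"},
    ]
    print(summarize_waits(sample))
```

Tracking a tail percentile alongside the median is a deliberate choice here: an undersized fleet tends to show up first as long waits for a minority of jobs, which a median alone can hide.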

MVC Ideas

We have dashboards/reporting/visibility into the resource demands of our shared runner managers for GitLab.com. This infrastructure hosts a very heterogeneous collection of workloads from many different organizations and users. It's reasonable to posit that these resource demands, in aggregate, represent a reasonable facsimile of a larger enterprise's

Links to related issues and merge requests / references

[Potentially] Blocked by gitlab#16319 (closed) per @tpazitny

Concerns

It can be difficult to generalize the types of workloads that an organization will run on their CI infra.

Edited Dec 21, 2021 by Jamie Reid