Discovery: Load balancer as single entry point
The primary proposal in https://docs.gitlab.com/ee/architecture/blueprints/cloud_connector/ revolves around standing up a dedicated Cloud Connector service that fronts all feature-specific backends.
We have since discovered use cases, such as !131577 (merged), where this could lead to scalability issues and where the risk of downtime might outweigh the benefits a dedicated service would otherwise give us.
An alternative is to route traffic through a load balancer that can also make routing decisions. This comes with the following trade-offs:
- Compromise on our ability to run CC-specific logic, such as enforcing rate limits, in a central location.
- Improve our ability to handle unusually large amounts of traffic or data volumes.
- Maintain our ability to provide a central entry point for GitLab instances.
Outcome
We identified Cloudflare and Google External App LBs as two promising approaches.
We created https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/24711 to implement one of these approaches.
High-level requirements
We're looking for a solution that meets the following requirements:
Must haves:
- listens at cloud.gitlab.com
- can handle a large amount of traffic
- can make simple routing decisions based on URL path (/ai/*, /observe/*, etc.)
- must support backends deployed through various platforms, at minimum:
  - GKE
  - Cloud Run (Runway)
- be available/deployed in January 2024
Should haves:
- ability to operate in various geographic regions
- ability to make feature backends private, that is, not exposed to the public internet
- should provide a migration/evolution path to support more complex behavior, either as part of the LB itself (think Envoy filters/extensions) or by plugging something "smarter" behind it
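
To make the "simple routing decisions based on URL path" requirement concrete, here is a minimal sketch of longest-prefix path routing in Python. It is not how Cloudflare or a Google load balancer is configured. It only illustrates the decision the LB would need to make. The `ROUTES` table, the `pick_backend` helper, and the backend labels are hypothetical names chosen for illustration.

```python
from typing import Optional

# Hypothetical routing table: URL path prefix -> backend label.
# The /ai/ and /observe/ prefixes come from the requirements above;
# the backend labels are placeholders, not real service names.
ROUTES = {
    "/ai/": "ai-gateway",
    "/observe/": "observability-backend",
}


def pick_backend(path: str, routes: dict[str, str] = ROUTES) -> Optional[str]:
    """Return the backend for the longest matching path prefix, or None."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return None  # no rule matched; a real LB would return an error or use a default
    return routes[max(matches, key=len)]
```

A request for `/ai/v1/chat` would resolve to the `ai-gateway` label, while a path with no matching prefix resolves to `None`. Preferring the longest prefix keeps the table order-independent if more specific paths are added later.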
Possible architecture
Evolution
This approach still leaves room for the possibility of a dedicated CC service at some point in the future. For example, we could evolve the architecture as follows:
Today:
- route cloud.gitlab.com/observe/* directly to observe.gitlab.com to ingest logs and traces
- route cloud.gitlab.com/ai/* to the AI gateway
Later:
- keep intact
- route cloud.gitlab.com/* to the Cloud Connector service instead, which may decorate the request and then route /* to the final destination
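
The two stages can be sketched as ordered, first-match routing tables. This is one possible reading of the evolution, not a committed design: it assumes the existing direct routes stay in place and a catch-all rule sends all other traffic to the Cloud Connector service. The `Request` type, the `X-Decorated-By` header, the `/new-feature/` path, and all backend labels other than observe.gitlab.com are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Request:
    path: str
    headers: dict[str, str] = field(default_factory=dict)


# "Today": the load balancer routes straight to feature backends.
LB_ROUTES_TODAY = [
    ("/observe/", "observe.gitlab.com"),
    ("/ai/", "ai-gateway"),
]

# "Later" (assumed reading): direct routes remain, and a catch-all
# rule at the end sends everything else to the Cloud Connector service.
LB_ROUTES_LATER = LB_ROUTES_TODAY + [("/", "cloud-connector")]

# Hypothetical destinations that only the Cloud Connector service knows about.
CC_ROUTES = [("/new-feature/", "new-feature-backend")]


def match(routes: list[tuple[str, str]], path: str) -> Optional[str]:
    """Return the backend of the first rule whose prefix matches the path."""
    for prefix, backend in routes:
        if path.startswith(prefix):
            return backend
    return None


def cloud_connector(request: Request) -> Optional[str]:
    """Decorate the request, then pick the final destination."""
    request.headers["X-Decorated-By"] = "cloud-connector"  # illustrative decoration
    return match(CC_ROUTES, request.path)


def route(lb_routes: list[tuple[str, str]], request: Request) -> Optional[str]:
    """Resolve a request at the load balancer, delegating to CC when selected."""
    backend = match(lb_routes, request.path)
    if backend == "cloud-connector":
        return cloud_connector(request)
    return backend
```

With `LB_ROUTES_TODAY`, a request to `/new-feature/x` matches nothing. With `LB_ROUTES_LATER`, the same request falls through to the catch-all, is decorated by the Cloud Connector stand-in, and is forwarded to its final destination, while `/observe/*` and `/ai/*` traffic still bypasses it.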