
GitLab CI should be able to clone from a specific Geo secondary

Description

Some companies see heavy load on the primary node from CI clones, for numerous reasons. This gets even worse when a few projects are hotspots.

One possible use case for Geo is to act as a "mirror" that lowers the pressure automated builds put on the primary node.

Proposal

Users should be able to define, per project, that builds for that project clone from a specified Geo node instead of only from the primary.

Replication lag could be an issue here, because we could be trying to clone from a remote that is not yet in sync. We can mitigate this by checking for the existence of the target SHA and, if it is not found, falling back to fetching the missing objects from the primary.
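The fallback described above could look roughly like the following. This is only a sketch under assumptions, not how GitLab Runner does it: `fetch_with_fallback`, `run_git`, the `geo` and `origin` ref namespaces, and the URLs are hypothetical names chosen for illustration. It assumes the job already has a local repository, fetches branches from the secondary first, uses `git cat-file -e` to check whether the target commit arrived, and only then fetches from the primary, where Git's normal negotiation should transfer just the objects that are still missing.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes git arguments and returns the exit code.
# Injecting it keeps the decision logic testable without a real repository.
Runner = Callable[[Sequence[str]], int]


def run_git(args: Sequence[str]) -> int:
    """Run a git command in the current working directory."""
    return subprocess.run(["git", *args], check=False).returncode


def fetch_with_fallback(
    target_sha: str,
    secondary_url: str,
    primary_url: str,
    run: Runner = run_git,
) -> str:
    """Fetch from the Geo secondary, falling back to the primary on lag.

    Returns "secondary" if the target commit was available there,
    otherwise "primary".
    """
    # Hypothetical ref namespace for whatever the secondary has replicated.
    run(["fetch", secondary_url, "+refs/heads/*:refs/remotes/geo/*"])

    # Exit code 0 means the commit object now exists locally.
    if run(["cat-file", "-e", f"{target_sha}^{{commit}}"]) == 0:
        return "secondary"

    # The secondary is behind: fetch the remaining objects from the primary.
    run(["fetch", primary_url, "+refs/heads/*:refs/remotes/origin/*"])
    return "primary"
```

A build script could call `fetch_with_fallback(sha, secondary_url, primary_url)` before checking out `sha`. A real implementation would also need to handle a failed fetch from the primary and refs other than branches, such as tags or merge request refs.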

This should shift most of the heavy load to the secondary. In situations like GitLab's own build, only a few builds run at the start of the pipeline, so only those will potentially touch the primary. The next stage of the pipeline will probably be served 100% from the secondary, so this is a huge win.

Links / references

Documentation blurb

Overview

What is it? Why should someone use this feature? What is the underlying (business) problem? How do you use this feature?

Use cases

BigCorp runs lots of automated tests for a single project with many developers on the team. You don't want CI competing with developers for resources and slowing both down.

Feature checklist

Make sure these are completed before closing the issue, with a link to the relevant commit.

cc @stanhu @nick.thomas @dbalexandre @ayufan @markpundsack

Edited by 🤖 GitLab Bot 🤖