Generate resources following the default template - Rails
Summary
Generate resources following the default templates. This is a part of GitLab-managed Kubernetes resources (&16130).
Proposal
Here's a proposal:
- Support the resource_management.enabled field in the agent config. This will be a flag to enable the GitLab-managed cluster resources feature.
- Create a new class in lib/gitlab/ci/build/prerequisite/, inheriting from Gitlab::Ci::Build::Prerequisite::Base.
- Add the new class to the list of possible classes.
- Implement the #unmet? and #complete! methods on the new prerequisite class. Rails will send three requests to KAS: (1) retrieve the template, (2) render the template, and (3) apply the template.
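To make the three-request sequence concrete, here is a minimal sketch in plain Ruby. Everything in it is an assumption for illustration: FakeKasClient and its get_template / render_template / apply_template methods are hypothetical stand-ins (the real calls would be added to Kas::Client), and ManagedResourcesSketch is a placeholder name for a class that would actually inherit from Gitlab::Ci::Build::Prerequisite::Base. The in-memory @applied flag is only there to keep the sketch self-contained; a real implementation would need a persistent or KAS-backed check.

```ruby
# Hypothetical stand-in for the KAS client. The method names are assumptions,
# not existing Kas::Client calls.
class FakeKasClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  # (1) Retrieve the template.
  def get_template
    @calls << :get_template
    "namespace: %{project}-%{environment}"
  end

  # (2) Render the template with the given variables.
  def render_template(template:, variables:)
    @calls << :render_template
    format(template, variables)
  end

  # (3) Apply the rendered template.
  def apply_template(rendered:)
    @calls << :apply_template
    { status: :ok, applied: rendered }
  end
end

# Placeholder for the new prerequisite class. In GitLab it would inherit from
# Gitlab::Ci::Build::Prerequisite::Base; here it is plain Ruby.
class ManagedResourcesSketch
  def initialize(agent_config:, kas_client:, variables:)
    @agent_config = agent_config
    @kas_client = kas_client
    @variables = variables
    @applied = false # in-memory only, for the sake of the sketch
  end

  # True only when the feature is enabled and resources are not yet applied.
  def unmet?
    enabled? && !@applied
  end

  # Runs the three KAS requests in order.
  def complete!
    return unless unmet?

    template = @kas_client.get_template
    rendered = @kas_client.render_template(template: template, variables: @variables)
    result = @kas_client.apply_template(rendered: rendered)
    @applied = result[:status] == :ok
  end

  private

  # Reads the resource_management.enabled flag from the agent config hash.
  def enabled?
    @agent_config.dig(:resource_management, :enabled) == true
  end
end
```

With the flag missing or false, #unmet? returns false and no KAS request is made, which matches the idea of the flag gating the whole feature.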
We can leverage the same framework that the certificate-based managed resources used, which is known in the code as a Build Prerequisite. The general idea of a prerequisite is that it gives us an opportunity to perform some non-trivial work when a job is scheduled to start, right before it is picked up by a runner. When a job that has prerequisites is queued, a Sidekiq worker is started to perform whatever actions are required (and, being a worker, it can make long-running network requests, poll, and so on). The pipeline state machine holds the job in this state (known as 'preparing') until the worker is finished and marks the job as ready to be picked by a runner. The job and pipeline then continue as normal.
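The flow described above can be modelled roughly as follows. This is a simplified plain-Ruby model, not GitLab's actual state machine: JobSketch and SlowPrerequisite are hypothetical names, the prepare method stands in for the Sidekiq worker, and :pending is used here as an assumed name for "ready to be picked by a runner".

```ruby
# Simplified model of a job that is held in 'preparing' while prerequisites run.
class JobSketch
  attr_reader :status

  def initialize(prerequisites)
    @prerequisites = prerequisites
    @status = :created
  end

  # Called when the job is queued: hold it in :preparing only if some
  # prerequisite still has work to do.
  def enqueue
    @status = @prerequisites.any?(&:unmet?) ? :preparing : :pending
  end

  # Stands in for the Sidekiq worker: completes every unmet prerequisite,
  # then releases the job so a runner can pick it up.
  def prepare
    return unless @status == :preparing

    @prerequisites.select(&:unmet?).each(&:complete!)
    @status = :pending
  end
end

# Hypothetical prerequisite that is unmet until complete! has run once.
class SlowPrerequisite
  def initialize
    @done = false
  end

  def unmet?
    !@done
  end

  def complete!
    @done = true
  end
end
```

A job without unmet prerequisites skips :preparing entirely, so the extra worker hop is only paid when there is actual work to do.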
A good (and in fact the only) existing example of a prerequisite is KubernetesNamespace, where you will see the logic we currently use to create namespaces (and other resources) for certificate-based clusters.

To create a new prerequisite:
- Create a new class in lib/gitlab/ci/build/prerequisite/, inheriting from Gitlab::Ci::Build::Prerequisite::Base.
- Add the new class to the list of possible classes.
- Implement the #unmet? and #complete! methods on the new prerequisite class:
  - #unmet? is how we know if we need to perform any work. It should return true when we need to create resources, and false in every other case (not needed to begin with, or needed but already complete). This method is called synchronously in some web requests, so we cannot perform expensive or long-running logic here (I believe calls to KAS will be ok; however, we need to be careful as this has very high throughput).
  - #complete! performs the work (or waits for something else to do it). Long-running requests/actions are allowed here. In our case I'm imagining this will repeatedly poll KAS until it returns a success/completed status.

With a prerequisite in place and the extra calls added to Kas::Client, there should be no additional plumbing required (except for a feature flag) to get an initial implementation working. If the first iteration involves removing resources, we will need to use the environment state machine: after transitioning to stopped, notify KAS that it is time to remove associated resources.
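The "repeatedly poll KAS" idea for #complete! might look something like the sketch below. It is only an illustration under stated assumptions: PollingClientStub and its resource_status method are hypothetical (no such call exists on Kas::Client today), and the :pending / :completed / :failed statuses, the attempt limit, and the interval are made-up values.

```ruby
# Hypothetical stub that replays a fixed list of statuses, repeating the last one.
class PollingClientStub
  def initialize(responses)
    @responses = responses.dup
  end

  def resource_status
    @responses.length > 1 ? @responses.shift : @responses.first
  end
end

# Polls until the client reports :completed and returns the number of attempts.
# Raises on an explicit failure or once max_attempts is exhausted, so that a
# job is never held in 'preparing' forever.
def wait_until_complete(client, max_attempts: 10, interval: 0.0)
  max_attempts.times do |attempt|
    status = client.resource_status
    return attempt + 1 if status == :completed
    raise "KAS reported failure" if status == :failed

    sleep(interval)
  end

  raise "timed out after #{max_attempts} attempts"
end
```

Bounding the number of attempts is a design choice for the sketch: since the job waits in 'preparing' until the worker finishes, an unbounded loop would leave it stuck if KAS never reports completion.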