Spike: External CI jobs proof of concept

Description

Tracking issue for creating an external jobs proof of concept. The primary purpose of this spike is to assess the potential complexity of the feature, and uncover any potential blockers.

Definition of done

  • Working (unpolished) MR & local demo of external jobs - !133400 (closed) (see MR for screenshots and setup instructions)
  • Separate demo, or summary of how external jobs will work with our primary use case: waiting for Flux to sync an OCI artifact, and continuing when Flux sends a notification.
  • Summary of complexity/risk involved

Implementation notes

Points of interest:

  • The new state, running_externally, is straightforward to add in isolation. However, adding a state also requires supporting changes to logic in several models (build/bridge/stage/pipeline), and the possible side effects of some of those changes are difficult to predict.
  • Reusing Ci::Build (or GenericCommitStatus, which was also proposed in the epic) removed a lot of the setup cost. However, it means a large amount of inherited functionality is not applicable (and cannot be made applicable) to this type of job. It would be preferable to split external jobs into their own model/table.
  • Similarly, using Ci::Config::Entry::Processable as the basis for building CI objects provides a lot of useful functionality, but also some that is incompatible (the variables and when keywords). Again, we'd need to treat external jobs as a separate entity altogether, which would mean adding some extra abstraction.
  • Enforcing timeouts: Usually a timeout is enforced based on how long a runner has spent on a job, which is not applicable in this case. Instead, we need to set the started_at timestamp manually and, when the external job starts, schedule a worker that fails the job once the timeout is reached.
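
The timeout approach in the last point can be sketched in plain Ruby. This is only an illustration, not the code from the MR: ExternalJob, start_externally!, enforce_timeout! and the :external_timeout failure reason are hypothetical names, and the job is a simple Struct rather than an ActiveRecord model. The scheduled worker is represented by a direct call to enforce_timeout!.

```ruby
# Hypothetical, simplified stand-in for an external job record.
# This is NOT GitLab's actual model; all names here are assumptions.
ExternalJob = Struct.new(:status, :started_at, :timeout_seconds, :failure_reason) do
  # Move the job into running_externally and record the start time
  # manually, since no runner is involved to report it.
  def start_externally!(now)
    raise ArgumentError, "cannot start from #{status}" unless status == :pending

    self.status = :running_externally
    self.started_at = now
  end

  # Point in time after which the job counts as timed out.
  def deadline
    started_at + timeout_seconds
  end

  # What a scheduled timeout worker might do when it fires: fail the job
  # only if it is still waiting on the external system and the deadline
  # has passed. Returns true when the job was failed by this call.
  def enforce_timeout!(now)
    return false unless status == :running_externally
    return false if now < deadline

    self.status = :failed
    self.failure_reason = :external_timeout
    true
  end
end

job = ExternalJob.new(:pending, nil, 600, nil)
start = Time.at(0)
job.start_externally!(start)
job.enforce_timeout!(start + 599) # deadline not reached, job keeps waiting
job.enforce_timeout!(start + 600) # deadline reached, job is marked failed
```

Checking the status inside enforce_timeout! keeps the worker harmless if the external system has already completed the job before the timeout fires.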

Compatibility with Flux

  • Git repositories can work out of the box using a generic webhook notification. (The existing GitLab notification provider calls the CommitStatus API, which is separate from this feature. Perhaps in the future we can modify it or create a new provider.)
  • OCI images are more complicated, because the revision reported by Flux is the digest of the artifact itself, not a git revision. The git revision is present in the artifact metadata, but this is not sent as part of the notification payload (this feature has been requested in https://github.com/fluxcd/notification-controller/issues/195). So, we need a way to link the container digest with the git revision it was built from, which will allow retrieving the correct pipeline upon receiving a notification from Flux.
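
One possible shape for the digest-to-revision link described above is a lookup that is written when the pipeline publishes the artifact and read when the Flux notification arrives. The sketch below is an assumption-laden illustration: DigestRevisionIndex and its methods are hypothetical, storage is an in-memory Hash rather than a database table, and the handling of a "tag@sha256:..." revision string is an assumption about the notification format that would need to be verified against Flux.

```ruby
# Hypothetical in-memory index linking an OCI artifact digest to the git
# revision it was built from. A real implementation would need persistent
# storage; every name here is an assumption for illustration.
class DigestRevisionIndex
  def initialize
    @sha_by_digest = {}
  end

  # Assumed to be called when the pipeline publishes the OCI artifact.
  def record(digest:, git_sha:)
    @sha_by_digest[normalize(digest)] = git_sha
  end

  # Assumed to be called when a notification from Flux arrives.
  # Returns nil when the digest is unknown.
  def git_sha_for(revision)
    @sha_by_digest[normalize(revision)]
  end

  private

  # Assumption: the reported revision is either a bare digest
  # ("sha256:...") or a tag followed by "@" and the digest.
  def normalize(revision)
    revision.to_s.split("@").last.downcase
  end
end

index = DigestRevisionIndex.new
index.record(digest: "sha256:0a1b2c", git_sha: "4f9d2e7")
index.git_sha_for("latest@sha256:0a1b2c") # git revision recorded above
index.git_sha_for("sha256:ffffff")        # unknown digest, returns nil
```

With the git revision recovered, the matching pipeline could then be looked up the same way as for Git repository sources.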
Edited by Tiger Watson