Persist runner features to ci_runner_machines

To cancel jobs gracefully, we persist the runner's capabilities in the p_ci_builds_metadata table when we assign a job to a runner. This is costly because p_ci_builds_metadata stores a lot of data in TOAST and every update rewrites the entire row, which wastes database resources: https://gitlab.com/gitlab-com/gl-infra/capacity-planning-trackers/gitlab-com/-/issues/1601#note_1796319334

Ci::RunnerManager is already used to store the runner's configuration. We should extend this to also store the runner's features and use it to replace the p_ci_builds_metadata.runtime_runner_features column.

Proposal

  • Add a runtime_features column to ci_runner_machines.
  • Extend the heartbeat to persist the features.
  • Change the cancellation logic to read the features from ci_runner_machines. We don't need to backfill the values, but we must switch carefully so that jobs that are already running are not affected.
  • Drop the p_ci_builds_metadata.runtime_runner_features column in #534290 (closed)
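
The switch-over described in the proposal could look roughly like the sketch below. It is plain Ruby with hypothetical stand-in structs rather than the real ActiveRecord models, and the helper names (persist_features, runtime_features_for, supports_graceful_cancel?) and the :cancel_gracefully key are assumptions made for illustration. The idea is that the heartbeat writes the features to the runner manager, and the cancellation logic prefers that value but falls back to the value stored on the build metadata for jobs that were assigned before the switch.

```ruby
# Hypothetical stand-ins for Ci::RunnerManager, the builds metadata row and a build.
# Only the fields needed for this sketch are modelled.
RunnerManager = Struct.new(:runtime_features, keyword_init: true)
BuildMetadata = Struct.new(:runtime_runner_features, keyword_init: true)
Build = Struct.new(:runner_manager, :metadata, keyword_init: true)

# Heartbeat side: store the reported features, skipping the write when nothing changed.
def persist_features(manager, reported)
  return if reported.nil? || manager.runtime_features == reported

  manager.runtime_features = reported
end

# Cancellation side: prefer the runner manager, fall back to the legacy column
# so that jobs assigned before the switch keep their behaviour.
def runtime_features_for(build)
  from_manager = build.runner_manager&.runtime_features
  return from_manager unless from_manager.nil? || from_manager.empty?

  build.metadata&.runtime_runner_features || {}
end

def supports_graceful_cancel?(build)
  runtime_features_for(build)[:cancel_gracefully] == true
end
```

With a fallback like this, no backfill is needed: once every running job was assigned after the heartbeat change, the legacy read becomes dead code and the column can be dropped.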
Edited Apr 04, 2025 by Marius Bobin