kubernetes: add PodDisruptionBudget support for job pods

What does this MR do and why?

Adds optional PodDisruptionBudget (PDB) support for Kubernetes executor job pods to prevent voluntary evictions during node drains and cluster upgrades.

Problem

When CI jobs run on Kubernetes, node drains (during upgrades, autoscaling, or maintenance) can evict job pods mid-run, which fails the job. The Kubernetes executor currently has no protection against this.

Solution

When enabled, the executor creates a PodDisruptionBudget with minAvailable: 1 for each job pod. Because the PDB selects only that one pod, the Kubernetes eviction API rejects eviction requests for it during voluntary disruptions, while still allowing:

  • Pod termination when the job completes
  • Involuntary disruptions (node failures, OOM kills)
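
As a rough illustration of why minAvailable: 1 blocks eviction of a single pod, the sketch below models the budget arithmetic in a simplified form: the number of allowed disruptions is the count of healthy matching pods minus minAvailable, floored at zero. The function names are hypothetical and are not part of this MR or of Kubernetes. The real disruption controller also takes pod readiness and other conditions into account.

```go
package main

import "fmt"

// disruptionsAllowed is a simplified model of how a PDB with minAvailable
// is evaluated: healthy matching pods minus minAvailable, never below zero.
// (Hypothetical helper for illustration only.)
func disruptionsAllowed(healthyPods, minAvailable int) int {
	allowed := healthyPods - minAvailable
	if allowed < 0 {
		return 0
	}
	return allowed
}

// canEvict reports whether an eviction request would be admitted under
// this simplified model.
func canEvict(healthyPods, minAvailable int) bool {
	return disruptionsAllowed(healthyPods, minAvailable) > 0
}

func main() {
	// One job pod matched by its own PDB with minAvailable: 1.
	fmt.Println(canEvict(1, 1)) // false: evicting it would drop below the budget

	// For contrast, three matching pods with the same budget.
	fmt.Println(canEvict(3, 1)) // true
}
```

Deleting the pod directly (as the executor does when the job completes) does not go through the eviction API, so the budget does not apply to it.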

Configuration

[runners.kubernetes]
  pod_disruption_budget = true  # disabled by default

Or via environment variable:

KUBERNETES_POD_DISRUPTION_BUDGET=true

How it works

  1. After creating the job pod, if pod_disruption_budget is enabled, the executor creates a PDB with:

    • minAvailable: 1, which prevents eviction of the single job pod
    • A label selector matching the job pod's unique label (job.runner.gitlab.com/pod)
    • An ownerReference pointing to the pod, so the PDB is garbage-collected automatically
  2. When the pod is deleted, Kubernetes deletes the PDB automatically via the ownerReference.
  3. As a fallback, cleanupResources() deletes the PDB if the ownerReference wasn't set.
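
The steps above could be sketched roughly as follows. This is not the code in this MR: it uses only the Go standard library and hypothetical, trimmed-down structs in place of the real policy/v1 and meta/v1 types, and buildPDB and needsExplicitCleanup are illustrative names. It also assumes the ownerReference is skipped when the pod UID is unknown, which is one plausible reason the fallback cleanup would be needed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical, minimal stand-ins for the Kubernetes API types.
type ownerReference struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Name       string `json:"name"`
	UID        string `json:"uid"`
}

type metadata struct {
	Name            string           `json:"name"`
	Namespace       string           `json:"namespace"`
	OwnerReferences []ownerReference `json:"ownerReferences,omitempty"`
}

type labelSelector struct {
	MatchLabels map[string]string `json:"matchLabels"`
}

type pdbSpec struct {
	MinAvailable int           `json:"minAvailable"`
	Selector     labelSelector `json:"selector"`
}

type podDisruptionBudget struct {
	APIVersion string   `json:"apiVersion"`
	Kind       string   `json:"kind"`
	Metadata   metadata `json:"metadata"`
	Spec       pdbSpec  `json:"spec"`
}

// podLabel is the per-job label key named in this description.
const podLabel = "job.runner.gitlab.com/pod"

// buildPDB assembles a PDB for a single job pod (illustrative helper).
func buildPDB(podName, namespace, podUID, labelValue string) podDisruptionBudget {
	pdb := podDisruptionBudget{
		APIVersion: "policy/v1",
		Kind:       "PodDisruptionBudget",
		Metadata:   metadata{Name: podName + "-pdb", Namespace: namespace},
		Spec: pdbSpec{
			MinAvailable: 1,
			Selector:     labelSelector{MatchLabels: map[string]string{podLabel: labelValue}},
		},
	}
	// Assumption: the ownerReference is only attached when the pod UID is known.
	if podUID != "" {
		pdb.Metadata.OwnerReferences = []ownerReference{
			{APIVersion: "v1", Kind: "Pod", Name: podName, UID: podUID},
		}
	}
	return pdb
}

// needsExplicitCleanup reports whether the fallback path has to delete the
// PDB itself because garbage collection will not pick it up.
func needsExplicitCleanup(pdb podDisruptionBudget) bool {
	return len(pdb.Metadata.OwnerReferences) == 0
}

func main() {
	pdb := buildPDB("runner-abc123", "gitlab-runner", "example-uid", "example-job")
	out, err := json.MarshalIndent(pdb, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	fmt.Println("explicit cleanup needed:", needsExplicitCleanup(pdb))
}
```

In the actual executor the object would be built from the k8s.io/api types and created through the Kubernetes client; the plain structs here only keep the sketch self-contained.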

Example PDB created

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: runner-abc123-pdb
  namespace: gitlab-runner
  ownerReferences:
    - apiVersion: v1
      kind: Pod
      name: runner-abc123
      uid: <pod-uid>
spec:
  minAvailable: 1
  selector:
    matchLabels:
      job.runner.gitlab.com/pod: <job-unique-name>

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

  • I have evaluated the MR acceptance checklist for this MR.
  • Tests added for new functionality
  • Documentation updated
