System failure with FF_PRINT_POD_EVENTS

After upgrading to v16.5.0, I wanted to test the new FF_PRINT_POD_EVENTS feature flag, but with it enabled the job fails with a system failure right after the runner subscribes to Pod events:

Running with gitlab-runner 16.5.0 (853330f9)
  on DevOps Kubernetes WestUS2 Azure - AMD64 78GLy9cD6, system ID: r_JA1KverkOlMn
  feature flags: FF_RETRIEVE_POD_WARNING_EVENTS:true, FF_PRINT_POD_EVENTS:true
Resolving secrets 00:00
Preparing the "kubernetes" executor 00:00
Using Kubernetes namespace: default
Using Kubernetes executor with image debian:stable-slim ...
Using attach strategy to execute scripts...
Preparing environment 00:01
Using FF_USE_POD_ACTIVE_DEADLINE_SECONDS, the Pod activeDeadlineSeconds will be set to the job timeout: 1h0m0s...
Subscribing to Kubernetes Pod events...
ERROR: Job failed (system failure): prepare environment: unknown (get events). Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information

Some errors I noticed in the Kubernetes events for the job pod:

3m44s                   Normal    Scheduled           Pod/runner-78gly9cd6-project-33599309-concurrent-0-ugrjqtet   Successfully assigned indigo-k8s-amd64-test-runner/runner-78gly9cd6-project-33599309-concurrent-0-ugrjqtet to aks-amd64runner-37812845-vmss000a3k
2m40s (x8 over 3m44s)   Warning   FailedMount         Pod/runner-78gly9cd6-project-33599309-concurrent-0-ugrjqtet   MountVolume.SetUp failed for volume "kube-api-access-2ngj2" : failed to fetch token: pod "runner-78gly9cd6-project-33599309-concurrent-0-ugrjqtet" not found
101s                    Warning   FailedMount         Pod/runner-78gly9cd6-project-33599309-concurrent-0-ugrjqtet   Unable to attach or mount volumes: unmounted volumes=[kube-api-access-2ngj2], unattached volumes=[scripts logs docker-socket repo kube-api-access-2ngj2]: timed out waiting for the condition

values.yaml

# Defaults from https://gitlab.com/gitlab-org/charts/gitlab-runner/blob/main/values.yaml

image:
  registry: registry.gitlab.com
  image: gitlab-org/gitlab-runner
  tag: ubuntu-v16.5.0
imagePullPolicy: Always

gitlabUrl: https://gitlab.com/
checkInterval: 3
concurrent: 360
unregisterRunners: true
terminationGracePeriodSeconds: 3600

metrics:
  enabled: true
  portName: metrics
  port: 9252

service:
  enabled: true

nodeSelector:
  kubernetes.azure.com/mode: "system"
  kubernetes.io/arch: "amd64"

tolerations:
  - key: "CriticalAddonsOnly"
    operator: "Exists"

rbac:
  create: true
  rules:
    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["list", "get", "watch", "create", "delete"]
    - apiGroups: [""]
      resources: ["pods/exec"]
      verbs: ["create"]
    - apiGroups: [""]
      resources: ["pods/log"]
      verbs: ["get"]
    - apiGroups: [""]
      resources: ["pods/attach"]
      verbs: ["list", "get", "create", "delete", "update"]
    - apiGroups: [""]
      resources: ["secrets"]
      verbs: ["list", "get", "create", "delete", "update"]
    - apiGroups: [""]
      resources: ["configmaps"]
      verbs: ["list", "get", "create", "delete", "update"]
    - apiGroups: [""]
      resources: ["events"]
      verbs: ["list"]

podSecurityContext:
  runAsUser: 999
  fsGroup: 999

# Workaround cache errors https://gitlab.com/gitlab-org/gitlab-runner/-/issues/3802
preEntrypointScript: |
  sed -i '/\[runners.cache.gcs\]/d' /home/gitlab-runner/.gitlab-runner/config.toml
  sed -i '/\[runners.cache.azure\]/d' /home/gitlab-runner/.gitlab-runner/config.toml

runners:
  cache:
    secretName: s3access
  secret: gitlab-token
  tags: "small-amd64-k8s-uswest2-azure,indigo-k8s-small-amd64"
  name: "DevOps Kubernetes WestUS2 Azure - AMD64"
  config: |
    [[runners]]
      pre_build_script = '''
         # If docker CLI exists wait for dockerd to start
         # Docker is accessed via unix socket on k8s runners
         unset DOCKER_HOST
         unset DOCKER_CERT_PATH
         unset DOCKER_TLS_VERIFY
         if command -v docker > /dev/null 2>&1; then
           i=1; while [ $i -le 10 ]; do
             echo "docker command found, waiting for dockerd service $i/10..."
             docker version > /dev/null 2>&1 && break
             sleep 1
             if [ $i -eq 10 ]; then
               echo "WARNING docker cli detected but dockerd service not found, continuing build..."
             fi
             i=$(( i + 1 ))
           done
         fi
      '''
      [runners.feature_flags]
        # Retrieve Pod warnings on job failure
        FF_PRINT_POD_EVENTS = true
        FF_RETRIEVE_POD_WARNING_EVENTS = true
      [runners.cache]
        Type = "s3"
        Path = ""
        Shared = false
        [runners.cache.s3]
          ServerAddress = "detoolsminio.minio:9000"
          BucketName = "gitlab-cache"
          Insecure = true
          BucketLocation = "none"
      [runners.kubernetes]
        namespace = "{{.Release.Namespace}}"
        poll_timeout = 1800
        # Default image if none is specified in .gitlab-ci.yml
        image = "debian:stable-slim"
        helper_image = "registry.gitlab.com/gitlab-org/gitlab-runner/gitlab-runner-helper:x86_64-v${CI_RUNNER_VERSION}"
        # Required for docker to create containers
        privileged = true
        allow_privilege_escalation = true
        pull_policy = "if-not-present"
        # Requests fit 2 jobs/node without exceeding 95% node utilization
        cpu_request = "1700m"
        cpu_limit = "1700m"
        memory_request = "5700Mi"
        memory_limit = "5700Mi"
        # Docker-in-Docker 1 job/node
        service_cpu_request = "1700m"
        service_cpu_limit = "3400m"
        service_memory_request = "5700Mi"
        service_memory_limit = "11400Mi"
      [runners.kubernetes.node_tolerations]
        "gitlab-runner=true" = "NoSchedule"
      [runners.kubernetes.node_selector]
        "kubernetes.azure.com/agentpool" = "amd64runner"
        "kubernetes.io/arch" = "amd64"
      [[runners.kubernetes.host_aliases]]
      # Workaround for a race condition where DNS does not resolve during scale-up
        ip = "172.65.251.78"
        hostnames = ["gitlab.com"]
      [[runners.kubernetes.volumes.empty_dir]]
        # Makes /var/run/docker.sock available to all containers in pod
        name = "docker-socket"
        mount_path = "/var/run/"
        path = "/var/run/"
        medium = "Memory"
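
A possible cause worth checking (an assumption, not verified here): the `events` rule in the `rbac` section above grants only the `list` verb. The "Subscribing to Kubernetes Pod events..." log line suggests that FF_PRINT_POD_EVENTS opens a watch on events, in which case the runner's service account would presumably also need `watch`. A minimal sketch of the adjusted rule, where the added verb is the assumption and everything else is taken from the values above:

```yaml
rbac:
  create: true
  rules:
    # ... other rules unchanged ...
    - apiGroups: [""]
      resources: ["events"]
      # "list" was already granted; "watch" is the assumed addition
      verbs: ["list", "watch"]
```

If the failure persists with this rule in place, the missing permission is not the cause and the "unknown (get events)" error has to come from somewhere else.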