Garbage collection support for the Kubernetes executor

Romuald Atchadé requested to merge kube-executor-resources-clean-up into main

What does this MR do?

This MR adds garbage collection support to the Kubernetes executor. It does so by establishing an owner-dependent relationship between the pod and its associated resources, with the pod as the owner.
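As a rough illustration of that owner-dependent relationship, the sketch below builds the metadata of a secret that names the job pod as its owner. The `OwnerReference` and `ObjectMeta` structs are hand-written stand-ins that mirror the `metadata.ownerReferences` fields of the Kubernetes API; they are not the client-go types, and the helper `ownedBy` and the UID value are hypothetical. It is not taken from this MR's diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OwnerReference mirrors the core fields Kubernetes stores under
// metadata.ownerReferences. Hand-written for illustration only.
type OwnerReference struct {
	APIVersion string `json:"apiVersion"`
	Kind       string `json:"kind"`
	Name       string `json:"name"`
	UID        string `json:"uid"`
}

// ObjectMeta holds only the metadata fields this sketch needs.
type ObjectMeta struct {
	Name            string           `json:"name"`
	Namespace       string           `json:"namespace"`
	UID             string           `json:"uid,omitempty"`
	OwnerReferences []OwnerReference `json:"ownerReferences,omitempty"`
}

// ownedBy (hypothetical helper) returns a copy of dependent that lists
// the given pod as an owner. The owner's UID must already be known,
// which means the pod has to exist before the reference can be set.
func ownedBy(dependent, pod ObjectMeta) ObjectMeta {
	refs := append([]OwnerReference(nil), dependent.OwnerReferences...)
	refs = append(refs, OwnerReference{
		APIVersion: "v1",
		Kind:       "Pod",
		Name:       pod.Name,
		UID:        pod.UID,
	})
	dependent.OwnerReferences = refs
	return dependent
}

func main() {
	pod := ObjectMeta{
		Name:      "runner-lr33aybb-project-24422682-concurrent-0kt6tk",
		Namespace: "default",
		UID:       "hypothetical-pod-uid", // assumption: real UIDs are assigned by the API server
	}
	secret := ownedBy(ObjectMeta{
		Name:      "runner-lr33aybb-project-24422682-concurrent-0tw7bq",
		Namespace: "default",
	}, pod)

	out, err := json.MarshalIndent(secret, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A namespaced dependent has to live in the same namespace as its owner, which is why both objects use `default` here.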

Why was this MR needed?

It ensures that the resources associated with a pod are cleaned up when that pod is deleted.

What's the best way to test this MR?

Test configurations

config.toml
[[runners]]
  name = "kubernetes"
  url = "https://gitlab.com/"
  token = "YOUR_TOKEN_HERE"
  executor = "kubernetes"
  [runners.kubernetes]
    image = "alpine:3.11"

Two .gitlab-ci.yml files are used for the tests. The test steps are the same for both.

.gitlab-ci.yml with kubernetes legacy execution
variables:
  DURATION: 600
  FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY: "true"

job:
  script:
  - 'for i in $(seq 1 $DURATION); do echo $(date); sleep 1; done'
  - echo "done"

.gitlab-ci.yml without kubernetes legacy execution
variables:
  DURATION: 600

job:
  script:
  - 'for i in $(seq 1 $DURATION); do echo $(date); sleep 1; done'
  - echo "done"

Test steps for pod deletion during job

Follow the steps below for each .gitlab-ci.yml configuration:

  1. Start gitlab-runner built from the kube-executor-resources-clean-up branch

  2. Start a job

  3. Display the list of created pods:

    $ kubectl get pods --namespace default
    NAME                                                 READY   STATUS    RESTARTS   AGE
    runner-lr33aybb-project-24422682-concurrent-0kt6tk   2/2     Running   0          12s

    Note: The pod name can be verified in the job log on gitlab.com

  4. Display the list of resources (for our test case: secrets and configMaps)

    $ kubectl get secrets,configMaps --namespace default
    NAME                                                        TYPE                                  DATA   AGE
    secret/default-token-xkhjm                                  kubernetes.io/service-account-token   3      29h
    secret/runner-lr33aybb-project-24422682-concurrent-0tw7bq   kubernetes.io/dockercfg               1      112s
    
    NAME                                                                   DATA   AGE
    configmap/kube-root-ca.crt                                             1      29h
    configmap/runner-lr33aybb-project-24422682-concurrent-0-scripts85xqn   5      112s

    Note: No runner configMap is displayed when FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY = "true"

  5. Delete the pod

    $ kubectl delete pod runner-lr33aybb-project-24422682-concurrent-0kt6tk --namespace default 
    pod "runner-lr33aybb-project-24422682-concurrent-0kt6tk" deleted

  6. Check whether the resources are still available. Only secret/default-token-xkhjm and configmap/kube-root-ca.crt should be left:

    $ kubectl get secrets,configMaps --namespace default                                        
    NAME                         TYPE                                  DATA   AGE
    secret/default-token-xkhjm   kubernetes.io/service-account-token   3      29h
    
    NAME                         DATA   AGE
    configmap/kube-root-ca.crt   1      29h

    Note: The job fails with the following error:

Job Failure - FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY: false

Log

Cleaning up file based variables
ERROR: Job failed (system failure): pods "runner-lr33aybb-project-24422682-concurrent-05mf5p" not found

Job Failure - FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY: true

Log

Cleaning up file based variables
ERROR: Error cleaning up pod: pods "runner-lr33aybb-project-24422682-concurrent-0tzq8t" not found
ERROR: Job failed: command terminated with exit code 137


Test steps with the kube-executor-resources-clean-up branch

Follow the steps below for each .gitlab-ci.yml configuration:

  1. Start gitlab-runner built from the kube-executor-resources-clean-up branch

  2. Start a job

  3. Display the list of created pods:

    $ kubectl get pods --namespace default
    NAME                                                 READY   STATUS    RESTARTS   AGE
    runner-lr33aybb-project-24422682-concurrent-0kt6tk   2/2     Running   0          12s

    Note: The pod name can be verified in the job log on gitlab.com

  4. Display the list of resources (for our test case: secrets and configMaps)

    $ kubectl get secrets,configMaps --namespace default
    NAME                                                        TYPE                                  DATA   AGE
    secret/default-token-xkhjm                                  kubernetes.io/service-account-token   3      29h
    secret/runner-lr33aybb-project-24422682-concurrent-0tw7bq   kubernetes.io/dockercfg               1      112s
    
    NAME                                                                   DATA   AGE
    configmap/kube-root-ca.crt                                             1      29h
    configmap/runner-lr33aybb-project-24422682-concurrent-0-scripts85xqn   5      112s

    Note: No runner configMap is displayed when FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY = "true"

  5. Stop gitlab-runner

    Verify that the pod and its resources are still available:

    $ kubectl get pods --namespace default && kubectl get secrets,configMaps --namespace default
    NAME                                                 READY   STATUS    RESTARTS   AGE
    runner-lr33aybb-project-24422682-concurrent-0kt6tk   2/2     Running   0          4m30s
    
    NAME                                                        TYPE                                  DATA   AGE
    secret/default-token-xkhjm                                  kubernetes.io/service-account-token   3      29h
    secret/runner-lr33aybb-project-24422682-concurrent-0tw7bq   kubernetes.io/dockercfg               1      4m46s
        
    NAME                                                                   DATA   AGE
    configmap/kube-root-ca.crt                                             1      29h
    configmap/runner-lr33aybb-project-24422682-concurrent-0-scripts85xqn   5      4m46s

  6. Delete the pod

    $ kubectl delete pod runner-lr33aybb-project-24422682-concurrent-0kt6tk --namespace default 
    pod "runner-lr33aybb-project-24422682-concurrent-0kt6tk" deleted

  7. Check whether the resources are still available. Only secret/default-token-xkhjm and configmap/kube-root-ca.crt should be left:

    $ kubectl get secrets,configMaps --namespace default                                        
    NAME                         TYPE                                  DATA   AGE
    secret/default-token-xkhjm   kubernetes.io/service-account-token   3      29h
    
    NAME                         DATA   AGE
    configmap/kube-root-ca.crt   1      29h
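The cleanup seen in step 7 can be modelled with a small simulation. This is a simplified sketch of cascading deletion, not the Kubernetes garbage collector itself: the `Object` type, the `collect` function, and all UID values are hypothetical, and the model only encodes the rule that an object with owners is removed once none of its owners exist.

```go
package main

import "fmt"

// Object is a simplified stand-in for a namespaced Kubernetes object.
type Object struct {
	Name      string
	UID       string
	OwnerUIDs []string // UIDs listed in metadata.ownerReferences
}

// collect returns the objects that survive after the object with
// deletedUID is removed. Objects without owners are never collected;
// objects with owners are dropped once all of their owners are gone.
func collect(objects []Object, deletedUID string) []Object {
	alive := make(map[string]bool)
	for _, o := range objects {
		if o.UID != deletedUID {
			alive[o.UID] = true
		}
	}
	// Repeat until nothing changes so that chains of owners are handled.
	for changed := true; changed; {
		changed = false
		for _, o := range objects {
			if !alive[o.UID] || len(o.OwnerUIDs) == 0 {
				continue
			}
			hasOwner := false
			for _, uid := range o.OwnerUIDs {
				if alive[uid] {
					hasOwner = true
					break
				}
			}
			if !hasOwner {
				delete(alive, o.UID)
				changed = true
			}
		}
	}
	var kept []Object
	for _, o := range objects {
		if alive[o.UID] {
			kept = append(kept, o)
		}
	}
	return kept
}

func main() {
	// Names come from the listings above; the UIDs are made up.
	objects := []Object{
		{Name: "pod/runner-lr33aybb-project-24422682-concurrent-0kt6tk", UID: "pod-uid"},
		{Name: "secret/default-token-xkhjm", UID: "token-uid"},
		{Name: "secret/runner-lr33aybb-project-24422682-concurrent-0tw7bq", UID: "secret-uid", OwnerUIDs: []string{"pod-uid"}},
		{Name: "configmap/kube-root-ca.crt", UID: "ca-uid"},
		{Name: "configmap/runner-lr33aybb-project-24422682-concurrent-0-scripts85xqn", UID: "cm-uid", OwnerUIDs: []string{"pod-uid"}},
	}
	for _, o := range collect(objects, "pod-uid") {
		fmt.Println(o.Name)
	}
}
```

Removing the `OwnerUIDs` entries from the runner secret and configMap reproduces the behaviour without owner references: both objects survive the pod deletion.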

Test steps with the main branch

Repeat steps 1 to 6 with the main branch; the results should be similar.

  7. Check whether the resources are still available. The resources listed in step 5 should still be present:

    $ kubectl get secrets,configMaps --namespace default                                        
    NAME                                                        TYPE                                  DATA   AGE
    secret/default-token-xkhjm                                  kubernetes.io/service-account-token   3      29h
    secret/runner-lr33aybb-project-24422682-concurrent-0tw7bq   kubernetes.io/dockercfg               1      4m46s
    
    NAME                                                                   DATA   AGE
    configmap/kube-root-ca.crt                                             1      29h
    configmap/runner-lr33aybb-project-24422682-concurrent-0-scripts85xqn   5      4m46s
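To compare the two branches without reading the tables by eye, the `kubectl get secrets,configMaps` output can be scanned for leftover runner resources. The sketch below is one possible way to do that; the function `leftoverRunnerResources` is hypothetical, and the `runner-` prefix is an assumption based on the names shown in the listings.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// leftoverRunnerResources scans the table printed by
// `kubectl get secrets,configMaps` and returns every "kind/name" entry
// whose name starts with prefix.
func leftoverRunnerResources(output, prefix string) []string {
	var names []string
	scanner := bufio.NewScanner(strings.NewReader(output))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) == 0 || fields[0] == "NAME" {
			continue // skip blank lines and table headers
		}
		// Entries look like "secret/<name>" or "configmap/<name>".
		parts := strings.SplitN(fields[0], "/", 2)
		if len(parts) == 2 && strings.HasPrefix(parts[1], prefix) {
			names = append(names, fields[0])
		}
	}
	return names
}

func main() {
	// Output copied from the main-branch check above.
	output := `NAME                                                        TYPE                                  DATA   AGE
secret/default-token-xkhjm                                  kubernetes.io/service-account-token   3      29h
secret/runner-lr33aybb-project-24422682-concurrent-0tw7bq   kubernetes.io/dockercfg               1      4m46s

NAME                                                                   DATA   AGE
configmap/kube-root-ca.crt                                             1      29h
configmap/runner-lr33aybb-project-24422682-concurrent-0-scripts85xqn   5      4m46s
`
	for _, name := range leftoverRunnerResources(output, "runner-") {
		fmt.Println(name)
	}
}
```

An empty result after the pod is deleted matches the expected behaviour of the kube-executor-resources-clean-up branch; a non-empty result matches what the main branch shows.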

What are the relevant issue numbers?

Closes #4184 (closed)

Edited by Romuald Atchadé
