Add private registry support for operational container scanning

Summary

Customers using Operational Container Scanning (OCS) from %16.0 onwards do not have their private images scanned.

This is not easily detectable, since the error only appears in the logs of the Trivy scanner pod, which OCS parses to retrieve the vulnerability report. After parsing, OCS deletes the Trivy scanner pod.

This bug was introduced by the work to migrate from the Starboard operator to trivy k8s. The switch was necessary because Starboard is being discontinued. The MR was merged on 27 April 2023 and released as part of the %16.0 milestone. Customers using OCS versions prior to this should not face this issue.

Starboard operator uses the secret associated with the workload to pull private images. Trivy k8s code does not seem to have a similar implementation. I've also manually tested this to confirm that private images are failing to be scanned with the latest version of OCS.

Implementation Plan

A concern has been raised for Plan A that registry credentials could be leaked if private images being scanned are from different private registries. As such we need to explore Plan B as an alternative.

Plan B: Use trivy image CLI and trivy Env Var to scan images individually

  1. Retrieve all workloads to be scanned in the target namespaces
  2. For each workload:
    1. Retrieve the ImagePullSecret or service account ImagePullSecret if it's present and parse secret to retrieve private registry's username and password
    2. Create a secret in gitlab agent namespace containing the username and password obtained above
    3. Start a Pod that runs the trivy image command to scan the workload.
      1. If secret present, define TRIVY_PASSWORD and TRIVY_USERNAME env vars referencing the respective keys in the secret created above
    4. Delete secret once scan completes
  3. Consolidate vulnerabilities to send to rails monolith
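
Step 2.1 of Plan B depends on reading a username and password out of an image pull secret. As a rough sketch only, assuming the secret is of type kubernetes.io/dockerconfigjson and that its .dockerconfigjson value has already been fetched and decoded, the parsing could look like the following. The function registryCredentials and the type names are hypothetical and are not taken from the OCS code.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// dockerConfig mirrors the layout of the .dockerconfigjson value stored in a
// kubernetes.io/dockerconfigjson secret.
type dockerConfig struct {
	Auths map[string]dockerAuth `json:"auths"`
}

type dockerAuth struct {
	Username string `json:"username"`
	Password string `json:"password"`
	Auth     string `json:"auth"` // base64 of "username:password"
}

// registryCredentials is a hypothetical helper. It returns the username and
// password stored for one registry host in the decoded secret value.
func registryCredentials(raw []byte, registry string) (string, string, error) {
	var cfg dockerConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return "", "", fmt.Errorf("parsing docker config: %w", err)
	}
	entry, ok := cfg.Auths[registry]
	if !ok {
		return "", "", fmt.Errorf("no credentials for registry %q", registry)
	}
	// Prefer the explicit fields when both are present.
	if entry.Username != "" && entry.Password != "" {
		return entry.Username, entry.Password, nil
	}
	// Otherwise fall back to the combined auth field.
	decoded, err := base64.StdEncoding.DecodeString(entry.Auth)
	if err != nil {
		return "", "", fmt.Errorf("decoding auth field: %w", err)
	}
	user, pass, found := strings.Cut(string(decoded), ":")
	if !found {
		return "", "", fmt.Errorf("auth field is not in username:password form")
	}
	return user, pass, nil
}

func main() {
	// Placeholder values; a real secret would come from the Kubernetes API.
	raw := []byte(`{"auths":{"registry.gitlab.com":{"username":"alice","password":"token123"}}}`)
	user, pass, err := registryCredentials(raw, "registry.gitlab.com")
	if err != nil {
		panic(err)
	}
	fmt.Printf("username=%s password length=%d\n", user, len(pass))
}
```

Looking credentials up by registry host keeps each username and password tied to the registry they were issued for, which is the property Plan B relies on when it scans images one at a time.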

Plan A: Use trivy k8s CLI and trivy Env Var to scan images in namespace

  1. For each target namespace:
    1. For each workload in the namespace:
      1. Retrieve the ImagePullSecret or service account ImagePullSecret if it's present and parse secret to retrieve private registry's username and password
    2. Create a secret in the gitlab agent namespace containing the usernames and passwords obtained above
      1. If there are multiple registry credentials, use a comma (,) to separate them as described in Trivy's docs.
    3. Start a Pod that runs the trivy k8s command to scan the workloads in the namespace.
      1. If secret present, define TRIVY_PASSWORD and TRIVY_USERNAME env vars referencing the respective keys in the secret created above
    4. Delete secret once scan completes
  2. Consolidate vulnerabilities to send to rails monolith
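
Step 1.2.1 of Plan A joins several credentials into one pair of values. A minimal sketch of that joining step is shown below. The credential type and the trivyEnvValues helper are hypothetical, and rejecting values that contain a comma is my own assumption, since such a value could not be told apart from a separator.

```go
package main

import (
	"fmt"
	"strings"
)

// credential is a hypothetical holder for one registry login.
type credential struct {
	Username string
	Password string
}

// trivyEnvValues joins the credentials into two comma-separated strings,
// keeping usernames and passwords in matching order and skipping duplicates.
func trivyEnvValues(creds []credential) (usernames, passwords string, err error) {
	seen := make(map[credential]bool)
	var users, passes []string
	for _, c := range creds {
		// Assumption: a comma inside a value would be read as a separator.
		if strings.Contains(c.Username, ",") || strings.Contains(c.Password, ",") {
			return "", "", fmt.Errorf("credential for %q contains a comma and cannot be joined", c.Username)
		}
		if seen[c] {
			continue
		}
		seen[c] = true
		users = append(users, c.Username)
		passes = append(passes, c.Password)
	}
	return strings.Join(users, ","), strings.Join(passes, ","), nil
}

func main() {
	users, passes, err := trivyEnvValues([]credential{
		{Username: "alice", Password: "token-a"},
		{Username: "bob", Password: "token-b"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("TRIVY_USERNAME =", users)
	fmt.Println("TRIVY_PASSWORD has", len(strings.Split(passes, ",")), "entries")
}
```

Note that the joined values carry no registry names, so nothing in them ties a credential to the registry it belongs to. That is consistent with the concern raised about Plan A when images come from different private registries.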

POC of Implementation Plan A:

I created a spike that references the hardcoded secret my-secret-2 in the podspec. To validate that Implementation Plan A works, I did the following:

  1. Build an image of the latest Trivy that contains the code that adds CLI flags to the trivy k8s command and push it to a project container registry. I pushed mine to registry.gitlab.com/smtan/ocs_trivy/trivy:latest
  2. Create a GitLab access token with scopes write_registry and read_registry
  3. Create a k8s secret in my cluster with the access token from above.
    1. kubectl create secret generic my-secret-2 --from-literal=username='<gitlab_username>' --from-literal=password='<access_token>' -n <agent_namespace>
  4. Create a container that uses a private image in my cluster. See steps to reproduce
  5. Run the gitlab-agent spike project from the spike-ocs-private-registry branch
  6. Validate that vulnerabilities are created

Steps to reproduce

  1. Create an access token with scopes write_registry and read_registry

  2. Create a private GitLab project and push an image to its container registry

  3. Create a Kubernetes secret in your cluster using your GitLab username and the access token created above

    kubectl create secret docker-registry my-secret --docker-server=registry.gitlab.com --docker-username=<gitlab_username> --docker-password=<access_token> --docker-email=<email>
  4. Create a pod with the private registry image from step 2 and the secret created in step 3

    apiVersion: v1
    kind: Pod
    metadata:
      name: pingpong
    spec:
      containers:
        - name: pingpong
          image: registry.gitlab.com/smtan/ocs_test_private_registry/alpine:3.11.3
          command: ["ping", "1.1.1.1"]
          imagePullPolicy: Always
      imagePullSecrets:
        - name: my-secret
    
  5. Start an OCS scan targeting the namespace where the pod was created

  6. Comment out the delete pod code in OCS to prevent the Trivy scanner pod from being deleted after the scan is complete.

  7. Print the logs of the Trivy scanner pod to see the error

    Screenshot 2023-06-15 at 2.12.36 PM.png

Possible fixes

Trivy's docs indicate that it's possible to use the TRIVY_USERNAME and TRIVY_PASSWORD ENV variables or the --username and --password CLI flags to pass in private registry credentials. It looks like the flags are implemented in registry_flag.go and that they support multiple sets of credentials.

When I tested, the trivy k8s command in the latest version of Trivy (0.42.1) did not seem to support these ENV variables or CLI flags yet. However, when I manually tested the CLI flags with the trivy image command, it was able to scan the private image that I created above.
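
For reference, a single-image scan with credentials could be prepared from Go roughly as follows. This is only a sketch: it assumes a trivy binary on the PATH and assumes that TRIVY_USERNAME and TRIVY_PASSWORD behave for trivy image as the docs referenced above describe. The helper trivyImageCommand is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// trivyImageCommand prepares (but does not start) a "trivy image" scan.
// Credentials are passed through the environment rather than as CLI flags,
// so they do not appear in the argument list.
func trivyImageCommand(image, username, password string) *exec.Cmd {
	cmd := exec.Command("trivy", "image", image)
	cmd.Env = os.Environ()
	if username != "" && password != "" {
		cmd.Env = append(cmd.Env,
			"TRIVY_USERNAME="+username,
			"TRIVY_PASSWORD="+password,
		)
	}
	return cmd
}

func main() {
	cmd := trivyImageCommand(
		"registry.gitlab.com/smtan/ocs_test_private_registry/alpine:3.11.3",
		"<gitlab_username>",
		"<access_token>",
	)
	// Only the arguments are printed. Calling cmd.Run() would start the scan;
	// it is left out so the sketch runs without Trivy installed.
	fmt.Println(cmd.Args)
}
```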

I also found this PR merged 3 days ago that adds CLI flags to the trivy k8s command, but it has yet to be built as part of the latest Trivy image.

Based on these details I can think of the following fixes:

Approach 1:

  • For each pod, check if there's an associated secret and use an equivalent of kubectl get secrets in go to retrieve the username and password
  • If there are multiple pods with different secrets, append them together
  • Pass the username and password as CLI flags in the Trivy scanner podspec in OCS
  • Note that I have not yet tested whether this works
    • I plan to build an image with the latest Trivy code to test this approach tomorrow.
  • Pros
    • This approach does not require any additional configuration from the user
  • Cons(potential)
    • I'm unsure if this breaks any Kubernetes convention?

Approach 2:

  • Expose username and password ENV variables to the user for them to pass in through the gitlab agent helm chart.
  • Append the username and password as ENV variables in the Trivy scanner podspec in OCS
  • Note that I have not yet tested whether this works. I can confirm this once Trivy publishes the latest image or when I figure out how to build an image from source for use.
  • Pros
    • Customer has direct control
  • Cons
    • Customers would need to manually configure the credentials via env var

Striking out Approaches 1 and 2 since they leak the credentials to anyone who can see the pod object.

cc @smeadzinger @gonzoyumo

Edited by Sara Meadzinger