Add workaround in Container Scanning to allow us to update Trivy without first downloading java-db

Problem to solve

Container Scanning currently uses Trivy v0.36.1, but the most recent version of Trivy is v0.39.0.

Unfortunately, we're currently blocked from upgrading Container Scanning to a more recent version of Trivy, because Trivy >= v0.37.0 includes a new feature that automatically downloads a Java DB when generating an SBOM. This causes problems in an offline environment:

  1. If we attempt to use Trivy in an offline environment, an error is returned:

    ERROR	Unable to initialize the Java DB: Java DB update failed: Java DB update error: oci error: OCI repository error: Get "https://ghcr.io/v2/": dial tcp: lookup ghcr.io on 192.168.65.5:53: write udp 172.17.0.3:41561->192.168.65.5:53: write: operation not permitted
  2. We don't want to include the java-db in the container-scanning image because it adds 679M to the image size.

  3. We can't skip updating the java-db by passing --skip-java-db-update, because an error is returned:

    ERROR	The first run cannot skip downloading Java DB

We need to solve this issue in order to upgrade to more recent versions of Trivy.

Background details

I created the following bug report in the upstream Trivy project: Can't use Trivy v0.38.0 in offline environment without first fetching java-db #3980. However, it seems that this is expected behaviour, so the bug was closed and a feature request was created instead: Add ability to disable JAR scanning #3987.

There are currently three different scenarios that trigger a download of the java-db and cause an error in an offline environment:

  1. When CS_DISABLE_DEPENDENCY_LIST is false (the default setting).
  2. When CS_DISABLE_LANGUAGE_VULNERABILITY_SCAN is false (default is true).
  3. When generating an SBOM.

We need to make sure that we come up with an approach that works in an offline environment for all three of the above cases.
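
To make the three cases easier to reason about (and to enumerate in offline tests), here is a minimal Ruby sketch that reports which scenarios would trigger a java-db download for a given set of variables. The `JavaDbTriggers` module, its method names, and the `generating_sbom:` argument are hypothetical and not part of the analyzer; the sketch also assumes the boolean variables are written as the strings "true" and "false".

```ruby
# Hypothetical helper, not part of the analyzer: lists which of the three
# scenarios above would make Trivy try to fetch the java-db.
module JavaDbTriggers
  # Assumption: boolean variables are expressed as "true" / "false" strings.
  def self.flag(env, name, default)
    env.fetch(name, default.to_s).to_s.strip.downcase == 'true'
  end

  # env is any Hash-like object (ENV works too); generating_sbom says whether
  # the current run produces an SBOM.
  def self.triggers(env, generating_sbom:)
    reasons = []
    # Scenario 1: the dependency list is enabled (the variable defaults to false).
    reasons << :dependency_list unless flag(env, 'CS_DISABLE_DEPENDENCY_LIST', false)
    # Scenario 2: language vulnerability scanning is enabled (the variable defaults to true).
    reasons << :language_vulnerability_scan unless flag(env, 'CS_DISABLE_LANGUAGE_VULNERABILITY_SCAN', true)
    # Scenario 3: an SBOM is being generated.
    reasons << :sbom if generating_sbom
    reasons
  end
end

p JavaDbTriggers.triggers({}, generating_sbom: false)
```

With the default settings and no SBOM, only the dependency list scenario applies, so an offline run fails out of the box.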

Proposal

Here are some possible solutions for this issue:

  1. Complete Add ability to disable JAR scanning in the upstream Trivy project.
    • Pros
      • Allows the behaviour of downloading data to be configured.
      • Works for both offline and online instances.
    • Cons
      • Need to implement this change in the upstream trivy project, which might not be accepted.
      • High chance of unreported vulnerabilities.
      • Needs additional configuration in offline environments if a user wants to make sure all vulnerabilities are reported.
  2. Add a skeleton java-db to the container scanning image.
    • Pros
      • Easy to implement.
      • User is not forced to download additional data.
      • Works in an offline environment without any additional changes.
    • Cons
      • Can't easily change the behaviour: we'd need to add another environment variable to container scanning to allow this to be overridden, which increases the complexity of the implementation.
      • The default behaviour prevents JAR vulnerabilities from being detected in online instances.
      • High chance of unreported vulnerabilities.
  3. Add a new CS_TRIVY_JAVA_DB environment variable and pass this to trivy using --java-db-repository.
    • Pros
      • Easy to implement.
      • Approach is flexible, since users can modify the CS_TRIVY_JAVA_DB var to point to any data source they want.
      • Vulnerabilities will be reported, as long as they're present in the CS_TRIVY_JAVA_DB.
      • Data is only fetched when scanning an image containing JAR files.
      • Works in both offline and online instances.
    • Cons
      • User is forced to download additional data.
      • Needs additional configuration in offline environments.

After discussing this here, approach 3 seems like the best option.

Workaround

The following workaround can be used to upgrade to a more recent version of trivy until we've had a chance to properly solve this issue:

Create a custom Dockerfile, using registry.gitlab.com/security-products/container-scanning:latest as the base image:

FROM registry.gitlab.com/security-products/container-scanning:latest

ENV TRIVY_VERSION=0.41.0

RUN sudo apt-get update && sudo apt-get install -y wget
RUN wget --no-verbose https://github.com/aquasecurity/trivy/releases/download/v"${TRIVY_VERSION}"/trivy_"${TRIVY_VERSION}"_Linux-64bit.tar.gz -O - | tar -zxvf - -C /home/gitlab/opt/trivy

Implementation Plan

NOTE: This issue is currently blocked by Add offline tests for Container Scanning (#404557 - closed), since we need offline tests in place to ensure that the implementation works as expected.

  1. Add a new variable named CS_TRIVY_JAVA_DB.
  2. Add a new method called trivy_java_db to environment.rb, which defaults to registry.gitlab.com/gitlab-org/security-products/dependencies/trivy-java-db ghcr.io/aquasecurity/trivy-java-db:1 (see this discussion for details).
  3. Update the scan_command, os_scan_command and sbom_scan_command methods to pass --java-db-repository #{Gcs::Environment.trivy_java_db}.
  4. Document the new CS_TRIVY_JAVA_DB variable in the Container Scanning documentation. Make sure to include details on how to use this in offline instances.
  5. Revert the MR Remove trivy from trigger-scanner-update job (gitlab-org/security-products/analyzers/container-scanning!2911 - merged), since we can now release new versions of container scanning.
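
Steps 2 and 3 could look roughly like the following Ruby sketch. Only the CS_TRIVY_JAVA_DB variable, the trivy_java_db method name, and the --java-db-repository option come from the plan above; the `GcsSketch` module, the `with_java_db` helper, and the example argument list are illustrative and do not mirror the real analyzer code. The default is assumed to be ghcr.io/aquasecurity/trivy-java-db:1, one of the two values listed in step 2.

```ruby
# Sketch only: names other than trivy_java_db are illustrative.
module GcsSketch
  module Environment
    # Assumed default; see the linked discussion for the value actually chosen.
    DEFAULT_TRIVY_JAVA_DB = 'ghcr.io/aquasecurity/trivy-java-db:1'

    # Returns the java-db repository, preferring CS_TRIVY_JAVA_DB when it is set
    # to a non-blank value.
    def self.trivy_java_db(env = ENV)
      value = env['CS_TRIVY_JAVA_DB'].to_s.strip
      value.empty? ? DEFAULT_TRIVY_JAVA_DB : value
    end
  end

  # Appends the repository option to an existing Trivy argument list, as the
  # scan_command, os_scan_command and sbom_scan_command methods would need to.
  def self.with_java_db(args, env = ENV)
    args + ['--java-db-repository', Environment.trivy_java_db(env)]
  end
end

# Example: an offline instance pointing at a hypothetical internal mirror.
offline_env = { 'CS_TRIVY_JAVA_DB' => 'registry.example.com/mirror/trivy-java-db:1' }
puts GcsSketch.with_java_db(%w[trivy image], offline_env).join(' ')
```

Passing the environment as an argument keeps the lookup easy to exercise in the offline tests without mutating the process environment.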

/cc @sam.white @gonzoyumo

Edited by Adam Cohen