Operational Container Scanning: allow maximum number of concurrent namespace scans to be configured

Operational Container Scanning (OCS) launches one Trivy pod, running trivy k8s -n <namespace>, for each namespace specified in the agent configuration file.

If there are several namespaces to scan, multiple scan pods are launched at once (up to a hard-coded limit of 10). This can overload the cluster, especially if the scan pods are configured with higher memory and CPU limits to avoid out-of-memory errors and timeouts.

It would be helpful if it were possible to configure a maximum number of scan pods to run at once via the agent configuration file. Users could then set the concurrency to an appropriate level based on their cluster resources and namespace configuration.
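A minimal sketch of how such a cap could work, assuming a buffered channel is used as a counting semaphore. The function name scanNamespaces, the scan callback, and the defaultMaxParallel constant are hypothetical and are not taken from the agent code; the callback stands in for launching a scan pod and waiting for it to finish.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultMaxParallel mirrors the hard-coded limit of 10 described in this issue.
// The constant name is an assumption made for this sketch.
const defaultMaxParallel = 10

// scanNamespaces calls scan once per namespace while keeping at most
// maxParallel calls in flight. It returns the highest number of scans that
// were observed running at the same time.
func scanNamespaces(namespaces []string, maxParallel int, scan func(ns string)) int {
	if maxParallel <= 0 {
		maxParallel = defaultMaxParallel // fall back when the value is unset
	}
	sem := make(chan struct{}, maxParallel) // counting semaphore
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		running int
		peak    int
	)
	for _, ns := range namespaces {
		sem <- struct{}{} // blocks while maxParallel scans are in flight
		wg.Add(1)
		go func(ns string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot for the next namespace

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			scan(ns)

			mu.Lock()
			running--
			mu.Unlock()
		}(ns)
	}
	wg.Wait()
	return peak
}

func main() {
	namespaces := []string{"default", "kube-system", "staging", "production"}
	peak := scanNamespaces(namespaces, 2, func(ns string) {
		time.Sleep(10 * time.Millisecond) // stand-in for running a scan pod
	})
	fmt.Println("peak concurrent scans:", peak)
}
```

With a cap of 2, the loop only starts a third scan once one of the first two has released its slot, so the number of scan pods running at once stays within what the cluster can absorb.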

As things stand, if multiple namespaces require scanning and cluster resources are constrained, the OCS feature may not be usable for some customers.

Requested by customer in support ticket (ZD internal link)

Implementation plan

  1. Add a maxParallelFlag to the agent configuration to replace the hardcoded limit of 10, so that users can control the maximum number of namespaces scanned at a time.
    1. The default value should be kept at 10.
    2. This MR might be a useful reference for how to add a new configuration option for OCS.
  2. Add a timeoutFlag to the agent configuration, so that users who have many images in a namespace can extend the scan duration when scans time out.
    1. The default value should be 10 minutes.
    2. This MR might be a useful reference for how to add a new configuration option for OCS.
    3. Replace the hardcoded 10-minute limit with the timeoutFlag value plus 5 minutes for processing the logs.
    4. Set the Trivy timeout flag to the timeoutFlag value by passing it as an environment variable in the podSpec.
  3. Add timeoutFlag and maxParallelFlag to OCS documentation
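
The timeout arithmetic in step 2 can be sketched as follows. This is an illustration under stated assumptions, not the agent's implementation: the helper and constant names are hypothetical, and the environment variable name TRIVY_TIMEOUT assumes Trivy's usual convention of mapping a flag such as --timeout to a TRIVY_-prefixed variable.

```go
package main

import (
	"fmt"
	"time"
)

const (
	// defaultScanTimeout is the proposed default for timeoutFlag.
	defaultScanTimeout = 10 * time.Minute
	// logProcessingBuffer is the extra time allowed for processing the logs.
	logProcessingBuffer = 5 * time.Minute
)

// effectiveScanTimeout returns the configured timeout, or the default when
// the value is unset or not positive.
func effectiveScanTimeout(configured time.Duration) time.Duration {
	if configured <= 0 {
		return defaultScanTimeout
	}
	return configured
}

// podDeadline is how long to wait on the scan pod overall: the scan timeout
// plus the log-processing buffer.
func podDeadline(scanTimeout time.Duration) time.Duration {
	return scanTimeout + logProcessingBuffer
}

// trivyTimeoutEnv returns the name and value of the environment variable that
// would be set in the podSpec. The variable name is an assumption (see above).
func trivyTimeoutEnv(scanTimeout time.Duration) (name, value string) {
	return "TRIVY_TIMEOUT", scanTimeout.String()
}

func main() {
	scanTimeout := effectiveScanTimeout(15 * time.Minute)
	name, value := trivyTimeoutEnv(scanTimeout)
	fmt.Println(name+"="+value, "pod deadline:", podDeadline(scanTimeout))
}
```

Keeping the pod deadline strictly longer than the Trivy timeout means Trivy gives up first and reports its own timeout, rather than the pod being cut off while its logs are still being processed.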

Development notes

When making changes to the agent configuration, make sure to start kas locally, as it needs to read the changes to the protobuf file.
