Usage Ping for DAST Full scans

Problem to solve

Active Scan for DAST lacks Usage Ping instrumentation because of the performance implications. We still need to collect information on how this feature is used.

Further details

This is a follow-up to https://gitlab.com/gitlab-org/gitlab-ee/issues/7182.

Most likely, this will become feasible once efficient counters are implemented for the GitLab web app.

Proposal

Implement usage tracking for this feature: collect the number of pipelines executed with the dast job enabled and the DAST_FULL_SCAN_ENABLED environment variable set.
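
To make the proposed metric concrete, here is a minimal Python sketch of what would be counted. It is only an illustration, not the GitLab implementation: the Build record, its fields, and the function name are hypothetical, and it assumes the variable is considered set when its value is the string "true".

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable


@dataclass
class Build:
    """Hypothetical, simplified stand-in for a row of ci_builds."""
    pipeline_id: int
    name: str
    variables: Dict[str, str] = field(default_factory=dict)


def count_dast_full_scan_pipelines(builds: Iterable[Build]) -> int:
    """Count distinct pipelines that ran a dast job with full scan enabled.

    Assumption: full scan is enabled when DAST_FULL_SCAN_ENABLED == "true".
    """
    pipeline_ids = {
        build.pipeline_id
        for build in builds
        if build.name == "dast"
        and build.variables.get("DAST_FULL_SCAN_ENABLED") == "true"
    }
    return len(pipeline_ids)
```

Counting distinct pipeline ids rather than builds keeps a pipeline with several matching dast builds (for example after a retry) from being counted twice.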

What does success look like, and how can we measure that?

Usage data for DAST Full scan pipelines is collected without performance problems or downtime on GitLab.com and on 10K self-managed instances.

Links / references

Current status

The current database layout and the size of the ci_builds table make it impossible to have a working implementation of Usage Ping on instances like GitLab.com and larger.

Reasons:

  • There is currently no effective way to mark a build in the database as originating from a secure job and/or having a particular ENV var set (e.g. a flag column or a metadata attribute in a jsonb column). By "effective" I mean indexed and able to support a query that finds such builds within the default query timeout.
  • There is a partial index on the name column of the ci_builds table that is currently used to find security job builds. Besides being fairly fragile in itself (think of a job named something other than sast, dast, etc.), it still does not let the query fit within the query timeout.
  • These SQL queries are constructed as raw COUNT queries because that is how UsageData is built and sent to version.gitlab.com; time-framed queries (COUNT for the last month, etc.) and any other kind of partitioning are not supported.
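
The name-based lookup described above can be sketched with an in-memory SQLite database. This is an assumption-laden illustration only: the real table lives in PostgreSQL, and the reduced schema, the index name, the list of job names in the index predicate, and the helper function are all hypothetical.

```python
import sqlite3

# Reduced, hypothetical version of ci_builds with a partial index on name.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ci_builds (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE INDEX index_security_builds_on_name ON ci_builds (name)
        WHERE name IN ('sast', 'dast');
    """
)
conn.executemany(
    "INSERT INTO ci_builds (name) VALUES (?)",
    [("dast",), ("dast",), ("sast",), ("my_dast_scan",), ("rspec",)],
)


def count_builds_by_name(connection: sqlite3.Connection, name: str) -> int:
    """Raw COUNT over the whole table, with no time frame or batching."""
    row = connection.execute(
        "SELECT COUNT(*) FROM ci_builds WHERE name = ?", (name,)
    ).fetchone()
    return row[0]
```

With this data, count_builds_by_name(conn, "dast") returns 2: the build named my_dast_scan is not counted, which is the fragility of matching on job name. The query also always scans every matching build ever created, which is why it does not scale with table size.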
Edited by Victor Zagorodny