CI variables to specify files scanned by Dependency Scanning
Problem to solve
In some cases users need to specify the exact files to be scanned by Dependency Scanning.
- These files can't be detected. Example: pip requirements files with custom filenames.
- In a repo that contains multiple supported files, users who want to scan a single file need to exclude all the others.
  - This is very tedious.
  - This is error-prone. In particular, the filters might miss files introduced later on.
- Creating multiple jobs to scan multiple Java or Python files isn't practical. Users would need to create a matrix of jobs that each exclude all other supported files.
Proposal
See &12315 (comment 1738272401)
- Introduce one CI/environment variable per supported file type.
  - The new CI variables would end with the following suffixes:
    - _LOCK_FILES for lock files; example: BUNDLER_LOCK_FILES
    - _DEPENDENCY_FILES for anything else; example: MAVEN_DEPENDENCY_FILES
  - Values are globs similar to the ones supported by rules:exists and rules:changes.
  - For consistency and extensibility, the variable name would always be plural, even when the analyzer can only handle one file per execution. Example: MAVEN_DEPENDENCY_FILES.
- The analyzer CLI might have default values for the corresponding environment variables, to work out of the box. However, setting any environment variable would then have the side effect of ignoring all default values. For instance, gemnasium-java ignores default values for GRADLE_DEPENDENCY_FILES and SBT_DEPENDENCY_FILES when MAVEN_DEPENDENCY_FILES is set.
- Deprecate and drop DS_MAX_DEPTH.
- TBD: Deprecate and drop DS_EXCLUDED_PATHS?
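As a minimal sketch of how the proposal would address the custom-filename case from the problem statement, a project could point the Python analyzer at its requirements files with a single variable. PIP_DEPENDENCY_FILES is a proposed name that does not exist yet, and the path below is purely illustrative.

```yaml
# Hypothetical: PIP_DEPENDENCY_FILES is the proposed variable, not an existing one.
# The path is an illustrative glob matching requirements files with custom names.
variables:
  PIP_DEPENDENCY_FILES: "requirements/*.txt"
```

Because the value is a glob, files added later under the same directory would be picked up without editing the configuration.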
Opportunities
This brings consistency and unblocks feature enhancements.
- PIP_REQUIREMENTS_FILE would be an alias for PIP_DEPENDENCY_FILES.
- Handle requirements.txt files produced by pip-c... (#418321) would introduce support for PIP_LOCK_FILES.
- Users could scan multiple Java projects by setting MAVEN_DEPENDENCY_FILES in a parallel:matrix (this requires editing the CI config, but at least it's clean). Same for Python.
- We could support CONAN_LOCK_FILES as requested today by a customer.
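The parallel:matrix idea could look roughly like the sketch below. It assumes the proposed MAVEN_DEPENDENCY_FILES variable and a base job named dependency_scanning; the job name and the api/pom.xml path are hypothetical.

```yaml
# Hypothetical job: one matrix entry (and so one job) per Maven project.
dependency_scanning-maven:
  extends: dependency_scanning
  parallel:
    matrix:
      # Each listed value produces a separate job with the variable set to it.
      - MAVEN_DEPENDENCY_FILES:
          - "web/pom.xml"
          - "api/pom.xml"
```

Each generated job would scan exactly one pom.xml, so no exclusion filters are needed.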
Users no longer need to override the gemnasium-* jobs. This prepares the transition to Gemnasium-based SBOM generators (&8206) and to other SBOM generators. See #434143
These new CI variables would be required inputs of upcoming Dependency Scanning CI/CD components. See [Spike] Composition Analysis components for glo... (#431827 - closed)
These new CI variables could be used in rules:exists or rules:changes.
- Changing the depth of the search using the CI variables affects both the execution of the scanning job and the scan itself. (Right now a job might be triggered even though all the files are skipped because of DS_MAX_DEPTH.)
- This improves readability and reduces maintenance.
- The job is skipped if the files specified in the CI variables don't exist (or haven't changed).
dependency_scanning-java:
  extends: dependency_scanning
  variables:
    MAVEN_DEPENDENCY_FILES: "**/pom.xml"
    GRADLE_DEPENDENCY_FILES: "**/build.gradle{,.kts}"
    SBT_DEPENDENCY_FILES: "**/build.sbt"
  rules:
    - if: $CI_COMMIT_BRANCH
      exists:
        - $MAVEN_DEPENDENCY_FILES
        - $GRADLE_DEPENDENCY_FILES
        - $SBT_DEPENDENCY_FILES
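The "haven't changed" case could be sketched with rules:changes in place of rules:exists. This is an illustration only: it reuses the proposed variables and the job name from the example above, and the merge request condition is an assumption chosen because rules:changes is most meaningful when there is a diff to compare against.

```yaml
# Hypothetical variant: run the job only when a matching file has changed.
dependency_scanning-java:
  extends: dependency_scanning
  variables:
    MAVEN_DEPENDENCY_FILES: "**/pom.xml"
    GRADLE_DEPENDENCY_FILES: "**/build.gradle{,.kts}"
    SBT_DEPENDENCY_FILES: "**/build.sbt"
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        - $MAVEN_DEPENDENCY_FILES
        - $GRADLE_DEPENDENCY_FILES
        - $SBT_DEPENDENCY_FILES
```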
Limitations
The analyzer would scan all supported files by default, but only the requested files when one of these environment variables is set. The catch is that users have to clear CI vars corresponding to package managers they want to ignore.
For example, the following instructs gemnasium-maven to scan web/pom.xml and to ignore any Gradle or sbt file.
variables:
  MAVEN_DEPENDENCY_FILES: "web/pom.xml"
  GRADLE_DEPENDENCY_FILES: ""
  SBT_DEPENDENCY_FILES: ""
This is tedious, but it's still an improvement over excluding Gradle and sbt files using DS_EXCLUDED_PATHS.
Also, in the case of Java we can address that problem by having one job per supported package manager. This might not be possible with Python because a file can be supported by multiple package managers; having all Python files handled by the same job makes it possible to apply priorities.
Other proposals
One proposal was to pass a comma-separated list of pairs, where each pair combines a file path with a file type, like bundler:x/Gemfile;bundler:y/Gemfile.
See &12315 (comment 1716829862).
There are significant cons to that approach, though.
- It's not user-friendly.
- The list can't be used in rules:changes or rules:exists.
- An extra step is needed to turn that list into an array of paths passed to parallel:matrix.
The following proposals have been rejected because they introduce CI variables that don't give the file type.
This makes it impossible to pass custom filenames similar to the ones we currently pass using PIP_REQUIREMENTS_FILE.
Proposal B
- Two new CI variables, DS_JAVA_SCAN_FILES and DS_PYTHON_SCAN_FILES, will be added for Dependency Scanning to configure one of the following behaviors for Java/Python projects:
  - When the variable is not set (the default value is null to avoid a breaking change), only the first detected file will be scanned.
  - When the variable is set to [], all detected files in the project will be scanned.
  - The variables will also accept an array of paths which can be used to list specific files which will be scanned. These paths will have support for basic glob patterns (*, **, ?, and [).
- When more than one file is to be scanned, either multiple threads or multiple jobs will be used to parallelize the workload and minimize the total scan time.
- When the PIP_REQUIREMENTS_FILE variable is set in addition to DS_PYTHON_SCAN_FILES, the file defined in PIP_REQUIREMENTS_FILE will be scanned in addition to any file(s) identified through DS_PYTHON_SCAN_FILES.
- The PIP_REQUIREMENTS_FILE variable will be deprecated.
Proposal C
- Introduce a new CI variable DS_SCAN_PATHS.
  - This is supported by gemnasium, gemnasium-maven, and gemnasium-python.
  - NOTE: At the moment gemnasium-maven and gemnasium-python can only scan one Java and Python project, respectively.
- Deprecate existing variables that overlap with the new one.
  - PIP_REQUIREMENTS_FILE