Dependency resolution for maven projects in CI pipelines (#588765) · Issues · GitLab.org / GitLab

Dependency resolution for maven projects in CI pipelines

**What is Dependency Resolution (formerly called "build support")?** In the context of Dependency Scanning, what we used to call "build support" refers to the means of obtaining the project’s dependency list or graph. This does not necessarily require building the project exactly as the user would to release or deploy it. However, producing the dependency list often has similar requirements: a working build environment that can execute package manager, build tool, or third-party tool commands. For instance, being able to execute the `mvn dependency:tree` command to build a Maven project's dependency list. Going forward, we will call this functionality **dependency resolution** to avoid confusion and better indicate the goal.

### Release notes

### Problem to solve

GitLab's dependency scanning relies on lockfiles or graphfiles as the entry point for accurate dependency detection and vulnerability analysis. However, Maven projects require some form of dependency resolution mechanism to generate these files. Unlike the legacy Gemnasium analyzer, which supports that capability, the original Dependency Scanning analyzer delegated this responsibility to users, expecting them to generate lockfiles/graphfiles in preceding CI jobs. While this approach offers flexibility, it presents several challenges:

- **User Experience Gaps**: Many users expect dependency scanning to work out-of-the-box without manual CI configuration. The requirement to set up custom build jobs creates friction for adoption and raises the barrier to entry for security scanning.
- **Limitations to enablement at scale**: [Scan Execution Policies](https://docs.gitlab.com/ee/user/application_security/policies/scan_execution_policies.html) enforce security scanning across projects. However, without this necessary dependency resolution step, projects without pre-existing lockfiles or ad-hoc customization cannot benefit from dependency scanning analysis.
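As a rough illustration of the manual approach described above, a preceding CI job could run something like the following to produce a graph file for the scanner. This is a minimal sketch under stated assumptions, not the analyzer's implementation: the `resolve_maven_graph` helper, the `MVN_BIN` override, and the `maven.graph.json` default file name are hypothetical names chosen for illustration, and the `-DoutputType=json` option is only available in recent versions of the maven-dependency-plugin.

```shell
#!/bin/sh
# Sketch only: resolve_maven_graph, MVN_BIN and the default output file name
# are illustrative assumptions, not part of the DS analyzer.
set -eu

# Allow the Maven executable to be overridden (for example with ./mvnw).
MVN_BIN="${MVN_BIN:-mvn}"

# Run `mvn dependency:tree` inside a project directory and write the
# dependency graph as JSON next to the pom.xml.
resolve_maven_graph() {
  project_dir="$1"
  output_file="${2:-maven.graph.json}"

  # Nothing to resolve if the directory has no Maven manifest.
  if [ ! -f "$project_dir/pom.xml" ]; then
    echo "no pom.xml in $project_dir, skipping" >&2
    return 1
  fi

  # -DoutputType=json requires a recent maven-dependency-plugin.
  (cd "$project_dir" && "$MVN_BIN" dependency:tree \
    -DoutputType=json -DoutputFile="$output_file")
}

# Example call from a CI job script:
#   resolve_maven_graph . maven.graph.json
```

Keeping the resolution step in a small function like this makes it possible to see the exact build-tool command that runs, which is the same transparency argument made for script-based resolution later in this issue.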
### Proposal

Following the outcomes from [the internal spike](https://gitlab.com/gitlab-org/gitlab/-/work_items/582607), implement automatic dependency resolution support for Maven projects.

The proposal is further described in the architecture design document (WIP): https://gitlab.com/gitlab-com/content-sites/handbook/-/merge_requests/18223

### Implementation plan

1. Update the CI configuration
   1. V2 Dependency-Scanning CI/CD template: https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/templates/Jobs/Dependency-Scanning.v2.gitlab-ci.yml
   2. Dependency Scanning CI/CD component: https://gitlab.com/components/dependency-scanning/-/blob/main/templates/main/template.yml
2. Update the DS analyzer
   1. Introduce a new command to support the service mode with dependency detection and script generation
3. Implement a testing framework to cover the E2E workflow with resolution jobs using a service container and the interaction with the dependency scanning job
4. Update the documentation

#### Considerations for refinement

1. **Requirements**:
   1. **What happens if one of the vanilla images that we are using has a critical vulnerability? I guess we will have instructions so that users can override the image, right? See conversation in https://gitlab.com/gitlab-com/content-sites/handbook/-/merge_requests/18223#note_3057099258**
      1. See dedicated discussion: https://gitlab.com/gitlab-org/gitlab/-/work_items/588765#note_3113698586
   2. **Should we consider copying vanilla images into the GitLab.com container registry? For availability and security reasons (e.g. being able to patch these images if the upstream maintainer is too slow to do it). Consider maintenance and egress cost implications. See https://gitlab.com/gitlab-com/content-sites/handbook/-/merge_requests/18223#note_3059920363**
      1. See dedicated discussion: https://gitlab.com/gitlab-org/gitlab/-/work_items/588765#note_3113698586
   3. **How to handle partial lockfile presence?
      For instance, a monorepo with multiple Gradle subprojects where some teams commit the lockfile and some teams don't. Should we run the resolution job as soon as there is a compatible manifest file and let the detection logic figure out if a lockfile is present and skip? Or should we have a rule that skips the entire CI job if any lockfile for that technology is present in the repo? Consider the impact on the enable\_/disable\_/enforce\_ options we can provide in the inputs or CI/CD variables.**
      1. See dedicated discussion: https://gitlab.com/gitlab-org/gitlab/-/work_items/588765#note_3129776888
   5. What happens when users have existing build jobs that generate lockfiles in the build stage? The resolution jobs in `.pre` will have already run by then. Should we document that users with dynamic lockfile generation should disable automatic resolution? Again, consider the impact on the enable\_/disable\_/enforce\_ options we can provide in the inputs or CI/CD variables.
   6. Should timeouts be configurable? (e.g. file presence check timeout). Consider convention over configuration.
      1. We should limit configuration to what's really necessary and relevant. If there is no reason for large variance in a timeout, it should not be configurable. For instance, the time to spin up a service container, mount the shared volume, and check out the repository is likely to be quite similar across the board. We can also add a reasonable margin to handle potential variance. And if this really becomes an issue for a specific customer, we can investigate further and eventually add an option down the road.
2. **CI configuration**
   1. **Consider using `sh -x resolve.sh` rather than `cat`-ing the whole script before execution.**
      1. Sure :thumbsup:
   2. **What does the process look like if we need to upgrade or change the vanilla image we use in a resolution job?**
      1. The chosen image will likely receive regular patches and updates. Pinning to a particular minor or patch version would create a maintenance burden for customers.
         As with the analyzer images, pinning to a major version sounds reasonable for the expected usage. However, this must be clearly stated, and customers who want to pin to a particular version can do so.
      2. Modifying the default value of the image must be backward compatible, otherwise we might break existing customers' Dependency Resolution jobs. However, it is practically impossible to validate that a given image update will have no effect. We can't predict the impact on other self-managed and dedicated platforms when they upgrade. As a result, modifying this default value would likely be considered a breaking change and require an exception approval by following [the deprecation process](https://docs.gitlab.com/development/deprecation_guidelines). We could possibly lean on [the 3rd party dependencies definition](https://docs.gitlab.com/update/terminology/#third-party-dependencies), which provides more flexibility.
   3. **Can we anticipate the implications of adding support for advanced detection logic that will alter the resolution job image? E.g. with a preceding detection job that changes a CI/CD variable.**
      1. For the next iteration(s) we've considered adding a preceding job that does some kind of detection and exports a `dotenv` report artifact, with the intent to change which image is used by the dependency resolution job. This would require job orchestration within the same `.pre` stage, using the `needs` keyword. This means the resolution job definition must be updated to declare such a dependency, which presents a risk in terms of backward compatibility with customers' overrides. Seeking support for monorepositories with heterogeneous runtime version needs will certainly call for deeper changes in the job definitions, and thus probably a v3 template.
      2. Alternatively, such detection could run on setup, on-demand, or periodically in the Rails backend and set the corresponding CI/CD variables at the project level.
         This would have no impact on CI job definitions. However, this would certainly not cover monorepositories with heterogeneous runtime version needs well either.
3. **DS Analyzer**
   1. Should we support [service health checks](https://docs.gitlab.com/ci/services/#how-the-health-check-of-services-works)? If yes, do we need to serve an HTTP endpoint or are there alternatives we can consider (e.g. possibly suggesting an improvement to the health check logic)?
      1. Yes, otherwise users will always see a warning from the runner. Suggesting an improvement is a good idea and should more than likely be a follow-up task.
   2. Should we consider explicit HTTP communication between the service and the main job (instead of a file presence check)? Balance pros against cons, like the requirement to have an HTTP client in vanilla images (e.g. curl).
      1. There are 2 main issues with explicit HTTP communication in my opinion:
         1. It puts a requirement on the job image to have a way to send network calls. From my (limited) research, there are various network communication tools, but not so many that we can't test for the likeliest candidates at the very beginning of the script. For example (note: this is Linux only) we could check for `curl`, `wget`, `nc`, and in a pinch use a simple command utilizing `/dev/tcp`.
         2. The reason for using HTTP communication seems kind of exaggerated to me. The main thing we're doing is searching the filesystem. Looking at the same filesystem but communicating over HTTP seems like overkill.
   3. Should we use a binary to run the detection logic and instrument the build command, rather than running detection in the service and generating a shell script?
      1. This one I'm on the fence about. It's much more powerful to run the binary directly, but there are constraints on the image (architecture, OS) and it is not as transparent as executing a simple script. A potential solution to this is to always run resolve image commands via a script which may invoke the binary.
         Going the script invocation route gives us both options without changing the primary interface.
   4. How to expose service logs to the main log output? (e.g. write to service.log and `cat` it before executing the build commands. NB: this can be added to the generated script rather than written in the main job's script definition). See some details in https://gitlab.com/gitlab-com/content-sites/handbook/-/merge_requests/18223#note_3058185240
      * The logs from the service are in their own section, so they don't clutter the output too much. But if we want more verbosity, we may also consider writing them to a file and exposing it as an artifact. Proposal: do this in 2 stages. Stage 1: write to the job log. Stage 2: in case more verbosity is needed, add an option to export it as a log file.
   5. How to best handle the scope of the detection logic (e.g. `--project-types="maven"`) to only generate the lockfile for the relevant technology in a given resolution job (a multi-technology project with Maven and Python, for instance)? Is supporting a list meaningful? (e.g. being able to resolve both Maven and Gradle in the same job if we can use the same image, or if a customer has a single custom build job where it generates everything)
      1. Using a custom build job to build multiple projects is outside the scope of current discussions, so we should approach this iteratively as well. Use `--project-types` to tell the analyzer what kind of resolution command we want, ensure the variable can be multi-value, but only add support when needed.
   6. Consider supporting existing options that affect the dependency detection logic:
      * Variables to consider
        - DS_EXCLUDED_PATHS
        - DS_MAX_DEPTH
        - PIP_REQUIREMENTS_FILE
        - DS_INCLUDE_DEV_DEPENDENCIES
        - GEMNASIUM_IGNORED_SCOPES
        - others from the gemnasium maven plugin?
          * DS_JAVA_VERSION
      * **Thought:** These options divide into filtering (i.e. do not consider a particular project) vs output (i.e. remove some dependencies from the SBOM).
        Any options that we can pass to Maven directly would reduce the complexity of our implementation and are more user-friendly, because users can run them directly as well. Most of the DS filtering options are not available to `maven`; they are a quite powerful addition for dependency resolution and should likely remain.
      * **We should support**
        * DS_EXCLUDED_PATHS - filtering
        * DS_MAX_DEPTH - filtering
        * DS_INCLUDE_DEV_DEPENDENCIES - output
      * **We should not support**
        * GEMNASIUM_IGNORED_SCOPES - this was an output variable applied in `gemnasium-maven-plugin`, but it is better expressed through `MAVEN_CLI_OPTS`, which can pass `-Dscope=X` to Maven directly.
        * DS_JAVA_VERSION - not relevant anymore
      * **Not relevant to Maven**
        * PIP_REQUIREMENTS_FILE
4. **Validation and testing**
   1. Verify options that should just work because they are available to the environment where we execute native build tools:
      - MAVEN_CLI_OPTS
      - PIP_INDEX_URL
      - PIP_EXTRA_INDEX_URL
   2. Verify support for multi-module projects (parent manifest)
   3. Verify support for monorepo layouts (single or multi-technology)
   4. Verify what happens when the resolution logic takes too long to run. What mitigation can we provide to improve OOTB support? What other manual solution can we offer to customers impacted by this? See https://gitlab.com/gitlab-com/content-sites/handbook/-/merge_requests/18223#note_3058158118

### Intended users

### Feature Usage Metrics

### Does this feature require an audit event?
