# Handle requirements.txt files produced by pip-compile as lock files
## Why are we doing this work
Dependency Scanning assumes that `requirements.txt` files do not include the entire
dependency graph, and it always attempts to build a project when one is detected. This
decision was made consciously: while it's possible for a project to export a complete list
of dependencies in the file, this was not guaranteed to be the case. To err on the side of
caution, the analyzer was configured to take the safe route and build the project, i.e.
install all of its dependencies. Some tradeoffs were made with this decision:
- Offline and limited network installations would require either preloading all the dependencies into the package cache or a configured python registry/proxy.
- The analyzer would need to build the project, which introduces some complexity in terms of Python version compatibility.
With the introduction of the pip-tools suite, it's now possible to easily generate a
complete dependency graph export in the form of a `requirements.txt` file. As a result,
many projects have started to adopt this approach, relying solely on `pip` to install from
this file. Dependency Scanning should take advantage of this movement as well, and aim to
support projects that have adopted this workflow. This will not only increase the range of
use cases it supports, but also reduce the setup complexity for these projects in an
offline environment, and even decrease the time and network bandwidth used when building
the project.
## Relevant links
## Non-functional requirements

- Documentation: The documentation in Dependency Scanning will need to reflect the new
  strategy for scanning `requirements.txt` files.
- Feature flag:
- Performance:
- Testing: Unit tests and integration tests needed.
  - Confirm correctness of parsing a `requirements.txt` file.
  - Confirm that the analyzer will build `requirements.txt` files if they're not built by
    `pip-compile`.
  - Confirm that we will still handle custom `requirements.txt` files. This is configured
    using the `PIP_REQUIREMENTS_FILE` env variable.
## Implementation plan

- MR 1: Create a new parser that can parse a `pip-compile` requirements file.
  - Create a directory named `pip-compile` in the `scanner/parser/` directory. The
    directory structure will look like the figure below. The `expect/` directory holds
    expectations we compare against in tests, `fixtures/` holds source files we use in
    tests, e.g. `requirements.txt`, and the Go files hold the code related to the parser.

        ├── expect
        ├── fixtures
        ├── pip_compile.go
        └── pip_compile_test.go

  - Implement a parser that parses the versions of packages used in the requirements file.
  - Register the parser so that it scans the `requirements.txt` file. An example of how
    this is done for the `golang` parser can be found here.
- MR 2: Update the pip builder so that it returns a non-fatal error if it matches a
  `pip-compile` requirements.txt.
  - You can return a non-fatal error by using `builder.NewNonFatalError`, which is defined
    here.
  - A heuristic will be needed to create a simple but effective solution that detects
    pip-compile files. One way to do this could be to use a buffered IO reader that scans
    the file line by line, looking for a well-known pip-compile comment left in the output
    files: `# This file is autogenerated by pip-compile with Python`
  - If the heuristic matches, the builder should then return the non-fatal error with a
    message like `Python pip project not built. A requirements file built by pip-compile detected.`
    This will give customers and team members better insight as to why the build was
    skipped.
  - Add specs that test these scenarios. Our integration tests for this are stored in
    `spec/gemnasium-python_integration_spec.rb`. They utilize `rspec` and the
    `integration-test` project to test the various scenarios.
## Verification steps

- Create a project with a `requirements.txt` file that is produced using `pip-compile`.
- Test that this works when running the tests offline. This is a quick test that confirms
  the project is not built and, as a result, no dependencies are fetched from the network.

      $ docker build -f build/gemnasium-python/redhat/Dockerfile -t gemnasium-python:latest .
      $ docker run --rm -it -v "$TEST_PROJECT_SRC:/app" -w /app -e SECURE_LOG_LEVEL=debug gemnasium-python:latest

- Verify that the dependency scanning report and SBoM contain the expected dependencies
  with the right attributions.