gemnasium-python:2 docker image dependencies become outdated and break downstream expectations

Summary

We encountered an issue with dependencies inside the gemnasium-python analyzer drifting over time.

When downstream tests (in python-pip) run against the standard Docker image (gemnasium-python:2), the expectation succeeds. However, when a gemnasium-python MR is created, the python-pip expectations are compared against a report generated with the new image built by the MR. This comparison fails because the new image has a regenerated set of dependencies (see this diff).

This creates a cycle: if we update the expectation in python-pip, it breaks whenever tests run against gemnasium-python master; if we leave it as is, the expectation stays broken for all MRs in gemnasium-python.

After some investigation, we found that this occurs because the built gemnasium-python Docker image ships with its own set of packages, installed while the image is built (see the Dockerfile, which installs pipenv and pipdeptree). gemnasium-python brings in a number of dependencies as part of the pipenv install. Once the image is built, these dependencies are frozen, as part of the Docker image, at the installed versions.

When a new image is built, the dependencies are installed fresh at the newer versions (see the outdated list below).

For example, running pip list outdated against the gemnasium-python:2 image (registry.gitlab.com/gitlab-org/security-products/analyzers/gemnasium-python:2) reports the following outdated dependencies:

Package            Version   
------------------ ----------
attrs              19.3.0    
certifi            2019.9.11 
ijson              2.5.1     
importlib-metadata 0.23      
isort              4.3.21    
jsonschema         3.1.1     
more-itertools     7.2.0     
packaging          19.2      
pip                19.3.1    
pipdeptree         0.13.2    
pipenv             2018.11.26
pyparsing          2.4.2     
pyrsistent         0.15.4    
setuptools         41.4.0    
six                1.12.0    
virtualenv         16.7.7    
virtualenv-clone   0.5.3     
wheel              0.33.6    
zipp               0.6.0     

The image is now fixed at the package versions above, and an install won't go out to the repository to fetch newer versions because the packages already exist.
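To make the drift between two images visible, the package listings from the old and new image can be compared directly. The following is a minimal sketch, assuming each listing is in the two-column Package/Version layout shown above. The helper names parse_pip_list and find_drift are hypothetical, and the "new image" versions are made up for illustration, not taken from the MR image.

```python
from typing import Dict, Tuple


def parse_pip_list(text: str) -> Dict[str, str]:
    """Parse a two-column Package/Version listing into {package: version}."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Package Version" header, and the dashed separator.
        if len(parts) < 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        versions[parts[0].lower()] = parts[1]
    return versions


def find_drift(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, Tuple[str, str]]:
    """Return packages present in both listings whose versions differ."""
    return {
        name: (old[name], new[name])
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    }


# Versions as listed for gemnasium-python:2 above (excerpt).
OLD_IMAGE = """\
Package            Version
------------------ ----------
certifi            2019.9.11
six                1.12.0
"""

# Hypothetical listing for a freshly built image (illustrative values only).
NEW_IMAGE = """\
Package            Version
------------------ ----------
certifi            2019.11.28
six                1.12.0
"""

drift = find_drift(parse_pip_list(OLD_IMAGE), parse_pip_list(NEW_IMAGE))
for name, (old_version, new_version) in drift.items():
    print(f"{name}: {old_version} -> {new_version}")
```

A check like this could run in the MR pipeline to tell apart expectation failures caused by image drift from those caused by the change under review.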

Steps to reproduce

Please see this MR's pipelines for more info: gitlab-org/security-products/tests/python-pip!36 (closed)

What is the current bug behavior?

gemnasium-python MRs fail downstream because of mismatched dependency versions. When the mismatch is fixed in the python-pip expectation, gemnasium-python master fails instead.

What is the expected correct behavior?

Both master and MR downstream checks ought to pass since no explicit change was made to dependencies.

Possible fixes

There are two discussions to be had. The first is about policy when the images drift: can we just force push changes in MRs in spite of a failing expectation in downstream tests? If not, what else can we do to ensure MRs with failing downstream tests get merged?

The second discussion is how to fix the version drift in gemnasium-python's site-packages. The immediate idea is to run pip install --upgrade against each of the outdated packages at runtime. However, this may cause problems, since the drift stems from very permissive requirement specifications in some Python packages. For example, the certifi package above comes from pipenv, whose dependencies look like this:

pipenv==2018.11.26
  - certifi [required: Any, installed: 2019.9.11]
  - pip [required: >=9.0.1, installed: 19.3.1]
  - setuptools [required: >=36.2.1, installed: 41.4.0]
  - virtualenv [required: Any, installed: 16.7.7]
  - virtualenv-clone [required: >=0.2.5, installed: 0.5.3]

Since pipenv accepts any version of certifi (see requirements for pipenv), we could face side effects down the road if such a permissive requirement pulls in a major version change.
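To see which requirements are permissive enough to drift, the pipdeptree output above can be scanned for specifiers without an upper bound. This is a rough sketch, assuming the line format shown above. The function name unbounded_requirements and the heuristic (a specifier counts as bounded only if it contains "<", "==" or "~=") are illustrative assumptions, not part of pipdeptree.

```python
import re
from typing import List, Tuple

# Matches pipdeptree child lines such as:
#   - certifi [required: Any, installed: 2019.9.11]
DEP_LINE = re.compile(
    r"^\s*-\s*(?P<name>\S+)\s*"
    r"\[required:\s*(?P<spec>.+?),\s*installed:\s*(?P<installed>[^\]]+)\]"
)


def unbounded_requirements(tree_text: str) -> List[Tuple[str, str]]:
    """Return (name, specifier) pairs whose specifier has no upper bound."""
    result = []
    for line in tree_text.splitlines():
        match = DEP_LINE.match(line)
        if not match:
            continue  # parent lines like "pipenv==2018.11.26" are skipped
        spec = match.group("spec").strip()
        # Heuristic: "<", "<=", "==" and "~=" all cap the allowed version;
        # "Any" and a bare ">=" do not.
        has_upper_bound = any(op in spec for op in ("<", "==", "~="))
        if not has_upper_bound:
            result.append((match.group("name"), spec))
    return result


# The pipenv subtree quoted above.
PIPENV_TREE = """\
pipenv==2018.11.26
  - certifi [required: Any, installed: 2019.9.11]
  - pip [required: >=9.0.1, installed: 19.3.1]
  - setuptools [required: >=36.2.1, installed: 41.4.0]
  - virtualenv [required: Any, installed: 16.7.7]
  - virtualenv-clone [required: >=0.2.5, installed: 0.5.3]
"""

for name, spec in unbounded_requirements(PIPENV_TREE):
    print(f"{name}: {spec}")
```

Under this heuristic, every dependency of pipenv listed above is flagged, which suggests that any of them could jump to a new major version when the image is rebuilt.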

Edited Apr 21, 2023 by 🤖 GitLab Bot 🤖