License Finder should change the way it installs pip

Summary

Currently, license-finder installs pip via a curl call to bootstrap.pypa.io: https://gitlab.com/gitlab-org/security-products/analyzers/license-finder/-/blob/main/config/software/asdf_python.rb#L66

While the domain itself seems to belong to the Python foundation, a whole range of attacks becomes possible when we blindly curl a URL and run a script that installs embedded binary data (see the data variable in https://bootstrap.pypa.io/pip/3.3/get-pip.py). The downloaded file is not even verified against a checksum.

Improvements

Installing a trusted version of pip eliminates a whole range of vulnerabilities. Here are some options:

  • download and store a verified version of pip with the analyzer
  • install pip from a trusted source (such as the Debian package sources)
  • simplest option: record a checksum of the current download and verify every download against it before continuing with the install process
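
The checksum option could look roughly like the sketch below. It is written in Python for illustration only; the analyzer's build scripts are Ruby, so this shows the idea rather than a drop-in patch. The function names and the pinned checksum value are hypothetical, and the real checksum would have to be recorded from a download that someone has reviewed.

```python
import hashlib
import os
import urllib.request


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected_sha256):
    """Raise ValueError if the file does not match the pinned checksum."""
    actual = sha256_of_file(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"checksum mismatch for {path}: expected {expected_sha256}, got {actual}"
        )


def fetch_verified(url, expected_sha256, destination):
    """Download `url`, verify it, and only then move it to `destination`.

    Hypothetical helper: the unverified download is kept under a temporary
    name so that nothing can run it by accident before the check passes.
    """
    partial = destination + ".part"
    urllib.request.urlretrieve(url, partial)
    try:
        verify_checksum(partial, expected_sha256)
    except ValueError:
        os.remove(partial)  # never leave an unverified script behind
        raise
    os.replace(partial, destination)
    return destination
```

The install step would then call something like fetch_verified(url, PINNED_SHA256, "get-pip.py") and run the script only if no exception was raised, where PINNED_SHA256 is a placeholder for the recorded value. Note that pinning a checksum also pins the script version, so the pinned value has to be updated whenever the upstream file changes.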

Risks

Updating pip becomes more involved if pip is stored with the analyzer, because each pip update then requires updating the stored copy.

Involved components

license-finder