Strip leading whitespace from PyPI simple index output

🌱 Context

When downloading a Python package with pip install, pip parses the response of the simple index endpoint. Before parsing, it checks whether the response is a valid HTML5 document. If pip sees signs that it is not, it currently prints a deprecation warning; according to that warning, pip 22.2 will enforce the check, at which point pip will fail to install the Python package.

Our PyPI simple endpoint produces valid HTML5, but pip does not recognize it as such. Here's how pip checks whether the endpoint output is valid HTML5:

if actual_start.decode(encoding).lower() != "<!doctype html>":

As you can see, pip does not strip leading whitespace before checking that the document starts with the DOCTYPE HTML declaration. Unfortunately, our endpoint response starts with leading whitespace.
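To make the failure mode concrete, here is a minimal sketch of that comparison. It is a simplified stand-in, not pip's actual code: the function name `starts_with_doctype`, the UTF-8 default, and the sample pages are all assumptions made for illustration.

```python
# Simplified stand-in for the doctype comparison quoted above.
# `starts_with_doctype` and the sample pages are hypothetical.
DOCTYPE = "<!doctype html>"


def starts_with_doctype(body: bytes, encoding: str = "utf-8") -> bool:
    """Return True if the first bytes of `body` are the HTML5 doctype."""
    # Only the first len(DOCTYPE) bytes are inspected, with no stripping.
    actual_start = body[: len(DOCTYPE)]
    return actual_start.decode(encoding).lower() == DOCTYPE


page = b"<!DOCTYPE html>\n<html><body></body></html>"
padded = b"\n" + page  # assumed response with one leading newline

print(starts_with_doctype(page))             # True
print(starts_with_doctype(padded))           # False
print(starts_with_doctype(padded.lstrip()))  # True
```

A single leading newline is enough to shift the doctype out of the inspected prefix, so stripping the leading whitespace on the server side avoids the mismatch.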

What does this MR do and why?

This MR strips the leading whitespace from the endpoint output to make pip happy.

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

No UI changes 🌈

How to set up and validate locally

🚧 Prerequisites

🏎 Testing the changes

  1. Download a PyPI package: pip install -U <packagename> --index-url http://__token__:<personal-access-token>@gdk.test:3000/api/v4/projects/<project_id>/packages/pypi/simple
  2. To re-test and download the package again, you'll need to uninstall it first: pip uninstall <packagename>

Expected results:

master: DEPRECATION warning is shown

Looking in indexes: http://__token__:****@gdk.test:3000/api/v4/projects/7/packages/pypi/simple
DEPRECATION: The HTML index page being used (http://gdk.test:3000/api/v4/projects/7/packages/pypi/simple/mypypipackage/) is not a proper HTML 5 document. This is in violation of PEP 503 which requires these pages to be well-formed HTML 5 documents. Please reach out to the owners of this index page, and ask them to update this index page to a valid HTML 5 document. pip 22.2 will enforce this behaviour change. Discussion can be found at https://github.com/pypa/pip/issues/10825
Collecting mypypipackage
  Downloading http://gdk.test:3000/api/v4/projects/7/packages/pypi/files/ffaad5cbff26e94129e43f122ee3b686d3c05abb99b47a6c409f02e0042e8df2/mypypipackage-0.0.1-py3-none-any.whl (1.6 kB)
Installing collected packages: mypypipackage
Successfully installed mypypipackage-0.0.1
Reshimming asdf python...

MR: no DEPRECATION warning is shown

Looking in indexes: http://__token__:****@gdk.test:3000/api/v4/projects/7/packages/pypi/simple
Collecting mypypipackage
  Downloading http://gdk.test:3000/api/v4/projects/7/packages/pypi/files/ffaad5cbff26e94129e43f122ee3b686d3c05abb99b47a6c409f02e0042e8df2/mypypipackage-0.0.1-py3-none-any.whl (1.6 kB)
Installing collected packages: mypypipackage
Successfully installed mypypipackage-0.0.1
Reshimming asdf python...

Related to #471355 (closed)

Edited by Radamanthus Batnag