
Store metadata information when uploading a PyPi package

What does this MR do and why?

This MR resolves the second of two issues under epic &11344.

PyPi packages are uploaded using Twine: python3 -m twine upload --repository gitlab dist/*

Twine does the hard work of parsing the metadata for us. When Twine hits the upload endpoint, the request params look like this (as of Twine 4.0.2):

{
 "name"=>"hello-pypi",
 "version"=>"0.0.2",
 "filetype"=>"bdist_wheel",
 "pyversion"=>"py3",
 "metadata_version"=>"2.1",
 "summary"=>"A small example package",
 "home_page"=>"",
 "author"=>"",
 "author_email"=>"Rad Batnag <rbatnag@gitlab.com>",
 "maintainer"=>"",
 "maintainer_email"=>"",
 "license"=>"",
 "description"=>"# README\n",
 "keywords"=>"",
 "classifiers"=>"Programming Language :: Python :: 3",
 "download_url"=>"",
 "comment"=>"",
 "sha256_digest"=>"0404b3516f7a011953a199889f9a2378c3ab00118e3189a69538d051deca047b",
 "project_urls"=>"Bug Tracker, https://github.com/pypa/sampleproject/issues",
 "requires_python"=>">=3.7",
 "description_content_type"=>"text/markdown",
 "md5_digest"=>"e61f4bd4d6ab6d87a5a6bf3793d702eb",
 "blake2_256_digest"=>"02171dc2e8c442dfbb3118e492ed6194ebbfba4d16db6d6ef968bda055c9b9cf",
 ":action"=>"file_upload",
 "protocol_version"=>"1",
 "content.md5"=>"e61f4bd4d6ab6d87a5a6bf3793d702eb",
 "content.name"=>"hello_pypi-0.0.2-py3-none-any.whl",
 "content.path"=>"",
 "content.remote_url"=>
  "https://radamanthus-gdk-packages.s3.amazonaws.com/tmp/uploads/1693301393-40212-0004-3024-92cd798e7e99a5ac7a5f40c46041502d?X-Amz-Expires=15300&X-Amz-Date=20230829T092953Z&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVL2JV4V7HRAXJ4FN%2F20230829%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-SignedHeaders=host&X-Amz-Signature=88222156e481752067c33ced7852215616c481d6b85e11b79ee328bfedfd30bf",
 "content.size"=>"998",
 "content.upload_duration"=>"1.882007875",
 "content.sha512"=>"b9bc9c643b6aa63182066465746aae7882b40c83777bddc90b0d19b62bd1df10aca8a8343912dc39d8685b3f24d2e748810534dd686fcd4950a6ea72c9845f38",
 "content.remote_id"=>"1693301393-40212-0004-3024-92cd798e7e99a5ac7a5f40c46041502d",
 "content.sha256"=>"0404b3516f7a011953a199889f9a2378c3ab00118e3189a69538d051deca047b",
 "content.sha1"=>"a1b1fc08782cc39f5b6e7887dc8b6c2560a7dc2f",
 "content.gitlab-workhorse-upload"=>
  "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ1cGxvYWQiOnsibWQ1IjoiZTYxZjRiZDRkNmFiNmQ4N2E1YTZiZjM3OTNkNzAyZWIiLCJuYW1lIjoiaGVsbG9fcHlwaS0wLjAuMi1weTMtbm9uZS1hbnkud2hsIiwicGF0aCI6IiIsInJlbW90ZV9pZCI6IjE2OTMzMDEzOTMtNDAyMTItMDAwNC0zMDI0LTkyY2Q3OThlN2U5OWE1YWM3YTVmNDBjNDYwNDE1MDJkIiwicmVtb3RlX3VybCI6Imh0dHBzOi8vcmFkYW1hbnRodXMtZ2RrLXBhY2thZ2VzLnMzLmFtYXpvbmF3cy5jb20vdG1wL3VwbG9hZHMvMTY5MzMwMTM5My00MDIxMi0wMDA0LTMwMjQtOTJjZDc5OGU3ZTk5YTVhYzdhNWY0MGM0NjA0MTUwMmQ_WC1BbXotRXhwaXJlcz0xNTMwMFx1MDAyNlgtQW16LURhdGU9MjAyMzA4MjlUMDkyOTUzWlx1MDAyNlgtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2XHUwMDI2WC1BbXotQ3JlZGVudGlhbD1BS0lBVkwySlY0VjdIUkFYSjRGTiUyRjIwMjMwODI5JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3RcdTAwMjZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3RcdTAwMjZYLUFtei1TaWduYXR1cmU9ODgyMjIxNTZlNDgxNzUyMDY3YzMzY2VkNzg1MjIxNTYxNmM0ODFkNmI4NWUxMWI3OWVlMzI4YmZlZGZkMzBiZiIsInNoYTEiOiJhMWIxZmMwODc4MmNjMzlmNWI2ZTc4ODdkYzhiNmMyNTYwYTdkYzJmIiwic2hhMjU2IjoiMDQwNGIzNTE2ZjdhMDExOTUzYTE5OTg4OWY5YTIzNzhjM2FiMDAxMThlMzE4OWE2OTUzOGQwNTFkZWNhMDQ3YiIsInNoYTUxMiI6ImI5YmM5YzY0M2I2YWE2MzE4MjA2NjQ2NTc0NmFhZTc4ODJiNDBjODM3NzdiZGRjOTBiMGQxOWI2MmJkMWRmMTBhY2E4YTgzNDM5MTJkYzM5ZDg2ODViM2YyNGQyZTc0ODgxMDUzNGRkNjg2ZmNkNDk1MGE2ZWE3MmM5ODQ1ZjM4Iiwic2l6ZSI6Ijk5OCIsInVwbG9hZF9kdXJhdGlvbiI6IjEuODgyMDA3ODc1In0sImlzcyI6ImdpdGxhYi13b3JraG9yc2UifQ.Oep9KdSOk--yv3bp-XOofU_aggnI7w1JNWOMrY7ztkI",
 "content"=>
  #<UploadedFile:0x00000002cd41cf10
   @content_type="application/octet-stream",
   @md5="e61f4bd4d6ab6d87a5a6bf3793d702eb",
   @original_filename="hello_pypi-0.0.2-py3-none-any.whl",
   @remote_id="1693301393-40212-0004-3024-92cd798e7e99a5ac7a5f40c46041502d",
   @sha1="a1b1fc08782cc39f5b6e7887dc8b6c2560a7dc2f",
   @sha256="0404b3516f7a011953a199889f9a2378c3ab00118e3189a69538d051deca047b",
   @size=998,
   @upload_duration=1.882007875>,
 "id"=>"28"}
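
To make the shape of the task concrete, here is a minimal plain-Ruby sketch that picks the metadata entries out of a params-like hash. The constant `METADATA_KEYS`, the method `extract_pypi_metadata`, and the exact choice of keys are assumptions for illustration only; the real column set is whatever !131013 introduced, and this is not the code in this MR.

```ruby
# Hypothetical list of metadata keys to keep (assumed, not taken from the MR diff).
METADATA_KEYS = %w[
  metadata_version summary author_email description
  description_content_type keywords requires_python
].freeze

# Returns only the metadata entries from a params-like hash, dropping
# nil or blank values such as the empty "keywords" string Twine sends.
def extract_pypi_metadata(params)
  params
    .slice(*METADATA_KEYS)
    .reject { |_key, value| value.nil? || value.to_s.strip.empty? }
end

# A trimmed-down copy of the params shown above.
params = {
  "name" => "hello-pypi",
  "version" => "0.0.2",
  "metadata_version" => "2.1",
  "summary" => "A small example package",
  "author_email" => "Rad Batnag <rbatnag@gitlab.com>",
  "description" => "# README\n",
  "keywords" => "",
  "requires_python" => ">=3.7",
  "description_content_type" => "text/markdown"
}

metadata = extract_pypi_metadata(params)
```

With the sample above, `metadata` keeps the six populated metadata keys and omits `name`, `version`, and the blank `keywords` entry.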

We already introduced new metadata columns in !131013 (merged).

In this MR, we update the endpoints to capture the metadata and save it to the database.

Solution Details 🚑

  • Modify the upload endpoint to capture the additional metadata columns sent by Twine
  • Modify Packages::PyPi::CreatePackageService to capture the additional metadata columns
  • For backwards compatibility across updates, displaying the additional metadata fields in the UI will be done in a separate MR, scheduled for the milestone after the one that releases this MR
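
The service change in the second bullet can be pictured with a framework-free sketch. The class below is a stand-in and not `Packages::PyPi::CreatePackageService` itself: the `Struct`-based `PypiMetadatumSketch`, the class name `CreatePackageSketch`, the attribute names (including the `requires_python` to `required_python` mapping), and the blank-to-`nil` handling are all assumptions made for illustration.

```ruby
# Stand-in for the database-backed metadata record; attribute names are assumed.
PypiMetadatumSketch = Struct.new(
  :required_python, :metadata_version, :summary, :author_email,
  :description, :description_content_type, :keywords,
  keyword_init: true
)

# Hypothetical stand-in for a create-package service that copies the
# metadata sent by Twine into a metadata record.
class CreatePackageSketch
  def initialize(params)
    @params = params
  end

  # Builds the metadata record from the upload params.
  def execute
    PypiMetadatumSketch.new(
      required_python: presence("requires_python"),
      metadata_version: presence("metadata_version"),
      summary: presence("summary"),
      author_email: presence("author_email"),
      description: presence("description"),
      description_content_type: presence("description_content_type"),
      keywords: presence("keywords")
    )
  end

  private

  # Returns the param value, or nil when it is missing or blank.
  def presence(key)
    value = @params[key]
    value.nil? || value.to_s.strip.empty? ? nil : value
  end
end

metadatum = CreatePackageSketch.new(
  "requires_python" => ">=3.7",
  "metadata_version" => "2.1",
  "summary" => "A small example package",
  "author_email" => "Rad Batnag <rbatnag@gitlab.com>",
  "description" => "# README\n",
  "description_content_type" => "text/markdown",
  "keywords" => ""
).execute
```

Storing blank values as `nil` is a design choice of this sketch only; it keeps empty strings sent by Twine from being mistaken for real metadata.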

Screenshots or screen recordings

N/A. There are no UI changes in this MR; the UI changes are in the follow-up MR !136073 (merged).

How to set up and validate locally

  1. Install the prerequisites
  2. Prepare a Python package following our guide, or download this barebones project
  3. Build the package. From the package directory, run python3 -m build
  4. Set up authentication following our guide
  5. Upload the package. From the package directory, run python3 -m twine upload --verbose --repository gitlab dist/*. NOTE: The upload will fail if you try to upload a version that already exists. Delete the existing version in Package Registry, or build and upload a different version.
  6. Verify that the package metadata was populated. From a Rails console, run ::Packages::Package.last.pypi_metadatum. The author and description fields from pyproject.toml should have been used to populate the PyPi metadata.
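
The final verification step can also be scripted. The helper below is a sketch under stated assumptions: `missing_metadata_fields` and `EXPECTED_FIELDS` are hypothetical names, and the field names `author_email` and `description` are assumed to match the columns on the metadata record. It works on a plain hash, so in a Rails console you could pass it the result of calling `attributes` on the record; here it runs against sample data.

```ruby
# Hypothetical list of fields expected to be populated after the upload.
EXPECTED_FIELDS = %w[author_email description].freeze

# Returns the names of expected fields that are nil or blank
# in the given attributes hash.
def missing_metadata_fields(attributes, expected = EXPECTED_FIELDS)
  expected.select do |field|
    value = attributes[field]
    value.nil? || value.to_s.strip.empty?
  end
end

# Sample data standing in for the attributes of the metadata record.
sample_attributes = {
  "author_email" => "Rad Batnag <rbatnag@gitlab.com>",
  "description" => "# README\n",
  "keywords" => nil
}

missing = missing_metadata_fields(sample_attributes)
puts missing.empty? ? "metadata populated" : "missing: #{missing.join(', ')}"
```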

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #423534 (closed)

Edited by Radamanthus Batnag
