
Increase packages_pypi_metadata.keywords text limit

What does this MR do and why?

We recently started uploading more PyPI metadata fields in !131327 (merged).

Unfortunately, that change put a 255-character limit on the keywords field, and packages with many keywords exceed it. When a PyPI package with a large keywords array in pyproject.toml is uploaded, we return an error response:

Uploading python_benedict-0.33.1-py3-none-any.whl
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.9/97.9 kB • 00:00 • 133.5 MB/s
INFO     Response from http://gdk.test:3000/api/v4/projects/47/packages/pypi:                                                                                                                               
         400 Bad Request                                                                                                                                                                                    
INFO     {"message":"400 Bad request - Validation failed: Keywords is too long (maximum is 255 characters)"}                                                                                                
ERROR    HTTPError: 400 Bad Request from http://gdk.test:3000/api/v4/projects/47/packages/pypi                                                                                                              
         Bad Request   

This MR does two things:

  • Increases the keywords limit to 1024 characters, to accommodate PyPI packages with many keywords.
  • Truncates keywords that exceed 1024 characters, so that PyPI packages that hit the limit can still be uploaded, with the truncated keywords stored in the metadata.
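As a rough illustration of the truncation behaviour described above, here is a minimal plain-Ruby sketch. The constant and method names are hypothetical and are not taken from the MR's code; the only thing assumed from the description is that keywords longer than 1024 characters are cut down to 1024 characters.

```ruby
# Hypothetical constant mirroring the new 1024-character limit.
MAX_KEYWORDS_LENGTH = 1024

# Returns the keywords string cut to at most MAX_KEYWORDS_LENGTH characters.
# The slice counts characters, not bytes, which matches a char_length check.
# nil is passed through unchanged.
def truncate_keywords(keywords)
  return keywords if keywords.nil?

  keywords[0, MAX_KEYWORDS_LENGTH]
end

puts truncate_keywords("dict,json,yaml")
puts truncate_keywords("k" * 2000).length
```

Strings already within the limit come back unchanged, so the helper is safe to call on every upload.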

Database migration logs

up
main: == [advisory_lock_connection] object_id: 111600, pg_backend_pid: 84139
main: == 20240219135601 UpdatePypiMetadataKeywodsCheckConstraint: migrating =========
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- execute("ALTER TABLE packages_pypi_metadata\nADD CONSTRAINT check_222e4f5b58\nCHECK ( char_length(keywords) <= 1024 )\nNOT VALID;\n")
main:    -> 0.0017s
main: -- execute("SET statement_timeout TO 0")
main:    -> 0.0003s
main: -- execute("ALTER TABLE packages_pypi_metadata VALIDATE CONSTRAINT check_222e4f5b58;")
main:    -> 0.0010s
main: -- execute("RESET statement_timeout")
main:    -> 0.0003s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- execute("            ALTER TABLE packages_pypi_metadata\n            DROP CONSTRAINT IF EXISTS check_02be2c39af\n")
main:    -> 0.0009s
main: == 20240219135601 UpdatePypiMetadataKeywodsCheckConstraint: migrated (0.0648s) 

main: == [advisory_lock_connection] object_id: 111600, pg_backend_pid: 84139
ci: == [advisory_lock_connection] object_id: 111960, pg_backend_pid: 84141
ci: == 20240219135601 UpdatePypiMetadataKeywodsCheckConstraint: migrating =========
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- execute("ALTER TABLE packages_pypi_metadata\nADD CONSTRAINT check_222e4f5b58\nCHECK ( char_length(keywords) <= 1024 )\nNOT VALID;\n")
ci:    -> 0.0017s
ci: -- execute("SET statement_timeout TO 0")
ci:    -> 0.0002s
ci: -- execute("ALTER TABLE packages_pypi_metadata VALIDATE CONSTRAINT check_222e4f5b58;")
ci:    -> 0.0004s
ci: -- execute("RESET statement_timeout")
ci:    -> 0.0002s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- execute("            ALTER TABLE packages_pypi_metadata\n            DROP CONSTRAINT IF EXISTS check_02be2c39af\n")
ci:    -> 0.0005s
ci: == 20240219135601 UpdatePypiMetadataKeywodsCheckConstraint: migrated (0.0239s) 

ci: == [advisory_lock_connection] object_id: 111960, pg_backend_pid: 84141
main: == [advisory_lock_connection] object_id: 112180, pg_backend_pid: 84144
main: == 20240222000000 RemovePackagesProtectionRulesPackageNamePatternIlikeQueryColumn: migrating 
main: -- column_exists?(:packages_protection_rules, :package_name_pattern_ilike_query)
main:    -> 0.0037s
main: == 20240222000000 RemovePackagesProtectionRulesPackageNamePatternIlikeQueryColumn: migrated (0.0105s) 

main: == [advisory_lock_connection] object_id: 112180, pg_backend_pid: 84144
ci: == [advisory_lock_connection] object_id: 112260, pg_backend_pid: 84146
ci: == 20240222000000 RemovePackagesProtectionRulesPackageNamePatternIlikeQueryColumn: migrating 
ci: -- column_exists?(:packages_protection_rules, :package_name_pattern_ilike_query)
ci:    -> 0.0044s
ci: == 20240222000000 RemovePackagesProtectionRulesPackageNamePatternIlikeQueryColumn: migrated (0.0205s) 

ci: == [advisory_lock_connection] object_id: 112260, pg_backend_pid: 84146
down
main: == [advisory_lock_connection] object_id: 117760, pg_backend_pid: 87014
main: == 20240214204805 MakeFindingIdNotNull: reverting =============================
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- transaction_open?(nil)
main:    -> 0.0000s
main: -- execute("            ALTER TABLE vulnerabilities\n            DROP CONSTRAINT IF EXISTS check_4d8a873f1f\n")
main:    -> 0.0017s
main: == 20240214204805 MakeFindingIdNotNull: reverted (0.0165s) ====================

main: == [advisory_lock_connection] object_id: 117760, pg_backend_pid: 87014
ci: == [advisory_lock_connection] object_id: 117820, pg_backend_pid: 87399
ci: == 20240214204805 MakeFindingIdNotNull: reverting =============================
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- transaction_open?(nil)
ci:    -> 0.0000s
ci: -- execute("            ALTER TABLE vulnerabilities\n            DROP CONSTRAINT IF EXISTS check_4d8a873f1f\n")
ci:    -> 0.0013s
ci: == 20240214204805 MakeFindingIdNotNull: reverted (0.0201s) ====================

ci: == [advisory_lock_connection] object_id: 117820, pg_backend_pid: 87399
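
The up log for UpdatePypiMetadataKeywodsCheckConstraint shows a three-step order: add the new check constraint as NOT VALID, validate it, then drop the constraint it replaces. The sketch below only rebuilds those SQL statements in plain Ruby to make the ordering explicit. The method name is hypothetical, it does not use GitLab's migration helpers, and it is not the code from this MR. That check_02be2c39af is the earlier 255-character check is an assumption based on the log.

```ruby
# Builds the statements in the same order as the up log:
#   1. add the new constraint without validating existing rows
#   2. validate it against existing rows
#   3. drop the constraint being replaced
# Hypothetical helper for illustration only.
def swap_length_constraint_sql(table:, column:, new_name:, old_name:, limit:)
  [
    "ALTER TABLE #{table} ADD CONSTRAINT #{new_name} " \
      "CHECK ( char_length(#{column}) <= #{limit} ) NOT VALID;",
    "ALTER TABLE #{table} VALIDATE CONSTRAINT #{new_name};",
    "ALTER TABLE #{table} DROP CONSTRAINT IF EXISTS #{old_name};"
  ]
end

# Table, column, and constraint names are copied from the log above.
statements = swap_length_constraint_sql(
  table: "packages_pypi_metadata",
  column: "keywords",
  new_name: "check_222e4f5b58",
  old_name: "check_02be2c39af",
  limit: 1024
)
puts statements
```

Adding the wider constraint before dropping the old one means the column always has a length check in place while the migration runs.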

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

No UI changes 🌈

How to set up and validate locally

A. Verify that a package with keywords longer than 255 characters can be uploaded

  1. Install the prerequisites.
  2. Clone this python package.
  3. Build the package. From the package directory, run python3 -m build
  4. Set up authentication following our guide.
  5. Upload the package. From the package directory, run python3 -m twine upload --verbose --repository gitlab dist/*. NOTE: The upload fails if you try to upload a version that already exists. Delete the existing version in the Package Registry, or build and upload a different version.

Expected response when running the MR branch:

Uploading python_benedict-0.33.1-py3-none-any.whl
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.9/97.9 kB • 00:00 • 134.5 MB/s
INFO     Response from http://gdk.test:3000/api/v4/projects/47/packages/pypi:                                                                                                                               
         201 Created                                                                                                                                                                                        
INFO     {"message":"201 Created"}                                                                                                                                                                          
Uploading python-benedict-0.33.1.tar.gz
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.8/99.8 kB • 00:00 • 138.6 MB/s
INFO     Response from http://gdk.test:3000/api/v4/projects/47/packages/pypi:                                                                                                                               
         201 Created                                                                                                                                                                                        
INFO     {"message":"201 Created"}  
  6. Verify that the package metadata was populated. From a Rails console, run ::Packages::Package.last.pypi_metadatum.keywords. The metadatum record should have been created, with the keywords field set to the contents of the keywords key in pyproject.toml.

In the master branch, the upload will fail:

Uploading python_benedict-0.33.1-py3-none-any.whl
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.9/97.9 kB • 00:00 • 133.5 MB/s
INFO     Response from http://gdk.test:3000/api/v4/projects/47/packages/pypi:                                                                                                                               
         400 Bad Request                                                                                                                                                                                    
INFO     {"message":"400 Bad request - Validation failed: Keywords is too long (maximum is 255 characters)"}                                                                                                
ERROR    HTTPError: 400 Bad Request from http://gdk.test:3000/api/v4/projects/47/packages/pypi                                                                                                              
         Bad Request        

B. Verify that a package with keywords longer than 1024 characters can be uploaded

  1. Modify the pyproject.toml file from the previous section. Add keywords until the keywords array, when converted to a string, is longer than 1024 characters. Duplicate keywords are allowed, so you can simply copy and paste lines 12-63 of pyproject.toml repeatedly into the keywords array.
  2. Modify the value of __version__ in benedict/metadata.py
  3. Clean up the previous build artifacts: rm dist/*
  4. Rebuild the package: python3 -m build
  5. Upload the new package version: python3 -m twine upload --verbose --repository gitlab dist/*
  6. Verify that the keywords field was populated and truncated. From a Rails console, run ::Packages::Package.last.pypi_metadatum.keywords.length.
[3] pry(main)> ::Packages::Package.last.pypi_metadatum.keywords.length
=> 1024
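
If pasting lines by hand is tedious, a keywords array that is long enough for step 1 can be generated instead. The following is a minimal sketch, not part of the MR: the method name and the sample keywords are placeholders, and joining with "," is an assumption about how the keywords array is converted to a string.

```ruby
# Repeats the given keywords until their joined form is longer than `limit`.
# The "," separator is an assumption; adjust it if your build backend
# serializes the array differently.
def pad_keywords(keywords, limit: 1024, separator: ",")
  raise ArgumentError, "keywords must not be empty" if keywords.empty?

  padded = keywords.dup
  padded.concat(keywords) while padded.join(separator).length <= limit
  padded
end

# Placeholder keywords; substitute the real ones from pyproject.toml.
padded = pad_keywords(%w[dict json yaml toml])

# Array#inspect prints double-quoted strings in square brackets, which can be
# pasted as the value of the keywords key.
puts "keywords = #{padded.inspect}"
```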

Related to #440402 (closed)

Edited by Radamanthus Batnag
