
Gem package processing and metadata extraction

Steve Abrams requested to merge 301175-gemfile-extraction-service into master

🏛 Context

This is the last step in adding the ability to upload Ruby gems to the GitLab package registry. Currently, users can upload gems, but each gem arrives with a generic name and must be processed after upload to extract its name, version, dependencies, and any other metadata that we might want to display to users.

The overall RubyGems feature is being tracked here.

🔍 What does this MR do?

  • Adds a set of services that process the gem by doing the following:
    • Extract and store metadata
    • Save the gemspec file as its own package file
    • Extract and create the dependencies associated with the package
    • Update the package name and version, and the package file name
  • Adds a worker to kick off these services when a gem is pushed
  • Updates the Grape API to start this worker when a gem is pushed
  • Updates the packages_rubygems_metadata table to expand the length constraint on the metadata column so that it can hold real data.
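
As a rough illustration of the extraction steps listed above, here is a minimal sketch that uses the RubyGems library bundled with Ruby. It is not the service code from this MR: the helper name `extract_gem_metadata` and the shape of the returned hash are hypothetical and chosen only for this example.

```ruby
require 'rubygems/package'

# Hypothetical helper for illustration only; the services in this MR
# are structured differently and persist the data to the database.
def extract_gem_metadata(gem_path)
  # Gem::Package reads the archive and exposes the embedded specification.
  spec = Gem::Package.new(gem_path).spec

  {
    name: spec.name,
    version: spec.version.to_s,
    summary: spec.summary,
    authors: spec.authors,
    # The real file name, derived from name and version, replaces the generic one.
    package_file_name: spec.file_name,
    # One entry per declared dependency.
    dependencies: spec.dependencies.map do |dep|
      { name: dep.name, requirement: dep.requirement.to_s, type: dep.type }
    end,
    # The gemspec rendered as Ruby source, suitable for storing as its own file.
    gemspec_file_name: "#{spec.name}.gemspec",
    gemspec_content: spec.to_ruby
  }
end
```

Calling `extract_gem_metadata('my-test-gem-0.0.1.gem')` on a built gem returns the values needed to rename the package, save the gemspec, and create dependency records.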

Note: The API is behind a feature flag.

🚫 What this MR does not do

  • With the exception of the package name and version, the extracted metadata and dependencies are not yet visible in the UI. That will be handled in a separate MR.
  • The RubyGems API is behind a feature flag, so no documentation is added. If you are curious how this fits in with the larger feature, there is a draft of the documentation here.

🐘 Database

Up migration:

== 20210223230600 UpdateRubygemsMetadataMetadata: migrating ===================
-- execute("ALTER TABLE packages_rubygems_metadata\nDROP CONSTRAINT IF EXISTS check_ea02f4800f\n")
   -> 0.0014s
-- transaction_open?()
   -> 0.0000s
-- current_schema()
   -> 0.0002s
-- execute("ALTER TABLE packages_rubygems_metadata\nADD CONSTRAINT check_ea02f4800f\nCHECK ( char_length(metadata) <= 30000 )\nNOT VALID;\n")
   -> 0.0012s
-- current_schema()
   -> 0.0002s
-- execute("SET statement_timeout TO 0")
   -> 0.0006s
-- execute("ALTER TABLE packages_rubygems_metadata VALIDATE CONSTRAINT check_ea02f4800f;")
   -> 0.0015s
-- execute("RESET ALL")
   -> 0.0009s
== 20210223230600 UpdateRubygemsMetadataMetadata: migrated (0.0197s) ==========

Down migration:

== 20210223230600 UpdateRubygemsMetadataMetadata: reverting ===================
-- execute("ALTER TABLE packages_rubygems_metadata\nDROP CONSTRAINT IF EXISTS check_ea02f4800f\n")
   -> 0.0031s
-- transaction_open?()
   -> 0.0000s
-- current_schema()
   -> 0.0004s
-- execute("ALTER TABLE packages_rubygems_metadata\nADD CONSTRAINT check_ea02f4800f\nCHECK ( char_length(metadata) <= 255 )\nNOT VALID;\n")
   -> 0.0084s
== 20210223230600 UpdateRubygemsMetadataMetadata: reverted (0.0406s) ==========

💻 How to test this feature?

  1. Pull the branch and run the migration.
  2. In a Rails console, turn on the feature flag:
Feature.enable(:rubygem_packages)
  3. Create a new gem:

a. First create a folder where you will hold your gem:

mkdir my-test-gem && cd my-test-gem

b. Create an app structure. The file itself can be blank for testing, or feel free to add some content if you'd like:

mkdir lib
touch lib/my_test_gem.rb

c. Create a gemspec file:

touch my-test-gem.gemspec
vim my-test-gem.gemspec


d. Inside of the gemspec file:

```ruby
Gem::Specification.new do |s|
  s.name = 'my-test-gem'
  s.author = 'GitLab Tanuki'
  s.version = '0.0.1'
  s.summary = 'My test ruby gem'
  s.files = ['lib/my_test_gem.rb']
  s.require_paths = ['lib']

  s.description = 'A test package for GitLab.'
  s.email = 'tanuki@not_real.com'

  s.add_dependency 'dependency_1', '~> 1.2.3'
end
```

e. Build your package:

gem build my-test-gem.gemspec

f. You should now see my-test-gem-0.0.1.gem in your directory.
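
Before pushing, it can be useful to confirm that the built gem actually contains the files listed in the gemspec. The following is a small sketch using `Gem::Package#contents` from the RubyGems library; the helper name `missing_gem_files` is hypothetical and not part of this MR.

```ruby
require 'rubygems/package'

# Hypothetical helper: returns the expected files that are absent
# from the built gem archive (an empty array means nothing is missing).
def missing_gem_files(gem_path, expected_files)
  packaged = Gem::Package.new(gem_path).contents
  expected_files - packaged
end
```

For the gem built above, `missing_gem_files('my-test-gem-0.0.1.gem', ['lib/my_test_gem.rb'])` should return an empty array.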

g. Get a personal access token with API scope, then in ~/.gem/credentials add:

---
http://localhost:3000/api/v4/projects/59/packages/rubygems: '309hwiofndfsenfose'

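Replace the host, project ID, and PAT with the values relevant to your local setup. Use the same host here as in the push command in the next step, because the gem client looks up the token by host.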
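
If you prefer to script this step, the sketch below adds an entry to a credentials file while keeping any existing keys. The helper name `write_gem_credentials` is hypothetical. The file is YAML, and the sketch sets its permissions to 0600 because the gem client refuses to use a credentials file with looser permissions.

```ruby
require 'yaml'
require 'fileutils'

# Hypothetical helper: adds or updates one host => token entry.
def write_gem_credentials(path, host, token)
  FileUtils.mkdir_p(File.dirname(path))

  # Existing files often contain symbol keys such as :rubygems_api_key,
  # so Symbol has to be permitted when loading.
  existing =
    if File.exist?(path)
      YAML.safe_load(File.read(path), permitted_classes: [Symbol]) || {}
    else
      {}
    end

  existing[host] = token
  File.write(path, YAML.dump(existing))
  File.chmod(0o600, path)
  path
end
```

For example, `write_gem_credentials(File.expand_path('~/.gem/credentials'), 'http://localhost:3000/api/v4/projects/59/packages/rubygems', '309hwiofndfsenfose')` would write the entry shown above.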

h. Now you can push the gem. From the my-test-gem/ directory:

gem push my-test-gem-0.0.1.gem --host http://gdk.test:3000/api/v4/projects/59/packages/rubygems
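
For debugging, it can help to know roughly what `gem push` sends. To my understanding, the gem client appends `api/v1/gems` to the host and POSTs the raw gem bytes with the token in the `Authorization` header. The sketch below only builds such a request and does not send it. The helper name `build_gem_push_request` is hypothetical, and the resulting endpoint path is an assumption based on that client behavior, not something confirmed by this MR description.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds (but does not send) a request that
# approximates what `gem push --host HOST` issues.
def build_gem_push_request(host, token, gem_bytes)
  # Assumption: the client appends api/v1/gems to the configured host.
  uri = URI("#{host.chomp('/')}/api/v1/gems")

  request = Net::HTTP::Post.new(uri)
  request['Authorization'] = token
  request['Content-Type'] = 'application/octet-stream'
  request.body = gem_bytes
  request
end
```

To actually send it, you would pass the request to `Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }` with a `uri` built the same way.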

i. You should see a 201 Created message. Navigate to your project in the UI, open Packages & Registries > Package Registry, and you should see your package.

j. You can also see the rest of the data via the rails console:

pry(main)> gem = Packages::Package.last
pry(main)> gem.package_files
pry(main)> gem.dependency_links
pry(main)> gem.rubygems_metadatum

Troubleshooting

Make sure Sidekiq is running and processing background jobs. It may take a minute or two for the package to appear in the UI.

📸 Screenshots (strongly suggested)

→ gem push package-0.0.3.gem --host http://gdk.test:3001/api/v4/projects/59/packages/rubygems
Pushing gem to http://gdk.test:3001/api/v4/projects/59/packages/rubygems...
{"message":"201 Created"}
Package registry index view (screenshot: Screen_Shot_2021-03-01_at_1.12.02_PM)
Package detail view (screenshot: Screen_Shot_2021-03-01_at_1.11.23_PM)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • [-] Label as security and @ mention @gitlab-com/gl-security/appsec
  • [-] The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • [-] Security reports checked/validated by a reviewer from the AppSec team

Related to #301175 (closed)

Edited by Steve Abrams
