
Create packages_terraform_module_metadata table & corresponding model

What does this MR do and why?

This is the first MR of the work needed to support Provide module documentation in the Terraform r... (&12408).

When a new Terraform module is published to the Terraform Registry, we upload the module archive to object storage and do nothing else with it. We now need to display some information about the module on its UI page, which means extracting that information from the files embedded in the uploaded archive.

A Terraform module's file structure can look like this:

$ tree complete-module/
.
├── README.md
├── main.tf
├── variables.tf
├── outputs.tf
├── ...
├── modules/
│   ├── nestedA/
│   │   ├── README.md
│   │   ├── variables.tf
│   │   ├── main.tf
│   │   ├── outputs.tf
│   ├── nestedB/
│   ├── .../
├── examples/
│   ├── exampleA/
│   │   ├── main.tf
│   ├── exampleB/
│   ├── .../

We can notice three categories:

  • The root module: consists of the files in the root of the module (README.md, main.tf, variables.tf, outputs.tf, etc.).
  • The modules subdirectory: contains the nested submodules. Each submodule consists of the same file structure as the root module.
  • The examples subdirectory: contains examples of how to use the module.
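
As a rough illustration of these three categories, the sketch below sorts archive paths into root, submodule, and example groups. It is not the extraction logic of this MR (which only adds the table and model); `classify_module_path` and the `:other` fallback are hypothetical names, and the sketch assumes paths are relative to the module root with no leading `./`.

```ruby
# Hypothetical helper: maps a path inside a module archive to one of the
# three categories described above. Assumes paths relative to the module root.
def classify_module_path(path)
  parts = path.split('/')

  if parts.length == 1
    # A file directly in the module root, e.g. "main.tf"
    { category: :root, name: nil }
  elsif parts.length >= 3 && parts[0] == 'modules'
    # e.g. "modules/nestedA/variables.tf" belongs to submodule "nestedA"
    { category: :submodule, name: parts[1] }
  elsif parts.length >= 3 && parts[0] == 'examples'
    # e.g. "examples/exampleA/main.tf" belongs to example "exampleA"
    { category: :example, name: parts[1] }
  else
    # Anything else (unknown directories, bare directory entries)
    { category: :other, name: nil }
  end
end

paths = %w[
  README.md
  main.tf
  modules/nestedA/README.md
  modules/nestedA/variables.tf
  examples/exampleA/main.tf
]

# Group the file paths by the category (and name) they belong to.
grouped = paths.group_by { |path| classify_module_path(path) }
```

Grouping by category and name gives one bucket of files per root module, submodule, and example, which is the unit the metadata would be collected for.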

When we look at how the HashiCorp Terraform registry displays a module's documentation, we can see that all three categories above are represented.

So our goal is to collect the metadata of the root module, the submodules & the examples and display them in the UI.

To do so, the first step is to persist the metadata in the database. In this MR, we create the database table and the corresponding Rails model.

The table structure is simple:

  • packages_id column: Each package has one row in the metadata table, so packages_id serves as the table's primary key.

  • project_id column: This is now required for all cell-local tables.

  • metadata column: a jsonb column that holds all extracted metadata. I chose this approach over a separate column for each type of metadata for the following reasons:

    • Having a column for each metadatum means we would need a separate metadata record for each category in the module: one record for the root module, and several records for the submodules & examples. This design would complicate the table because we would need to distinguish between the categories, so we would need something like an enum column to hold the category value (root, submodule, example) and another column for the name of the submodule or example. Unique indexes would also be required to prevent duplicate records.

    • The three categories don't have the same metadata structure: the root module & submodules usually have Readme, Inputs, Outputs, Dependencies & Resources, while examples only have Readme, Inputs & Outputs. That means we will always have empty fields for Dependencies & Resources in the case of the examples rows.

    • Fetching or inserting a single row takes one query and one record, which is simpler and cheaper than fetching or inserting multiple metadata rows per package.

    • The metadata is displayed in the UI as a whole, so it's unlikely that we would need to query specific fields from it.
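
To make the single-document design more concrete, here is a minimal sketch of what one package's metadata might look like when stored in the jsonb column. The key names (`root`, `submodules`, `examples`, `readme`, `inputs`, and so on) and all values are assumptions for illustration; this MR does not define the final structure.

```ruby
require 'json'

# Hypothetical metadata document for one package. Key names and values are
# illustrative only and are not taken from the actual implementation.
metadata = {
  'root' => {
    'readme' => '# complete-module',
    'inputs' => [{ 'name' => 'region', 'type' => 'string' }],
    'outputs' => [{ 'name' => 'instance_id' }],
    'dependencies' => { 'providers' => ['null'] },
    'resources' => ['null_resource.example']
  },
  'submodules' => {
    'nestedA' => {
      'readme' => '# nestedA',
      'inputs' => [],
      'outputs' => [],
      'dependencies' => { 'providers' => [] },
      'resources' => []
    }
  },
  # Examples carry only Readme, Inputs & Outputs, so the Dependencies and
  # Resources keys are simply absent rather than stored as empty columns.
  'examples' => {
    'exampleA' => {
      'readme' => '# exampleA',
      'inputs' => [],
      'outputs' => []
    }
  }
}

# The whole document is serialized once and would live in a single row.
json = JSON.generate(metadata)
parsed = JSON.parse(json)
```

With this shape, the UI can read the full document for a package in one fetch, and the categories can differ in structure without leaving empty fields behind.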

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.


How to set up and validate locally

  1. In the Rails console, create a new Packages::TerraformModule::Metadatum database record:
    # stub file upload
    def fixture_file_upload(*args, **kwargs)
      Rack::Test::UploadedFile.new(*args, **kwargs)
    end
    
    FactoryBot.create(:terraform_module_metadatum)
  2. The record should be created without errors.

Related to #438056 (closed)

Edited by Moaz Khalifa
