Parse and create terraform module metadata (!148569)

Moaz Khalifa requested to merge 438058-parse-and-save-terraform-module-metadata into master Apr 03, 2024

Context

This MR is the third one, after !141914 (merged) and !145020 (merged), in the effort to implement Provide module documentation in the Terraform r... (&12408).

At a high level, the implementation of &12408 can be broken down into the following steps:

1. Create a new database table to persist the extracted documentation data. (Done in !141914 (merged))
2. Introduce a background worker that's responsible for triggering a service (or multiple services) to index the terraform module file. Indexing consists of several actions:
   - Extraction of the readme file for the module, its submodules, and its examples.
   - Processing of other files to generate additional documentation data such as inputs, outputs, dependencies & resources.
   - Persistent storage of the readme files and other extracted data in the newly created database table.
3. Integrate the persisted documentation data into the response sent to the frontend.
4. Implement the necessary adjustments in the UI to appropriately display the documentation.

This MR implements the second part of step 2: parsing the needed metadata from terraform module archives and then creating the metadata records in the database.

Here's a demonstration of the flow:

  %%{init: {"flowchart": {"htmlLabels": false}} }%%
  flowchart TB
    subgraph "Done In https://gitlab.com/gitlab-org/gitlab/-/merge_requests/145020"

    a("Packages::TerraformModule::CreatePackageService") -- "`if **index_terraform_module_archive** feature flag enabled`" --> b("Packages::TerraformModule::ProcessPackageFileWorker")
    b("Packages::TerraformModule::ProcessPackageFileWorker") --> c("Packages::TerraformModule::ProcessPackageFileService")
    c("Packages::TerraformModule::ProcessPackageFileService") --> d("Packages::TerraformModule::Metadata::ExtractFilesService") -- "`parsed metadata`" --> c("Packages::TerraformModule::ProcessPackageFileService")
    end
    subgraph "In this MR"
    
    d("Packages::TerraformModule::Metadata::ExtractFilesService") --> e("Packages::TerraformModule::Metadata::ParseFileService") -- "`parsed metadata`" --> d("Packages::TerraformModule::Metadata::ExtractFilesService")
    c("Packages::TerraformModule::ProcessPackageFileService") -- "`parsed metadata`" --> f("Packages::TerraformModule::Metadata::CreateService")
    end
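
As a rough illustration of the aggregation step in the flow above, here is a minimal Ruby sketch. It is not the code in ExtractFilesService: the `aggregate` helper, the `:root` / `:submodules` / `:examples` keys, and the shape of the per-file metadata are all assumptions made for this example. It relies only on the standard Terraform module layout, where submodules live under `modules/<name>/` and examples under `examples/<name>/`, and it assumes paths are relative to the module root.

```ruby
# Hypothetical sketch: merges per-file metadata into one hash.
# `parsed_files` is a list of [relative_path, metadata_hash] pairs.
def aggregate(parsed_files)
  result = { root: {}, submodules: {}, examples: {} }

  parsed_files.each do |path, metadata|
    # Pick the bucket this file belongs to, based on its directory.
    target =
      case path
      when %r{\Amodules/([^/]+)/[^/]+\z}
        result[:submodules][Regexp.last_match(1)] ||= {}
      when %r{\Aexamples/([^/]+)/[^/]+\z}
        result[:examples][Regexp.last_match(1)] ||= {}
      when %r{\A[^/]+\z}
        result[:root]
      end
    next unless target # ignore files outside the assumed layout

    metadata.each do |key, value|
      if value.is_a?(Array)
        # List-like metadata (inputs, outputs, ...) accumulates across files.
        target[key] = (target[key] || []) + value
      else
        # Scalar metadata (for example a readme string) is simply stored.
        target[key] = value
      end
    end
  end

  result
end
```

Accumulating arrays across files matters because a single module usually spreads its inputs, outputs, and resources over several .tf files.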

What does this MR do and why?

Creates 2 new services: Packages::TerraformModule::Metadata::ParseFileService & Packages::TerraformModule::Metadata::CreateService:
- Packages::TerraformModule::Metadata::ParseFileService receives a single file from the module archive, either a README file or a .tf file. It reads the file line by line (which uses less memory than loading the whole file at once), then extracts and formats the needed metadata and returns it to Packages::TerraformModule::Metadata::ExtractFilesService. That service aggregates the metadata extracted from all files into one hash and returns it to Packages::TerraformModule::ProcessPackageFileService.
- Packages::TerraformModule::ProcessPackageFileService then sends the metadata hash to Packages::TerraformModule::Metadata::CreateService, which persists the Packages::TerraformModule::Metadatum record in the database.
Adds the needed specs.
The whole feature is gated behind the index_terraform_module_archive feature flag.
The logic in Packages::TerraformModule::Metadata::ParseFileService is a bit complex because all .tf files are written in HCL (HashiCorp Configuration Language), and we don't have a ready-to-use Ruby solution for parsing it. So, I tried to make the parsing logic as accurate as possible. I tested it against a wide range of public terraform modules, and it worked fine.
The MR is kinda chunky (sorry about that 🙏), but that's mostly because of the JSON files I had to add to support testing & validating the metadata extracted from the terraform modules.
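
To give a feel for what line-by-line HCL parsing involves, here is a minimal Ruby sketch. It is not the logic in ParseFileService: the `parse_tf` helper, its regular expressions, and the `:inputs` / `:outputs` keys are hypothetical, and it only handles the names and single-line descriptions of top-level `variable` and `output` blocks.

```ruby
require 'stringio'

# Hypothetical patterns for this sketch only.
BLOCK_START = /\A\s*(variable|output)\s+"([^"]+)"\s*\{/
DESCRIPTION = /\A\s*description\s*=\s*"(.*)"\s*\z/

# Reads an IO one line at a time, so the whole file never sits in memory.
def parse_tf(io)
  result = { inputs: [], outputs: [] }
  current = nil # the block currently being filled in, if any
  depth = 0     # brace nesting level before the current line

  io.each_line do |line|
    if depth.zero? && (match = line.match(BLOCK_START))
      key = match[1] == 'variable' ? :inputs : :outputs
      current = { 'name' => match[2] }
      result[key] << current
    elsif depth == 1 && current && (match = line.match(DESCRIPTION))
      # Only descriptions directly inside the block count, not nested ones.
      current['description'] = match[1]
    end

    # Naive nesting tracker: braces inside strings or comments would fool it.
    depth += line.count('{') - line.count('}')
    current = nil if depth.zero?
  end

  result
end
```

For example, `parse_tf(File.open('variables.tf'))` would return the declared variables under `:inputs`. Even this small sketch has to track brace depth, which hints at why a more accurate parser (heredocs, multi-line values, comments, nested blocks) becomes complex quickly.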

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

N/A

How to set up and validate locally

Enable the feature flag in the Rails console:

Feature.enable(:index_terraform_module_archive)

On the HashiCorp Terraform Registry, choose any module to test with, then go to its source code on GitHub and download it as a zip file.
From your terminal, use cURL to publish the module you just downloaded: https://docs.gitlab.com/ee/user/packages/terraform_module_registry/#using-the-api

curl --location --header "PRIVATE-TOKEN: <your_PAT>" --upload-file <module_zip_file> "http://gdk.test:3000/api/v4/projects/<project_id>/packages/terraform/modules/my-module/my-system/0.0.1/file"
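
If you prefer to publish from Ruby instead of cURL, the same upload can be sketched with the standard library's Net::HTTP. The helper names `build_upload_request` and `upload_module` are made up for this example. Unlike `curl --location`, this sketch does not follow redirects.

```ruby
require 'net/http'
require 'uri'

# Builds the PUT request that `curl --upload-file` would send.
# `io` is the opened archive and `size` its length in bytes.
def build_upload_request(url, token, io, size)
  request = Net::HTTP::Put.new(URI(url))
  request['PRIVATE-TOKEN'] = token
  # A streamed body needs an explicit Content-Length.
  request['Content-Length'] = size.to_s
  request.body_stream = io
  request
end

# Streams the archive at `path` to `url` and returns the HTTP response.
def upload_module(url, token, path)
  File.open(path, 'rb') do |file|
    request = build_upload_request(url, token, file, file.size)
    uri = request.uri
    Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
      http.request(request)
    end
  end
end
```

Call `upload_module` with the same URL, personal access token, and zip file path you would pass to the cURL command above.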

After publishing the module successfully, check its newly created metadata in the Rails console:

Packages::TerraformModule::Metadatum.last.fields

To verify that the metadata was extracted correctly, you can compare what you get from Packages::TerraformModule::Metadatum.last.fields with the displayed metadata for the module on its HashiCorp Terraform registry page:

As you can see in the screenshot, this is a module page on the HashiCorp Terraform Registry. Its metadata is displayed under the readme, outputs, inputs, dependencies, and resources tabs. Similar metadata is also available for each submodule and example in the dropdown menus above. Packages::TerraformModule::Metadatum.last.fields should return the same data, with the same structure, that you see under those tabs.
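
Because the returned hash can be large, a quick summary makes the comparison with the registry tabs easier. The `summarize` helper below is a hypothetical convenience for the Rails console, not part of this MR. It assumes only that `fields` is a nested hash of strings, arrays, and hashes, and makes no assumption about specific key names.

```ruby
# Hypothetical console helper: flattens a nested hash into
# "key.path" => short description of what is stored there.
def summarize(value, path = [], out = {})
  case value
  when Hash
    value.each { |key, child| summarize(child, path + [key.to_s], out) }
  when Array
    out[path.join('.')] = "#{value.size} item(s)"
  when String
    out[path.join('.')] = "#{value.length} chars"
  else
    out[path.join('.')] = value.inspect
  end
  out
end
```

In the console, `summarize(Packages::TerraformModule::Metadatum.last.fields)` would then list each section with its item count, which you can check against the number of entries under the corresponding tab.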

Related to #438058 (closed)

Edited Apr 04, 2024 by Moaz Khalifa
