Parse and create Terraform module metadata
Context
This is the 3rd MR, after !141914 (merged) & !145020 (merged), in the effort to implement Provide module documentation in the Terraform r... (&12408).
At a high level, the implementation of &12408 can be broken down into the following steps:
- Creating a new database table to persist the extracted documentation data. (Done in !141914 (merged))
- Introduce a background worker that's responsible for triggering a service (or multiple services) to index the Terraform module file. Indexing consists of several actions:
  - Extraction of the readme file for the module, its submodules, and its examples.
  - Processing of other files to generate additional documentation data such as inputs, outputs, dependencies & resources.
  - Persistent storage of the readme files and other extracted data in the newly created database table.
- Integrate the persisted documentation data into the response sent to the frontend.
- Implement the necessary adjustments in the UI to appropriately display the documentation.
This MR implements the 2nd part of step 2: parsing the needed metadata from Terraform module archives and then creating the metadata records in the database.
Here's a demonstration of the flow:
%%{init: {"flowchart": {"htmlLabels": false}} }%%
flowchart TB
subgraph "Done In https://gitlab.com/gitlab-org/gitlab/-/merge_requests/145020"
a("Packages::TerraformModule::CreatePackageService") -- "`if **index_terraform_module_archive** feature flag enabled`" --> b("Packages::TerraformModule::ProcessPackageFileWorker")
b("Packages::TerraformModule::ProcessPackageFileWorker") --> c("Packages::TerraformModule::ProcessPackageFileService")
c("Packages::TerraformModule::ProcessPackageFileService") --> d("Packages::TerraformModule::Metadata::ExtractFilesService") -- "`parsed metadata`" --> c("Packages::TerraformModule::ProcessPackageFileService")
end
subgraph "In this MR"
d("Packages::TerraformModule::Metadata::ExtractFilesService") --> e("Packages::TerraformModule::Metadata::ParseFileService") -- "`parsed metadata`" --> d("Packages::TerraformModule::Metadata::ExtractFilesService")
c("Packages::TerraformModule::ProcessPackageFileService") -- "`parsed metadata`" --> f("Packages::TerraformModule::Metadata::CreateService")
end
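As a rough illustration of the aggregation step in the flow above, here is a minimal sketch of how per-file metadata could be merged into one hash. It is not the code from this MR: `aggregate_metadata` is a hypothetical name, the `root` / `submodules` / `examples` hash shape is an assumption, and the paths are assumed to be relative to the module root, following Terraform's standard `modules/<name>/` and `examples/<name>/` layout.

```ruby
# Hypothetical sketch: merge per-file metadata into a single hash.
# `parsed_files` is an array of [relative_path, metadata_hash] pairs.
def aggregate_metadata(parsed_files)
  aggregated = { root: {}, submodules: {}, examples: {} }

  parsed_files.each do |path, metadata|
    # Pick the bucket from the file's location inside the archive.
    target =
      case path
      when %r{\Amodules/([^/]+)/}  then aggregated[:submodules][Regexp.last_match(1)] ||= {}
      when %r{\Aexamples/([^/]+)/} then aggregated[:examples][Regexp.last_match(1)] ||= {}
      else aggregated[:root]
      end

    # Lists (for example inputs found in several .tf files) are concatenated;
    # any other value is overwritten by the most recently parsed file.
    target.merge!(metadata) { |_key, old, new| old.is_a?(Array) ? old + new : new }
  end

  aggregated
end

pp aggregate_metadata([
  ['README.md', { readme: '# root module' }],
  ['variables.tf', { inputs: [{ name: 'region' }] }],
  ['modules/vpc/outputs.tf', { outputs: [{ name: 'vpc_id' }] }]
])
```

Keeping the aggregation separate from the per-file parsing means each file can be parsed on its own and released before the next one is read.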
What does this MR do and why?
- Creates 2 new services: Packages::TerraformModule::Metadata::ParseFileService & Packages::TerraformModule::Metadata::CreateService:
  - Packages::TerraformModule::Metadata::ParseFileService receives a single Terraform file, either a README file or a .tf file. The service reads the file line by line (more memory-efficient than reading the whole file at once), extracts and formats the needed metadata, and returns it to Packages::TerraformModule::Metadata::ExtractFilesService, which in turn aggregates the metadata extracted from all files into one hash and returns that hash to Packages::TerraformModule::ProcessPackageFileService.
  - Packages::TerraformModule::ProcessPackageFileService sends the metadata hash to Packages::TerraformModule::Metadata::CreateService to persist the Packages::TerraformModule::Metadatum record in the database.
- Adds the needed specs.
- The whole feature is gated behind the index_terraform_module_archive feature flag.
- The logic in Packages::TerraformModule::Metadata::ParseFileService is a bit complex because all .tf files are written in HCL (HashiCorp Configuration Language), and we don't have a ready-to-use solution in Ruby to parse it. So, I tried to make the parsing script as accurate as possible. I tested it against a wide range of public Terraform modules, and it worked fine.
- The MR is kinda chunky (sorry about that 🙏), but that's mostly because of the JSON files I had to add to support testing & validating the metadata extracted from the Terraform modules.
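To make the line-by-line idea concrete, here is a minimal sketch of a parser that pulls `variable` and `output` blocks out of a `.tf` file. It is not the ParseFileService from this MR: `parse_tf`, `BLOCK_START`, and `ATTRIBUTE` are hypothetical names, only simple single-line attributes are handled, and braces are counted naively, so braces inside strings, comments, or heredocs would confuse it.

```ruby
require 'stringio'

# Hypothetical patterns for this sketch only.
BLOCK_START = /\A\s*(variable|output)\s+"([^"]+)"\s*\{/
ATTRIBUTE   = /\A\s*(description|type|default)\s*=\s*(.+?)\s*\z/

# Reads any IO line by line, so the whole file is never held in memory.
def parse_tf(io)
  metadata = { inputs: [], outputs: [] }
  kind = nil
  entry = nil
  depth = 0

  io.each_line do |line|
    if entry.nil?
      start = BLOCK_START.match(line)
      next unless start

      kind = start[1] == 'variable' ? :inputs : :outputs
      entry = { name: start[2] }
    elsif depth == 1 && (attribute = ATTRIBUTE.match(line))
      # Only top-level attributes of the block are recorded.
      entry[attribute[1].to_sym] = attribute[2].delete_prefix('"').delete_suffix('"')
    end

    # Track nesting so the block ends at its own closing brace.
    depth += line.count('{') - line.count('}')
    next if depth.positive?

    metadata[kind] << entry
    entry = nil
    depth = 0
  end

  metadata
end

source = <<~HCL
  variable "region" {
    description = "AWS region"
    type        = string
    default     = "us-east-1"
  }

  output "vpc_id" {
    description = "ID of the VPC"
    value       = aws_vpc.main.id
  }
HCL

pp parse_tf(StringIO.new(source))
```

For a file on disk, the same method could be called as `File.open(path) { |file| parse_tf(file) }`, which streams the file the same way.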
MR acceptance checklist
Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Screenshots or screen recordings
N/A
How to set up and validate locally
- Enable the feature flag in rails console:
Feature.enable(:index_terraform_module_archive)
- On the HashiCorp Terraform registry, choose any module to test with, then go to its source code on GitHub and download it as a zip file.
- From your terminal, use cURL to publish the module you just downloaded: https://docs.gitlab.com/ee/user/packages/terraform_module_registry/#using-the-api
curl --location --header "PRIVATE-TOKEN: <your_PAT>" --upload-file <module_zip_file> "http://gdk.test:3000/api/v4/projects/<project_id>/packages/terraform/modules/my-module/my-system/0.0.1/file"
- After publishing the module successfully, check its newly created metadata in rails console:
Packages::TerraformModule::Metadatum.last.fields
- To verify that the metadata was extracted correctly, you can compare what you get from Packages::TerraformModule::Metadatum.last.fields with the metadata displayed for the module on its HashiCorp Terraform registry page:
- As you can see in the screenshot, this is a module page on the HashiCorp Terraform registry. Its metadata is displayed under the readme, outputs, inputs, dependency, and resources tabs. There is also similar metadata for each submodule & example in the dropdown menus above. Packages::TerraformModule::Metadatum.last.fields should return the same data, with the same structure, that you see under those tabs.
Related to #438058