Add project export relation service to Import/Export

What does this MR do and why?

This change adds a worker and a service that generate the project export files separately, instead of generating them in a single operation as Projects::ImportExport::ExportService does.

The plan is to generate each part of the project export tarball in a separate process and save the partial files temporarily in the object-store. Finally, the partial files are combined into a tarball with the same structure as the one generated by Projects::ImportExport::ExportService.

Note: The worker and service aren't used yet. Follow-up MRs will use them.

Related to: #360684 (closed)

Highlights of the changes

Projects::ImportExport::RelationExportService

This service is responsible for selecting the correct "saver" to generate the relation export files, compressing them, and uploading them to the object-store.

The relation export files are exported to the temporary export_path (e.g. /path/to/shared/tmp/project_exports/namespace/project/:randomA/:randomB), then all the content of the :randomB folder is compressed into a tar.gz at the archive_path. Finally, the compressed tar.gz is uploaded to the object-store.
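The compression step described above can be sketched with Ruby's standard library alone. This is a minimal illustration, not the service's actual implementation: compress_export and archive_entries are hypothetical helper names, and the real service may well shell out to tar or use GitLab's own helpers instead.

```ruby
require 'fileutils'
require 'rubygems/package'
require 'zlib'

# Hypothetical helper: packs every regular file under export_path into a
# tar.gz at archive_path, with entry names relative to export_path.
def compress_export(export_path, archive_path)
  FileUtils.mkdir_p(File.dirname(archive_path))

  Zlib::GzipWriter.open(archive_path) do |gzip|
    Gem::Package::TarWriter.new(gzip) do |tar|
      Dir.glob('**/*', base: export_path).sort.each do |relative|
        full = File.join(export_path, relative)
        next unless File.file?(full) # directories are implied by entry names

        mode = File.stat(full).mode & 0o777
        tar.add_file_simple(relative, mode, File.size(full)) do |io|
          io.write(File.binread(full))
        end
      end
    end
  end

  archive_path
end

# Hypothetical helper: lists the entry names stored in a tar.gz archive.
def archive_entries(archive_path)
  Zlib::GzipReader.open(archive_path) do |gzip|
    Gem::Package::TarReader.new(gzip).map(&:full_name)
  end
end
```

Because entry names are stored relative to the :randomB folder, the archive keeps the same layout as the folder itself, which is what allows partial archives to be recombined later.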

What is exported for each relation depends on its "saver". Some examples of the exported content for each relation are shown below. The files and folder structure are the same as those generated by the ProjectExportService because the same savers are used. The only new saver is Gitlab::ImportExport::Project::RelationSaver, which exports only one project tree relation node at a time.

For example:

lfs-objects

| randomB
 |-lfs-objects
 | |-2cb698f3a725e42800172f662a10de64e26fbc4425ad871609472add43c77ffc
 | |-a804a6aec96c1f2a7db1c5e22c658405343272da1069ae26ce34a3cc39d83130
 | |-89dc431c9e6e503c46b6b7285f2f5542b03e46f4dd1e11cd73cf10c812f7321c
 | |-d262f804319ceb22ec80430141b46dba3f57c4b9f87afa65b3fea377ede7e76e

labels

| randomB
 |-project
 | |-labels.ndjson

project

| randomB
 |-project.json

repository

| randomB
 |-project.bundle

Projects::ImportExport::RelationExportWorker

The worker uses the same urgency: low and worker_resource_boundary: memory settings as the ProjectExportWorker.

Gitlab::ImportExport::Project::RelationSaver

This new saver is similar to the Gitlab::ImportExport::Project::TreeSaver, but instead of generating the files for the whole project tree, it exports only one node of the tree at a time.
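To illustrate the idea of exporting a single node, here is a rough sketch in plain Ruby. It is not the RelationSaver code: export_relation_node and the records_by_relation hash are hypothetical stand-ins for the saver and the serialized project tree, and the output path simply mirrors the labels example above (project/labels.ndjson).

```ruby
require 'fileutils'
require 'json'

# Hypothetical stand-in for a single-node export: writes only the requested
# relation as newline-delimited JSON under <export_path>/project/ and leaves
# every other relation untouched.
def export_relation_node(records_by_relation, relation, export_path)
  records = records_by_relation.fetch(relation)
  target = File.join(export_path, 'project', "#{relation}.ndjson")

  FileUtils.mkdir_p(File.dirname(target))
  File.open(target, 'w') do |file|
    # One JSON document per line, one line per record.
    records.each { |record| file.puts(JSON.generate(record)) }
  end

  target
end
```

Calling it once per relation name yields one small file per call, which is what lets each relation be exported by its own worker.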

How to set up and validate locally

Because the worker and service aren't hooked up to anything yet, the Rails console needs to be used.

In a Rails console, run the commands below to start a relation export worker for each relation:

project = Project.first # pick a project to export

project_export_job = project.export_jobs.create(status: 0, jid: SecureRandom.hex(10))

Projects::ImportExport::RelationExport.relation_names_list.each do |relation|
  relation_export = project_export_job.relation_exports.create(relation: relation)
  Projects::ImportExport::RelationExportWorker.perform_async(relation_export.id)
end

The worker will create a record for each relation in the project_relation_exports table. The status of every relation is expected to be 2 (finished) once the worker has finished exporting it.

For each exported relation, a tar.gz will be created and placed in local storage. Use the command below to list all the locations:

project_export_job.relation_exports.with_status(:finished).map { |export| export.upload.export_file.url_or_file_path }

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Rémy Coutable
