Add project export relation service to Import/Export
What does this MR do and why?
This change adds a worker and a service that generate the project export files separately, instead of generating them in a single operation as Projects::ImportExport::ExportService does.
The plan is to generate each part of the project export tarball in a separate process and save the partial files temporarily in the object-store. Finally, the partial files are combined into a tarball with the same structure as the one generated by Projects::ImportExport::ExportService.
Note: The worker and service aren't used yet. Follow-up MRs will use them.
Related to: #360684 (closed)
Highlights of the changes
Projects::ImportExport::RelationExportService
This service is responsible for selecting the correct "saver" to generate the relation export files, compressing them, and uploading them to the object-store.
The relation export files are exported to the temporary export_path (e.g. /path/to/shared/tmp/project_exports/namespace/project/:randomA/:randomB), then all the content of the :randomB folder is compressed into a tar.gz at the archive_path. Finally, the compressed tar.gz is uploaded to the object-store.
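The compression step described above can be sketched with Ruby's standard library alone. This is only an illustration, not the code used by the service: `compress_export_dir` and `archive_entries` are hypothetical helper names, and the upload to the object-store is left out.

```ruby
require "rubygems/package"
require "zlib"
require "tmpdir"
require "fileutils"

# Hypothetical helper (not part of GitLab): packs every regular file under
# export_path into a gzip-compressed tarball written to archive_file.
def compress_export_dir(export_path, archive_file)
  Zlib::GzipWriter.open(archive_file) do |gzip|
    Gem::Package::TarWriter.new(gzip) do |tar|
      # Paths are relative to export_path, so the archive mirrors the folder layout.
      Dir.glob("**/*", base: export_path).sort.each do |relative|
        full = File.join(export_path, relative)
        next unless File.file?(full)

        mode = File.stat(full).mode & 0o777
        tar.add_file_simple(relative, mode, File.size(full)) do |io|
          io.write(File.binread(full))
        end
      end
    end
  end
  archive_file
end

# Hypothetical helper: lists the entry names stored in a tar.gz.
def archive_entries(archive_file)
  names = []
  Zlib::GzipReader.open(archive_file) do |gzip|
    Gem::Package::TarReader.new(gzip) do |tar|
      tar.each { |entry| names << entry.full_name }
    end
  end
  names
end

# Usage: build a small export folder, compress it, and inspect the result.
Dir.mktmpdir do |dir|
  export_path = File.join(dir, "randomB")
  FileUtils.mkdir_p(File.join(export_path, "project"))
  File.write(File.join(export_path, "project", "labels.ndjson"), %({"title":"bug"}\n))

  archive = compress_export_dir(export_path, File.join(dir, "labels.tar.gz"))
  puts archive_entries(archive).inspect
end
```

Storing paths relative to the export folder keeps each partial archive in the same layout as the final tarball, which should make combining the parts later a matter of extracting them into one directory.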
What is exported for each relation depends on the "saver" used. Some examples of the exported content for each relation are shown below. The files and folder structure are the same as those generated by the ProjectExportService, because the same savers are used. The only new saver is Gitlab::ImportExport::Project::RelationSaver, which exports only one project tree relation node at a time.
Examples:
lfs-objects
| randomB
|-lfs-objects
| |-2cb698f3a725e42800172f662a10de64e26fbc4425ad871609472add43c77ffc
| |-a804a6aec96c1f2a7db1c5e22c658405343272da1069ae26ce34a3cc39d83130
| |-89dc431c9e6e503c46b6b7285f2f5542b03e46f4dd1e11cd73cf10c812f7321c
| |-d262f804319ceb22ec80430141b46dba3f57c4b9f87afa65b3fea377ede7e76e
labels
| randomB
|-project
| |-labels.ndjson
project
| randomB
|-project.json
repository
| randomB
|-project.bundle
Projects::ImportExport::RelationExportWorker
The worker uses the same urgency: low and worker_resource_boundary: memory settings as the ProjectExportWorker.
Gitlab::ImportExport::Project::RelationSaver
This new saver is similar to the Gitlab::ImportExport::Project::TreeSaver, but instead of generating the files for the whole project tree, it exports only one node of the tree at a time.
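As a rough illustration of exporting a single node, the sketch below writes one relation to an NDJSON file (one JSON document per line) under a project folder, matching the labels example above. It is not the RelationSaver implementation: `save_relation_ndjson` is a hypothetical name, and the records are plain hashes rather than serialized models.

```ruby
require "json"
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for a single-relation saver (not GitLab code):
# writes the given records to <export_path>/project/<relation_name>.ndjson.
def save_relation_ndjson(export_path, relation_name, records)
  dir = File.join(export_path, "project")
  FileUtils.mkdir_p(dir)

  file = File.join(dir, "#{relation_name}.ndjson")
  File.open(file, "w") do |f|
    # One JSON document per line, so the file can be read back line by line.
    records.each { |record| f.puts(JSON.generate(record)) }
  end
  file
end

# Usage: export only the "labels" node; other relations are left untouched.
Dir.mktmpdir do |export_path|
  labels = [{ "title" => "bug", "color" => "#FF0000" }, { "title" => "feature", "color" => "#00FF00" }]
  file = save_relation_ndjson(export_path, "labels", labels)
  puts File.read(file)
end
```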
How to set up and validate locally
Because the worker and service aren't hooked up to anything yet, they have to be run from a Rails console.
In a Rails console, run the commands below to start a relation export worker for each relation:
project = Project.first # pick a project to export
project_export_job = project.export_jobs.create(status: 0, jid: SecureRandom.hex(10))
Projects::ImportExport::RelationExport.relation_names_list.each do |relation|
  relation_export = project_export_job.relation_exports.create(relation: relation)
  Projects::ImportExport::RelationExportWorker.perform_async(relation_export.id)
end
A record is created for each relation in the project_relation_exports table. Once the worker finishes exporting a relation, the status of that relation is expected to be 2 (finished).
For each exported relation, a tar.gz is created and placed in local storage. Use the command below to list all the locations:
project_export_job.relation_exports.with_status(:finished).map { |export| export.upload.export_file.url_or_file_path }
MR acceptance checklist
This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.
- I have evaluated the MR acceptance checklist for this MR.