
Add Group relations export API

George Koltsov requested to merge georgekoltsov/bulk_import_group_exports into master

What does this MR do?

Background information on the need for this change: &5769

This MR adds a Group relations export API. It is similar to Group Export (https://docs.gitlab.com/ee/user/group/settings/import_export.html), with a few differences:

  1. Each top-level relation is exported in a separate Sidekiq worker, then compressed and uploaded to Object Storage separately. This distributes the export across multiple workers, so each worker is occupied for less time. It also significantly reduces the total size of the exported files.
  2. Each top-level relation is exported to an .ndjson file, compressed, and stored with CarrierWave.
  3. Each relation has a status API so that its progress can be viewed.

This functionality is added to enable Bulk Import (https://docs.gitlab.com/ee/user/group/import/) group migration: importing group/subgroup structures with one click, instead of requiring the user to migrate groups one by one by dealing with archive files.
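The per-relation flow in the list above can be sketched in plain Ruby. This is a minimal illustration using only the standard library, not the code in this MR: `export_relation`, `read_relation`, and the sample label records are hypothetical, and the Sidekiq worker and CarrierWave upload steps are left out.

```ruby
require 'json'
require 'zlib'

# Hypothetical helper: serialize one top-level relation as NDJSON
# (one JSON document per line) and gzip the result in memory.
def export_relation(records)
  ndjson = records.map { |record| JSON.generate(record) + "\n" }.join
  Zlib.gzip(ndjson)
end

# Reverse step, as an importer might run it after downloading the file.
def read_relation(gzipped)
  Zlib.gunzip(gzipped)
      .force_encoding(Encoding::UTF_8)
      .each_line
      .map { |line| JSON.parse(line) }
end

# Sample data standing in for a relation such as a group's labels.
labels = [
  { 'title' => 'bug', 'color' => '#ff0000' },
  { 'title' => 'feature', 'color' => '#00ff00' }
]

archive = export_relation(labels)
restored = read_relation(archive)
```

Because every relation becomes its own small archive, a failed or slow relation can be retried or tracked independently of the others.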

It requires a few new models to support the process:

  1. BulkImports::Export to track the export status of each individual top-level relation
  2. BulkImports::ExportUpload to store the exported gzip in ObjectStorage and allow it to be downloaded
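To illustrate how the two models relate, here is a rough plain-Ruby sketch. It is not the ActiveRecord code from this MR: the `ExportSketch` and `ExportUploadSketch` structs, the `ExportRegistry` class, and the status names are hypothetical stand-ins. Only the column names (`group_id`, `relation`, `jid`, `export_id`, `export_file`) are taken from the migration output in this description.

```ruby
# Hypothetical in-memory stand-ins for the two new models.
ExportSketch = Struct.new(:id, :group_id, :relation, :jid, :status, keyword_init: true)
ExportUploadSketch = Struct.new(:export_id, :export_file, keyword_init: true)

class ExportRegistry
  def initialize
    @exports = {}   # keyed by [group_id, relation]
    @uploads = {}   # keyed by export id
    @next_id = 0
  end

  # Keeps a single export record per (group_id, relation) pair,
  # loosely mirroring the unique partial index on those columns.
  def start(group_id:, relation:, jid:)
    key = [group_id, relation]
    export = @exports[key] ||= ExportSketch.new(
      id: (@next_id += 1), group_id: group_id, relation: relation
    )
    export.jid = jid
    export.status = :started   # assumed status name
    export
  end

  # Records the uploaded file and marks the export as done.
  def finish(export, export_file:)
    @uploads[export.id] = ExportUploadSketch.new(export_id: export.id, export_file: export_file)
    export.status = :finished  # assumed status name
  end

  def mark_failed(export)
    export.status = :failed    # assumed status name
  end

  def upload_for(export)
    @uploads[export.id]
  end
end

registry = ExportRegistry.new
labels = registry.start(group_id: 1, relation: 'labels', jid: 'abc123')
registry.finish(labels, export_file: 'labels.ndjson.gz')

# Re-exporting the same relation reuses the existing record.
again = registry.start(group_id: 1, relation: 'labels', jid: 'def456')
```

A status endpoint could then report `status` per relation, while the upload record is what a download request would serve.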

The intention is to add the same for projects in a follow-up MR.

Sequence diagram:

[image: sequence diagram]

Mentions #328216 (closed)

Screenshots (strongly suggested)

screencast_2021-04-19_15-30-32

Migrations output

  • db/migrate/20210414100914_add_bulk_import_exports_table.rb
Up
== 20210414100914 AddBulkImportExportsTable: migrating ========================
-- create_table(:bulk_import_exports)
   -> 0.0060s
-- transaction_open?()
   -> 0.0000s
-- current_schema()
   -> 0.0002s
-- execute("ALTER TABLE bulk_import_exports\nADD CONSTRAINT check_24cb010672\nCHECK ( char_length(relation) <= 255 )\nNOT VALID;\n")
   -> 0.0009s
-- current_schema()
   -> 0.0002s
-- execute("SET statement_timeout TO 0")
   -> 0.0005s
-- execute("ALTER TABLE bulk_import_exports VALIDATE CONSTRAINT check_24cb010672;")
   -> 0.0009s
-- execute("RESET ALL")
   -> 0.0006s
-- transaction_open?()
   -> 0.0000s
-- current_schema()
   -> 0.0002s
-- execute("ALTER TABLE bulk_import_exports\nADD CONSTRAINT check_9ee6d14d33\nCHECK ( char_length(jid) <= 255 )\nNOT VALID;\n")
   -> 0.0008s
-- current_schema()
   -> 0.0002s
-- execute("ALTER TABLE bulk_import_exports VALIDATE CONSTRAINT check_9ee6d14d33;")
   -> 0.0009s
== 20210414100914 AddBulkImportExportsTable: migrated (0.0269s) ===============
Down
== 20210414100914 AddBulkImportExportsTable: reverting ========================
-- drop_table(:bulk_import_exports)
   -> 0.0044s
== 20210414100914 AddBulkImportExportsTable: reverted (0.0045s) ===============
  • db/migrate/20210414130017_add_foreign_key_to_bulk_import_exports_on_project.rb
Up
== 20210414130017 AddForeignKeyToBulkImportExportsOnProject: migrating ========
-- transaction_open?()
   -> 0.0000s
-- foreign_keys(:bulk_import_exports)
   -> 0.0037s
-- execute("ALTER TABLE bulk_import_exports\nADD CONSTRAINT fk_39c726d3b5\nFOREIGN KEY (project_id)\nREFERENCES projects (id)\nON DELETE CASCADE\nNOT VALID;\n")
   -> 0.0045s
-- execute("SET statement_timeout TO 0")
   -> 0.0005s
-- execute("ALTER TABLE bulk_import_exports VALIDATE CONSTRAINT fk_39c726d3b5;")
   -> 0.0045s
-- execute("RESET ALL")
   -> 0.0012s
== 20210414130017 AddForeignKeyToBulkImportExportsOnProject: migrated (0.0238s) 
Down
== 20210414130017 AddForeignKeyToBulkImportExportsOnProject: reverting ========
-- remove_foreign_key(:bulk_import_exports, {:column=>:project_id})
   -> 0.0043s
== 20210414130017 AddForeignKeyToBulkImportExportsOnProject: reverted (0.0086s)
  • db/migrate/20210414130526_add_foreign_key_to_bulk_import_exports_on_group.rb
Up
== 20210414130526 AddForeignKeyToBulkImportExportsOnGroup: migrating ==========
-- transaction_open?()
   -> 0.0000s
-- foreign_keys(:bulk_import_exports)
   -> 0.0033s
-- execute("ALTER TABLE bulk_import_exports\nADD CONSTRAINT fk_8c6f33cebe\nFOREIGN KEY (group_id)\nREFERENCES namespaces (id)\nON DELETE CASCADE\nNOT VALID;\n")
   -> 0.0025s
-- execute("SET statement_timeout TO 0")
   -> 0.0006s
-- execute("ALTER TABLE bulk_import_exports VALIDATE CONSTRAINT fk_8c6f33cebe;")
   -> 0.0094s
-- execute("RESET ALL")
   -> 0.0008s
== 20210414130526 AddForeignKeyToBulkImportExportsOnGroup: migrated (0.0231s) =
Down
== 20210414130526 AddForeignKeyToBulkImportExportsOnGroup: reverting ==========
-- remove_foreign_key(:bulk_import_exports, {:column=>:group_id})
   -> 0.0034s
== 20210414130526 AddForeignKeyToBulkImportExportsOnGroup: reverted (0.0076s) =
  • db/migrate/20210414131807_add_bulk_import_exports_table_indexes.rb
Up
== 20210414131807 AddBulkImportExportsTableIndexes: migrating =================
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:bulk_import_exports, [:group_id, :relation], {:unique=>true, :where=>"group_id IS NOT NULL", :name=>"partial_index_bulk_import_exports_on_group_id_and_relation", :algorithm=>:concurrently})
   -> 0.0017s
-- execute("SET statement_timeout TO 0")
   -> 0.0005s
-- add_index(:bulk_import_exports, [:group_id, :relation], {:unique=>true, :where=>"group_id IS NOT NULL", :name=>"partial_index_bulk_import_exports_on_group_id_and_relation", :algorithm=>:concurrently})
   -> 0.0073s
-- execute("RESET ALL")
   -> 0.0006s
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:bulk_import_exports, [:project_id, :relation], {:unique=>true, :where=>"project_id IS NOT NULL", :name=>"partial_index_bulk_import_exports_on_project_id_and_relation", :algorithm=>:concurrently})
   -> 0.0013s
-- add_index(:bulk_import_exports, [:project_id, :relation], {:unique=>true, :where=>"project_id IS NOT NULL", :name=>"partial_index_bulk_import_exports_on_project_id_and_relation", :algorithm=>:concurrently})
   -> 0.0031s
== 20210414131807 AddBulkImportExportsTableIndexes: migrated (0.0164s) ========
Down
== 20210414131807 AddBulkImportExportsTableIndexes: reverting =================
-- transaction_open?()
   -> 0.0000s
-- indexes(:bulk_import_exports)
   -> 0.0030s
-- execute("SET statement_timeout TO 0")
   -> 0.0008s
-- remove_index(:bulk_import_exports, {:algorithm=>:concurrently, :name=>"partial_index_bulk_import_exports_on_group_id_and_relation"})
   -> 0.0038s
-- execute("RESET ALL")
   -> 0.0010s
-- transaction_open?()
   -> 0.0000s
-- indexes(:bulk_import_exports)
   -> 0.0013s
-- remove_index(:bulk_import_exports, {:algorithm=>:concurrently, :name=>"partial_index_bulk_import_exports_on_project_id_and_relation"})
   -> 0.0028s
== 20210414131807 AddBulkImportExportsTableIndexes: reverted (0.0153s) ========
  • db/migrate/20210414133310_add_bulk_import_export_uploads_table.rb
Up
== 20210414133310 AddBulkImportExportUploadsTable: migrating ==================
-- create_table(:bulk_import_export_uploads)
   -> 0.0072s
-- transaction_open?()
   -> 0.0000s
-- current_schema()
   -> 0.0003s
-- execute("ALTER TABLE bulk_import_export_uploads\nADD CONSTRAINT check_5add76239d\nCHECK ( char_length(export_file) <= 255 )\nNOT VALID;\n")
   -> 0.0009s
-- current_schema()
   -> 0.0002s
-- execute("SET statement_timeout TO 0")
   -> 0.0005s
-- execute("ALTER TABLE bulk_import_export_uploads VALIDATE CONSTRAINT check_5add76239d;")
   -> 0.0008s
-- execute("RESET ALL")
   -> 0.0005s
== 20210414133310 AddBulkImportExportUploadsTable: migrated (0.0187s) =========
Down
== 20210414133310 AddBulkImportExportUploadsTable: reverting ==================
-- drop_table(:bulk_import_export_uploads)
   -> 0.0048s
== 20210414133310 AddBulkImportExportUploadsTable: reverted (0.0048s) =========
  • db/migrate/20210419085714_add_foreign_key_to_bulk_import_export_uploads_on_export.rb
Up
== 20210419085714 AddForeignKeyToBulkImportExportUploadsOnExport: migrating ===
-- transaction_open?()
   -> 0.0000s
-- foreign_keys(:bulk_import_export_uploads)
   -> 0.0030s
-- execute("ALTER TABLE bulk_import_export_uploads\nADD CONSTRAINT fk_dfbfb45eca\nFOREIGN KEY (export_id)\nREFERENCES bulk_import_exports (id)\nON DELETE CASCADE\nNOT VALID;\n")
   -> 0.0016s
-- execute("SET statement_timeout TO 0")
   -> 0.0005s
-- execute("ALTER TABLE bulk_import_export_uploads VALIDATE CONSTRAINT fk_dfbfb45eca;")
   -> 0.0022s
-- execute("RESET ALL")
   -> 0.0007s
-- transaction_open?()
   -> 0.0000s
-- index_exists?(:bulk_import_export_uploads, :export_id, {:name=>"index_bulk_import_export_uploads_on_export_id", :algorithm=>:concurrently})
   -> 0.0014s
-- add_index(:bulk_import_export_uploads, :export_id, {:name=>"index_bulk_import_export_uploads_on_export_id", :algorithm=>:concurrently})
   -> 0.0041s
== 20210419085714 AddForeignKeyToBulkImportExportUploadsOnExport: migrated (0.0216s)
Down
== 20210419085714 AddForeignKeyToBulkImportExportUploadsOnExport: reverting ===
-- remove_foreign_key(:bulk_import_export_uploads, {:column=>:export_id})
   -> 0.0041s
-- transaction_open?()
   -> 0.0000s
-- indexes(:bulk_import_export_uploads)
   -> 0.0019s
-- execute("SET statement_timeout TO 0")
   -> 0.0004s
-- remove_index(:bulk_import_export_uploads, {:algorithm=>:concurrently, :name=>"index_bulk_import_export_uploads_on_export_id"})
   -> 0.0017s
-- execute("RESET ALL")
   -> 0.0005s
== 20210419085714 AddForeignKeyToBulkImportExportUploadsOnExport: reverted (0.0149s) 

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

If this MR contains changes to processing or storing of credentials or tokens, authorization and authentication methods and other items described in the security review guidelines:

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by George Koltsov
