POC: GitLab Migration import/export relations in batches

What does this MR do and why?

This MR is a proof of concept that demonstrates direct transfer of GitLab groups/projects (aka GitLab Migration/Bulk Imports) in batches. It covers group label import only.

Key changes:

On export side:

  1. New model ExportBatch to track state and store an exported relation batch (1000 records per batch)
  2. Export now has many batches (BulkImports::ExportBatch), each of which contains 1k exported records of a particular relation
  3. BatchedRelationExportService -- overarching export service that initiates the export and enqueues batch export jobs
  4. RelationBatchExportWorker -- a new worker that executes batch export service
  5. RelationBatchExportService -- a service which performs batch export
  6. FinishBatchedRelationExportWorker -- a worker that keeps track of all batches and updates overall export status to finished
  7. Updated Gitlab::ImportExport::Json::StreamingSerializer to add serialization of a batch of records
  8. Updates to the relation exports API to include info on batch readiness
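
To illustrate the export-side idea, here is a minimal plain-Ruby sketch of splitting a relation into batches of 1000 and treating the export as finished only once every batch is done. It is not the code in this MR: `ExportBatchSketch`, `build_batches`, `export_finished?` and the `:started`/`:finished` status symbols are hypothetical names chosen for illustration, and the real implementation works with ActiveRecord models and Sidekiq workers rather than in-memory arrays.

```ruby
# Hypothetical stand-in for the batching idea; not the MR's actual code.
BATCH_SIZE = 1000 # batch size stated in the description above

# Hypothetical value object standing in for BulkImports::ExportBatch.
ExportBatchSketch = Struct.new(:batch_number, :records, :status)

# Splits one relation's records into numbered batches of at most batch_size.
def build_batches(records, batch_size: BATCH_SIZE)
  records.each_slice(batch_size).with_index(1).map do |slice, number|
    ExportBatchSketch.new(number, slice, :started)
  end
end

# The overall export is considered finished only when every batch is finished.
# Note: an empty list of batches also counts as finished here.
def export_finished?(batches)
  batches.all? { |batch| batch.status == :finished }
end
```

With 5000 labels, as in the demo below, `build_batches((1..5000).to_a).size` is 5, and `export_finished?` stays false until each of those batches has been marked `:finished`.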

On import side:

  1. New model BatchTracker to track state of relation batch import
  2. Tracker now has many batches as well (BulkImports::BatchTracker)
  3. New PipelineBatchWorker to process a batch of records by doing the same thing as PipelineWorker used to do
  4. FinishBatchedPipelineWorker to update state of a pipeline tracker once all batches are complete
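
The import-side bookkeeping can be sketched in the same way. The snippet below is a rough illustration only: `BatchTrackerSketch`, `COMPLETE_STATUSES` and `pipeline_tracker_status` are hypothetical names, and treating `:failed` as a "complete" state that fails the whole pipeline tracker is an assumption made for the example, not something this MR specifies.

```ruby
# Hypothetical stand-in for BulkImports::BatchTracker; not the MR's actual code.
BatchTrackerSketch = Struct.new(:batch_number, :status)

# Assumption: a batch is "complete" once it has either finished or failed.
COMPLETE_STATUSES = %i[finished failed].freeze

# Derives the pipeline tracker status from its batch trackers.
# Stays :started until every batch is complete, then becomes :failed if any
# batch failed (assumed policy) and :finished otherwise.
def pipeline_tracker_status(batch_trackers)
  all_complete = batch_trackers.all? { |b| COMPLETE_STATUSES.include?(b.status) }
  return :started unless all_complete

  batch_trackers.any? { |b| b.status == :failed } ? :failed : :finished
end
```

A worker in the role of FinishBatchedPipelineWorker would evaluate a check like this periodically and update the pipeline tracker only once the result is no longer `:started`.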

A lot of database changes were needed to support the changes above.

The current state of the MR is enough to showcase the solution and to decide whether or not to continue with this approach.

Mentions #382121 (closed)

Screenshots or screen recordings

This demo is not perfect, as the import does not fully finish and there are validation errors, but it shows the idea.

5k-labels-batch-demo.mov

Next Steps

If the approach looks good, I think the order of implementation should be:

  1. Database migrations
  2. Batched export, so that source instances gain the ability to export in batches
  3. Batched import, behind a feature flag

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by George Koltsov
