Make BitBucket importer parallel

Madelein van Niekerk requested to merge 412614-parallel-bitbucket-cloud into master

What does this MR do and why?

Makes the BitBucket Cloud importer run in parallel instead of sequentially, in order to fix migration timeouts. It follows the same framework as the GitHub importer and the BitBucket Server importer.

To keep changes smallish, the work is split into:

  • Import repository and wiki and mark the import as complete (feature flagged) 👈 this MR
  • MR(s) to import the remaining objects: pull requests, issues, etc.
  • Refactoring MR(s) for reusing shared code between BitBucket Cloud and BitBucket Server

The feature is behind the bitbucket_parallel_importer feature flag, so the follow-up MRs can keep being merged and released without affecting the existing importer. The feature flag is controlled in lib/gitlab/import_sources.rb.
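For reviewers unfamiliar with this pattern, here is a minimal, self-contained sketch of choosing an importer behind a flag. It is not the code in lib/gitlab/import_sources.rb: FakeFeature, SequentialImporter, ParallelImporter and bitbucket_importer_class are hypothetical stand-ins, and only the flag name comes from this MR.

```ruby
# Hypothetical stand-in for a feature flag store (NOT GitLab's Feature class).
module FakeFeature
  @enabled = {}

  def self.enable(name)
    @enabled[name.to_sym] = true
  end

  def self.disable(name)
    @enabled[name.to_sym] = false
  end

  def self.enabled?(name)
    # Unknown flags default to disabled.
    @enabled.fetch(name.to_sym, false)
  end
end

# Placeholder importer classes, for illustration only.
class SequentialImporter; end
class ParallelImporter; end

# Resolve the importer at call time, so toggling the flag takes effect
# on the next import without a code change.
def bitbucket_importer_class
  if FakeFeature.enabled?(:bitbucket_parallel_importer)
    ParallelImporter
  else
    SequentialImporter
  end
end
```

With the flag disabled (the default in this sketch), the existing sequential path is selected, so unfinished parallel code can be merged without being exercised.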

Important files to review

  • import_repository_worker.rb
  • repository_importer.rb
  • finish_import_worker.rb

The following files are mostly/entirely copies from BitBucket Server (and will be refactored in follow-ups):

  • stage_methods.rb
  • advance_stage_worker.rb
  • loggable.rb
  • parallel_importer.rb
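As a rough mental model of the staged flow in this MR (import repository and wiki, then mark the import as complete), here is a simplified in-process sketch. It assumes each stage hands off to the next one in a fixed order. NEXT_STAGE and run_import are hypothetical names, and the real workers run as background jobs rather than plain method calls.

```ruby
# Hypothetical stage table: each stage maps to the one that follows it.
# A stage with no entry ends the pipeline.
NEXT_STAGE = { repository: :finish }.freeze

# Runs the handler for each stage in order and returns the stages executed.
# `handlers` maps a stage name to a callable that performs that stage's work.
def run_import(handlers, first_stage: :repository)
  executed = []
  stage = first_stage

  while stage
    handlers.fetch(stage).call  # raises KeyError if a stage has no handler
    executed << stage
    stage = NEXT_STAGE[stage]   # nil once the last stage has run
  end

  executed
end

# Usage: track progress in a plain hash instead of a real project record.
state = { repository: false, wiki: false, status: :started }
handlers = {
  repository: -> { state[:repository] = true; state[:wiki] = true },
  finish: -> { state[:status] = :finished }
}
run_import(handlers)
```

In this model, follow-up MRs that import pull requests or issues would add entries to the stage table between repository and finish.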

How to set up and validate locally

  1. Follow https://docs.gitlab.com/ee/integration/bitbucket.html to set up OAuth for BitBucket Cloud. You will need an account on https://bitbucket.org/.
  2. Create a project, repo and a wiki on BitBucket.
  3. Enable the feature flag: Feature.enable(:bitbucket_parallel_importer)
  4. On your GDK server, create a new project > click on Import project > Bitbucket Cloud > follow the instructions to connect to https://bitbucket.org/.
  5. Import the project created in step 2.
  6. Verify that the project was created and that the repo and wiki are imported correctly.
  7. You can also view the importer logs to see when each step was executed or if there are errors: tail -f log/importer.log

Demo

[Screenshot: Screenshot_2023-09-04_at_10.06.03]

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Related to #412614 (closed)

Edited by Madelein van Niekerk