Skip to content

Separate worker notes on bitbucket server

  • Please check this box if this contribution uses AI-generated content as outlined in the GitLab DCO & CLA

What does this MR do and why?

Separate worker for pull request notes on Bitbucket Server.

Previous (existing implementation)

Screenshot_2024-04-11_at_21.39.12

Currently, Importers::NotesImporter enqueues per MR, and then ImportPullRequestNotesWorker (Importers::PullRequestNotesImporter) makes HTTP calls (using a loop). This means there might be concurrent HTTP calls to Bitbucket Server from multiple MR.

New changes (this MR)

Screenshot_2024-04-11_at_21.34.47

Move HTTP Calls

This MR changes the behavior. So, Importers::NotesImporter still loops the MR and makes HTTP calls to retrieve activities. The enqueue to ImportPullRequestNotesWorker (Importers::PullRequestNotesImporter) is on the activity level.

This means we can easily control HTTP calls in Importers::NotesImporter.

Separate Importers Logic

This MR also separates Importers::PullRequestNotesImporter logic into several classes (still using the same worker ImportPullRequestNotesWorker):

  • Importers::PullRequestNotes::ApprovedEvent
  • Importers::PullRequestNotes::Inline
  • Importers::PullRequestNotes::MergeEvent
  • Importers::PullRequestNotes::StandalonePr

Add Paging on HTTP Calls

This MR adds Gitlab::Import::PageCounter so that HTTP calls on client.activities can resume from the last page upon interruption.

Notes

  1. This MR intention is more towards reducing concurrent HTTP calls to the Bitbucket Server. In terms of speed:
    • If the Bitbucket Server can handle "unlimited" requests, this MR will slow down the importing process. Reducing concurrent HTTP calls means less processing on the GitLab side, although we separate the processing into per activity level (network latency on remote server > GitLab DB latency).
    • If the Bitbucket Server has limited capacity, this MR should help lessen the burden on the Bitbucket Server.
  2. ObjectImporter in bitbucket_server_import has not handled conversion to the "representation" object yet. So, the 4 new classes created above were copy-pasting from the previous implementation Importers::PullRequestNotesImporter but changed the "object style" into "hash style".
  3. Haven't handled the Bitbucket (non Server) yet.
  4. With these changes, there is a risk that a note might reach the sidekiq byte size limit. Similar with this issue.

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Screenshots or screen recordings

Screenshots are required for UI changes, and strongly recommended for all other merge requests.

Before After
MR list:
Screenshot_2024-04-17_at_00.30.19
Approve event:
Screenshot_2024-04-13_at_23.31.51
Standalone PR:
Screenshot_2024-04-13_at_23.32.27
Inline:
Screenshot_2024-04-13_at_23.32.36
Calls the new class:
Screenshot_2024-04-14_at_00.08.09
Resume from the last page upon interruption:
Screenshot_2024-04-14_at_21.20.46

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

  1. Initial environment setup can follow this
  2. Prepare some data in Bitbucket Server
  3. Patch the code to inject interruption:
diff --git a/gems/gitlab-http/lib/gitlab/http_v2/url_blocker.rb b/gems/gitlab-http/lib/gitlab/http_v2/url_blocker.rb
index 99876c779539..06bd0c6b92cd 100644
--- a/gems/gitlab-http/lib/gitlab/http_v2/url_blocker.rb
+++ b/gems/gitlab-http/lib/gitlab/http_v2/url_blocker.rb
@@ -64,6 +64,9 @@ def validate_url_with_proxy!(
         )
           # rubocop:enable Metrics/ParameterLists
 
+          allow_localhost = true
+          allow_local_network = true
+
           return Result.new(nil, nil, true) if url.nil?
 
           raise ArgumentError, 'The schemes is a required argument' if schemes.blank?
diff --git a/lib/gitlab/bitbucket_server_import/importers/notes_importer.rb b/lib/gitlab/bitbucket_server_import/importers/notes_importer.rb
index e7c1c91ae970..744a976975ed 100644
--- a/lib/gitlab/bitbucket_server_import/importers/notes_importer.rb
+++ b/lib/gitlab/bitbucket_server_import/importers/notes_importer.rb
@@ -22,6 +22,15 @@ def execute
                          "page #{page} using batch-size #{concurrent_import_jobs_limit}"
               )
 
+              if merge_request.iid == 2
+                Gitlab::Redis::SharedState.with do |redis|
+                  temp_key = 'test-bitbucket-pr'
+                  temp_counter = redis.incr(temp_key)
+                  redis.expire(temp_key, 5.minutes)
+                  raise "purposely interrupt" if temp_counter == 4
+                end
+              end
+
               activities = client.activities(
                 project_key, repository_slug, merge_request.iid,
                 page_offset: page, limit: concurrent_import_jobs_limit
@@ -131,6 +140,10 @@ def parent_collection
         def page_counter_id(merge_request)
           "merge_request/#{merge_request.id}/#{collection_method}"
         end
+
+        def concurrent_import_jobs_limit
+          2
+        end
       end
     end
   end
  1. Tail the log file: tail -f log/importer.log
  2. Go to http://127.0.0.1:3000/import/bitbucket_server/status
  3. Click Import
Edited by Ivan Sebastian

Merge request reports