
Update lease key used by GitHubImport::UserFinder

Rodrigo Tomonari requested to merge rodrigo/update-user-finder-lease-key into master

What does this MR do and why?

Updates UserFinder so that multiple jobs can request user details from the GitHub API simultaneously, as long as the requests are for distinct users.

This is a new approach that differs from the previous method used by !140051 (merged), where the UserFinder was limited to a single call throughout the entire migration process.

Also, the lease lock is now held for a maximum of 6 seconds, which is enough time for the GitHub API to return the user details.
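
As a rough illustration of per-user locking, the sketch below uses a hypothetical in-memory lease store. It is not GitLab's exclusive lease implementation, and the `InMemoryLease` class, the `lease_key` helper, and the key format are all assumptions made for this example. Only the 6-second maximum and the "distinct users do not block each other" behaviour come from the description above.

```ruby
# Hypothetical in-memory stand-in for an exclusive lease (illustration only).
class InMemoryLease
  def initialize
    @expiries = {}
    @mutex = Mutex.new
  end

  # Returns true if the lease for `key` was obtained,
  # false if it is still held by someone else.
  def try_obtain(key, timeout:, now: Time.now)
    @mutex.synchronize do
      expiry = @expiries[key]
      if expiry && expiry > now
        false
      else
        @expiries[key] = now + timeout
        true
      end
    end
  end
end

LEASE_TIMEOUT = 6 # seconds, the maximum hold time mentioned in the description

# Hypothetical key format: one lease per project and username,
# so lookups for different users never contend for the same lease.
def lease_key(project_id, username)
  "github-import/user-finder/#{project_id}/#{username}"
end

lease = InMemoryLease.new
t0 = Time.at(0)

# Two jobs asking for different users both proceed.
puts lease.try_obtain(lease_key(1, 'alice'), timeout: LEASE_TIMEOUT, now: t0)     # true
puts lease.try_obtain(lease_key(1, 'bob'), timeout: LEASE_TIMEOUT, now: t0)       # true
# A second job asking for the same user within 6 seconds has to wait.
puts lease.try_obtain(lease_key(1, 'alice'), timeout: LEASE_TIMEOUT, now: t0 + 1) # false
# Once the 6 seconds have passed, the lease can be taken again.
puts lease.try_obtain(lease_key(1, 'alice'), timeout: LEASE_TIMEOUT, now: t0 + 7) # true
```

A job that fails to obtain the lease would presumably retry later and find the user details already cached, which is why each username should be fetched only once.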

Related to: #435340 (closed)

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

  1. Enable the feature flag github_import_lock_user_finder (for example, with Feature.enable(:github_import_lock_user_finder) in a Rails console)
  2. Clear log/importer.log
echo -n > log/importer.log
  3. Delete the UserFinder cache
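# Run in a Rails console: removes every cached UserFinder entry from Redis.
Gitlab::Redis::SharedState.with do |redis|
  key_prefix = 'cache:gitlab:github-import/user-finder/'

  cursor = '0'
  loop do
    # SCAN returns the next cursor plus a batch of matching keys.
    cursor, keys = redis.scan(cursor, match: "#{key_prefix}*")

    keys.each do |key|
      redis.del(key)
      puts "Deleted key: #{key}"
    end

    # A cursor of '0' means the scan has covered the whole keyspace.
    break if cursor == '0'
  end
end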
  4. Restart Sidekiq
  5. Trigger a GitHub Import migration
curl --location 'http://gdk.test:3000/api/v4/import/github' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer ACCESS_TOKEN' \
--data '{
    "personal_access_token": "GITHUB_ACCESS_TOKEN",
    "repo_id": "8514", 
    "target_namespace": "root",
    "new_name": "Rails",
    "optional_stages": {
      "single_endpoint_issue_events_import": true,
      "single_endpoint_notes_import": true,
      "attachments_import": false,
      "collaborators_import": false
    },
    "timeout_strategy": "optimistic"
}'
  6. Wait for some records to be migrated
  7. Use the command below to count how many times each user's email was fetched. Each username should appear only once.
grep '"Fetching email from GitHub"' log/importer.log | jq .username | sort | uniq -c | sort -g
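
The same check can be sketched in Ruby for anyone without jq installed. This is a minimal sketch that assumes, as the grep/jq pipeline above implies, that each matching log line is a JSON object with a `username` field. The helper names `fetch_counts` and `duplicated_fetches` are made up for this example.

```ruby
require 'json'

# Counts how many times each username appears in
# "Fetching email from GitHub" log lines.
def fetch_counts(lines)
  counts = Hash.new(0)
  lines.each do |line|
    # Same substring filter as the grep command.
    next unless line.include?('"Fetching email from GitHub"')

    begin
      entry = JSON.parse(line)
    rescue JSON::ParserError
      next # skip lines that are not valid JSON
    end

    username = entry['username'] if entry.is_a?(Hash)
    counts[username] += 1 if username
  end
  counts
end

# Keeps only the usernames that were fetched more than once.
def duplicated_fetches(counts)
  counts.select { |_username, count| count > 1 }
end

log_path = 'log/importer.log'
if File.exist?(log_path)
  dupes = duplicated_fetches(fetch_counts(File.foreach(log_path)))
  if dupes.empty?
    puts 'No username was fetched more than once'
  else
    puts "Fetched more than once: #{dupes.inspect}"
  end
end
```

An empty result from `duplicated_fetches` corresponds to every count in the `uniq -c` output being 1.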

Note: you can use the command below to cancel the import if you don't want it to finish.

curl --request POST \
  --url "http://gdk.test:3000/api/v4/import/github/cancel" \
  --header "content-type: application/json" \
  --header "Authorization: Bearer ACCESS_TOKEN" \
  --data '{
    "project_id": 12345
}'
Edited by Rodrigo Tomonari
