Handle 429s during GitHub LFS import

What does this MR do and why?

This MR adds rate limit handling to the GitHub Importer's LFS stage. In the event that the importer hits GitHub's secondary rate limit and every endpoint returns 429, we should re-enqueue the job and retry later instead of hard-failing the import, similar to what other import stages (issues, MRs, etc.) already do. Now, when a 429 is encountered during LFS import, we reschedule the job and try again in 2 minutes.
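
A minimal sketch of the behaviour described above, not the actual GitLab implementation. `RateLimitError`, `raise_if_rate_limited!`, `run_stage`, and the in-memory `queue` are hypothetical stand-ins. The 429 check and `reset_in: RETRY_DELAY` mirror the diff in the validation steps below, and the 120-second value of `RETRY_DELAY` is an assumption based on the 2-minute retry mentioned here.

```ruby
# Hypothetical stand-in for Gitlab::GithubImport::RateLimitError.
class RateLimitError < StandardError
  attr_reader :reset_in

  def initialize(message = 'Rate Limit exceeded', reset_in: nil)
    super(message)
    @reset_in = reset_in
  end
end

# Assumed value: the description says the job is retried in 2 minutes.
RETRY_DELAY = 120 # seconds

# Turns a download result that reports a 429 into a RateLimitError.
# Any other result is returned unchanged.
def raise_if_rate_limited!(result)
  rate_limited = result[:status] == :error &&
    result[:message]&.include?('Received error code 429')

  raise RateLimitError.new('Rate Limit exceeded', reset_in: RETRY_DELAY) if rate_limited

  result
end

# Runs the given block. On rate limiting, records a delayed retry in the
# queue (a plain Array here) instead of failing the import.
def run_stage(queue)
  yield
  :finished
rescue RateLimitError => e
  queue << { job: :import_lfs_objects, delay: e.reset_in }
  :rescheduled
end
```

In this sketch, a rate-limited result leaves one delayed job in the queue and the stage reports `:rescheduled`, while any other result lets the stage finish normally.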

Retry logs when fetching the LFS objects list is rate limited
{
  "message": "starting stage",
  "import_stage": "Gitlab::GithubImport::Stage::ImportLfsObjectsWorker"
}
{
  "message": "starting importer",
  "importer": "Gitlab::GithubImport::Importer::LfsObjectImporter"
}
{
  "source": "Gitlab::GithubImport::Importer::LfsObjectsImporter",
  "message": "importer failed",
  "exception.message": "Rate Limit exceeded"
}
{
  "message": "stage retrying",
  "exception_class": "Gitlab::GithubImport::RateLimitError",
  "import_stage": "Gitlab::GithubImport::Stage::ImportLfsObjectsWorker"
}
{
  "message": "starting stage",
  "import_stage": "Gitlab::GithubImport::Stage::ImportLfsObjectsWorker"
}
{
  "message": "starting importer",
  "importer": "Gitlab::GithubImport::Importer::LfsObjectImporter"
}
{
  "source": "Gitlab::GithubImport::Importer::LfsObjectsImporter",
  "message": "importer failed",
  "exception.message": "Rate Limit exceeded"
}
{
  "message": "stage retrying",
  "exception_class": "Gitlab::GithubImport::RateLimitError",
  "import_stage": "Gitlab::GithubImport::Stage::ImportLfsObjectsWorker"
}
Retry logs when an individual LFS object download is rate limited

{
  "message": "starting importer",
  "project_id": 107,
  "importer": "Gitlab::GithubImport::Importer::LfsObjectImporter",
  "external_identifiers": {
    "oid": "23543ccc1eb8277355bc2efd49bf9ffd1a97cbbad2beac2e40d4d3176f7b0dd2",
    "size": 48440
  }
}
{
  "source": "Gitlab::GithubImport::Importer::LfsObjectImporter",
  "external_identifiers": {
    "oid": "23543ccc1eb8277355bc2efd49bf9ffd1a97cbbad2beac2e40d4d3176f7b0dd2",
    "size": 48440,
    "object_type": "lfs_object"
  },
  "message": "importer failed",
  "exception.message": "Rate Limit exceeded"
}
{
  "message": "starting importer",
  "project_id": 107,
  "importer": "Gitlab::GithubImport::Importer::LfsObjectImporter",
  "external_identifiers": {
    "oid": "23543ccc1eb8277355bc2efd49bf9ffd1a97cbbad2beac2e40d4d3176f7b0dd2",
    "size": 48440
  }
}
{
  "project_id": 107,
  "source": "Gitlab::GithubImport::Importer::LfsObjectImporter",
  "external_identifiers": {
    "oid": "23543ccc1eb8277355bc2efd49bf9ffd1a97cbbad2beac2e40d4d3176f7b0dd2",
    "size": 48440,
    "object_type": "lfs_object"
  },
  "message": "importer failed",
  "exception.message": "Rate Limit exceeded"
}
{
  "message": "starting importer",
  "project_id": 107,
  "importer": "Gitlab::GithubImport::Importer::LfsObjectImporter",
  "external_identifiers": {
    "oid": "23543ccc1eb8277355bc2efd49bf9ffd1a97cbbad2beac2e40d4d3176f7b0dd2",
    "size": 48440
  }
}
{
  "source": "Gitlab::GithubImport::Importer::LfsObjectImporter",
  "external_identifiers": {
    "oid": "23543ccc1eb8277355bc2efd49bf9ffd1a97cbbad2beac2e40d4d3176f7b0dd2",
    "size": 48440,
    "object_type": "lfs_object"
  },
  "message": "importer failed",
  "exception.message": "Rate Limit exceeded"
}

References

#582211 (closed)

How to set up and validate locally

  1. Create a repository on GitHub that uses LFS

To simulate a rate limit on the LFS object list download, change this line:

diff --git a/app/services/projects/lfs_pointers/lfs_download_link_list_service.rb b/app/services/projects/lfs_pointers/lfs_download_link_list_service.rb
index 80fdeeb74329..f075ee053a0b 100644
--- a/app/services/projects/lfs_pointers/lfs_download_link_list_service.rb
+++ b/app/services/projects/lfs_pointers/lfs_download_link_list_service.rb
@@ -56,7 +56,7 @@ def download_links_in_batches(oids, batch_size = REQUEST_BATCH_SIZE, &block)
       def download_links_for(oids)
         response = ::Import::Clients::HTTP.post(remote_uri, body: request_body(oids), headers: headers)
 
-        raise DownloadLinksRequestTooManyRequestsError if response.too_many_requests?
+        raise DownloadLinksRequestTooManyRequestsError if true
         raise DownloadLinksRequestEntityTooLargeError if response.request_entity_too_large?
         raise DownloadLinksError, response.message unless response.success?
  2. Start a new import from GitHub and observe the logs
  3. Verify that the LFS stage does not fail the import but requeues itself every 2 minutes

To simulate a rate limit on an individual LFS object download, change this line:

diff --git a/lib/gitlab/github_import/importer/lfs_object_importer.rb b/lib/gitlab/github_import/importer/lfs_object_importer.rb
index 09231e28d342..7490dcedfd27 100644
--- a/lib/gitlab/github_import/importer/lfs_object_importer.rb
+++ b/lib/gitlab/github_import/importer/lfs_object_importer.rb
@@ -23,7 +23,7 @@ def lfs_download_object
         def execute
           result = Projects::LfsPointers::LfsDownloadService.new(project, lfs_download_object).execute
 
-          if result[:status] == :error && result[:message]&.include?('Received error code 429')
+          if true
             raise Gitlab::GithubImport::RateLimitError.new('Rate Limit exceeded', reset_in: RETRY_DELAY)
           end
  1. Start a new import from GitHub and observe the logs

MR acceptance checklist

Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by George Koltsov
