
Draft: Allow Import to check placeholder loading progress

What does this MR do and why?

!156704 introduced the batch loading of import placeholder references from Redis to PostgreSQL.

This change allows an import to check if there are values in the Redis set before finishing the import:

Import::PlaceholderReferences.pending('github', 1).any?

If PlaceholderReferences::LoadService were unable to clear the Redis data, the import would otherwise keep checking .any? for up to 24 hours, until the cache expired.

As it shouldn't take 24 hours to load and clear the data, an importer can check the first_checked_at time to decide if it should just finalise the import and cut its losses.

Perhaps only importers that support timeout_strategy: :optimistic should do this.

Here's an example of checking the progress and retrying the finalising import worker later. It also queues another LoadPlaceholderContributionsWorker: the worker is idempotent, and queueing it ensures there is a worker to progress the loading of the data we're waiting for:

result = Import::PlaceholderReferences.pending('github', 1)

if result.any?
  if result.first_checked_at > 2.hours.ago
    Gitlab::Import::LoadPlaceholderContributionsWorker.perform_async('github', 1) 
    return retry_finalize_later
  else
    # log an error but continue to finalize the import
  end
end
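
The decision above can also be exercised outside of Rails with a small stand-in. This is a minimal sketch rather than the MR's implementation: `PendingResult`, `finalize_decision`, and the `max_wait` default of two hours are hypothetical, chosen only for illustration. It assumes nothing about the real result object beyond it responding to `any?` and `first_checked_at`, as in the example above.

```ruby
# Hypothetical stand-in for the object returned by
# Import::PlaceholderReferences.pending. Only the two methods used in the
# example above are modelled here.
PendingResult = Struct.new(:count, :first_checked_at) do
  def any?
    count.positive?
  end
end

# Decides what the finalising worker should do next.
# Returns :finalize, :retry_later, or :finalize_with_error.
# max_wait is in seconds; two hours mirrors the example above and is an
# assumption, not a confirmed setting.
def finalize_decision(result, now: Time.now, max_wait: 2 * 60 * 60)
  # Nothing left in Redis: safe to finish the import.
  return :finalize unless result.any?

  if result.first_checked_at > now - max_wait
    :retry_later          # still within the waiting window
  else
    :finalize_with_error  # waited too long: log an error and finish anyway
  end
end
```

Passing `now:` explicitly keeps the check deterministic in specs, so the time window can be tested without freezing the clock.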

#467511

MR acceptance checklist

Please evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.

How to set up and validate locally

Numbered steps to set up and validate the change are strongly suggested.

Related to #467511

Edited by Luke Duncalfe
