GitHub Importer: Option to fetch comments 1 at a time

Problem to solve

The MRs usually get imported, but they are missing a lot of metadata, specifically comments and diffs, and what does get imported is inconsistent. This example still occurs after the patches are applied.

In the last test I ran, only 155 of the 285 comments showed up for the large pull request I am using to validate the changes.

Aftermath

After more tests and investigations, this error was categorized as intermittent, probably something related to networking/infrastructure.

Also, we recently noticed a possible cause for missing data in the GitHub importer (#333246 (closed)). I'll (@kassio) focus on fixing that next, as it might be the reason for the missing diff note comments.

Proposed solution

Per this note:

  • Implement fetching issue comments and MR comments 1 at a time, instead of page by page.
  • Put this change behind a feature flag that is disabled by default.
  • Document this solution, including the recommended number of issues/MRs above which one should consider using this method.
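
As a rough illustration of the first two points, here is a minimal Python sketch of the two fetch strategies behind a flag that defaults to off. Everything in it is hypothetical and only meant to show the shape of the change: `FakeCommentSource` is an in-memory stand-in for the remote API, and `import_comments`, `single_fetch_enabled`, and `per_page` are made-up names, not the importer's actual code or flag. It also assumes "1 at a time" means one request per comment.

```python
from typing import Any, Dict, Iterator, List

Comment = Dict[str, Any]


class FakeCommentSource:
    """Hypothetical in-memory stand-in for a remote comments API."""

    def __init__(self, comments: List[Comment]) -> None:
        self._by_id: Dict[int, Comment] = {c["id"]: c for c in comments}
        self.requests = 0  # number of simulated API calls made so far

    def page(self, number: int, per_page: int) -> List[Comment]:
        """Return one page of comments (1-based page number)."""
        self.requests += 1
        ids = sorted(self._by_id)
        start = (number - 1) * per_page
        return [self._by_id[i] for i in ids[start:start + per_page]]

    def comment_ids(self) -> List[int]:
        """Return the ids of all comments in one call."""
        self.requests += 1
        return sorted(self._by_id)

    def comment(self, comment_id: int) -> Comment:
        """Return a single comment by id."""
        self.requests += 1
        return self._by_id[comment_id]


def fetch_paged(source: FakeCommentSource, per_page: int) -> Iterator[Comment]:
    """Page-by-page strategy: keep requesting pages until one is empty."""
    number = 1
    while True:
        batch = source.page(number, per_page)
        if not batch:
            return
        yield from batch
        number += 1


def fetch_one_at_a_time(source: FakeCommentSource) -> Iterator[Comment]:
    """Single-fetch strategy: list the ids once, then one request per comment."""
    for comment_id in source.comment_ids():
        yield source.comment(comment_id)


def import_comments(
    source: FakeCommentSource,
    single_fetch_enabled: bool = False,  # stand-in for the feature flag, off by default
    per_page: int = 100,
) -> List[Comment]:
    """Collect all comments using the strategy selected by the flag."""
    if single_fetch_enabled:
        return list(fetch_one_at_a_time(source))
    return list(fetch_paged(source, per_page))
```

In this sketch the single-fetch path makes roughly one request per comment instead of one per page, so it is much slower on large projects. That trade-off is why the flag stays off by default and why the documentation should say when switching is worth it.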

Out of scope

The following features/ideas were discussed but are not included in this issue:

  • Rewriting GitHub Importer to use GraphQL: not all the needed resources are available in GitHub's GraphQL.
  • A hybrid solution that utilizes both approaches (paging and fetching 1 at a time), as needed: too complex.
  • A solution that analyzes the data and recommends an approach: a potential future enhancement.
Edited by Haris Delalić