GitHub Importer: Option to fetch comments 1 at a time
Problem to solve
The MRs usually get imported, but they lack a lot of metadata, specifically the comments and diffs, and the comments and diffs that do arrive are inconsistent. This example is still occurring after the patches are applied.
The last test I ran showed 155/285 comments for the large pull request I am using to validate the changes.
Aftermath
After more tests and investigations, this error was categorized as intermittent, probably something related to networking/infrastructure.
Also, we recently noticed a possible cause for missing data in the GitHub importer (#333246 (closed)). I'll (@kassio) focus on fixing that next, as it might be the reason for the missing diff note comments.
Proposed solution
Per this note:
- Implement fetching issue comments and MR comments 1 at a time, instead of page by page.
- Put this change behind a feature flag that is disabled by default.
- Document this solution with the recommended number of issues/MRs above which one should consider using this method.
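To make the proposal concrete, here is a minimal sketch of the two fetch strategies behind a flag that defaults to off. It is not the importer's actual code: `FakeCommentSource`, `fetch_comments`, `single_fetch_enabled`, and the idea of listing comment IDs before fetching each one are all assumptions made for illustration, and the source is an in-memory stand-in rather than a real GitHub client.

```python
from typing import Dict, Iterator, List, Optional

Comment = Dict[str, object]


class FakeCommentSource:
    """In-memory stand-in for a remote comments API (hypothetical)."""

    def __init__(self, comments: List[Comment]) -> None:
        self._ordered = list(comments)
        self._by_id = {c["id"]: c for c in self._ordered}
        self.requests = 0  # counts simulated API calls

    def fetch_page(self, page: int, per_page: int) -> List[Comment]:
        """Return one page of comments; pages are 1-based."""
        self.requests += 1
        start = (page - 1) * per_page
        return self._ordered[start:start + per_page]

    def list_ids(self) -> List[object]:
        """Return every comment ID (assumed to be obtainable up front)."""
        self.requests += 1
        return [c["id"] for c in self._ordered]

    def fetch_one(self, comment_id: object) -> Optional[Comment]:
        """Return a single comment, or None if it does not exist."""
        self.requests += 1
        return self._by_id.get(comment_id)


def fetch_comments(
    source: FakeCommentSource,
    single_fetch_enabled: bool = False,  # hypothetical flag, off by default
    per_page: int = 100,
) -> Iterator[Comment]:
    """Yield all comments, page by page or one request per comment."""
    if single_fetch_enabled:
        for comment_id in source.list_ids():
            comment = source.fetch_one(comment_id)
            if comment is not None:
                yield comment
        return

    page = 1
    while True:
        batch = source.fetch_page(page, per_page)
        if not batch:  # an empty page means there is nothing left
            break
        yield from batch
        page += 1


if __name__ == "__main__":
    data = [{"id": i, "body": f"comment {i}"} for i in range(1, 286)]
    source = FakeCommentSource(data)
    paged = list(fetch_comments(source))
    single = list(fetch_comments(source, single_fetch_enabled=True))
    print(len(paged), len(single), paged == single)
```

In this sketch the one-at-a-time path issues one request per comment plus one for the ID list, so it costs far more requests than paging. That trade-off is why the flag stays disabled by default and why the documentation should recommend a threshold for when to enable it.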
Out of scope
The following features/ideas were discussed, but are not included in this issue:
- Rewriting GitHub Importer to use GraphQL.
  - Not all the needed resources are available in GitHub's GraphQL.
- A hybrid solution that utilizes both approaches (paging and fetching 1 at a time), as needed.
  - Too complex.
- A solution that analyzes the data and recommends an approach.
  - A potential future enhancement.
Edited by Haris Delalić