GitLab for Jira app improve initial sync: Sync commits
About
When a new group is connected to Jira, we sync some historical data for each project in the group, but we do not sync historical commit data (documented here).
Technical problem
There are known N+1 performance problems in Atlassian::JiraConnect::Serializers::CommitEntity: for every commit we serialize to sync with Jira, we perform multiple additional Gitaly calls.
There are some outstanding issues to introduce new Gitaly RPC methods that we could use to get certain data from Gitaly in a single call, rather than a call per commit:
- Gitaly: Add batch version of CommitStats RPC (gitaly#3375)
- Gitaly: Add support for multiple commits (batch... (gitaly#3374)
Those N+1 performance problems are currently marked as blockers for this issue.
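To make the difference concrete, here is a minimal sketch of the two call patterns. It does not use the real Gitaly client or CommitEntity: FakeGitalyClient and its commit_stats and batch_commit_stats methods are hypothetical stand-ins that only count round trips, and the batch method assumes the general shape a batch RPC might take.

```ruby
# Hypothetical stand-in for a Gitaly client. It performs no real RPCs and
# only counts how many round trips the caller would have made.
class FakeGitalyClient
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # One round trip per commit: the N+1 pattern described above.
  def commit_stats(sha)
    @calls += 1
    { sha: sha, additions: 0, deletions: 0 }
  end

  # One round trip for many commits: an assumed shape for a batch RPC.
  def batch_commit_stats(shas)
    @calls += 1
    shas.map { |sha| { sha: sha, additions: 0, deletions: 0 } }
  end
end

# 400 placeholder commit SHAs (zero-padded hex strings, not real commits).
shas = (1..400).map { |i| format("%040x", i) }

per_commit = FakeGitalyClient.new
shas.each { |sha| per_commit.commit_stats(sha) }

batched = FakeGitalyClient.new
batched.batch_commit_stats(shas)

puts "per-commit round trips: #{per_commit.calls}"
puts "batched round trips:    #{batched.calls}"
```

With a per-commit call the number of round trips grows linearly with the number of commits serialized, while a batch call keeps it constant for each kind of data fetched.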
Proposal
Although the Gitaly RPC issues are marked as blockers, we could possibly accept the inefficient N+1 calls to Gitaly if we limit the number of commits we sync, for example to only the latest 400 commits made to the repository.
This would work if:
- Limiting the initial sync of data to the latest 400 commits would still be useful for our customers.
- Making up to ~1,200 Gitaly calls per project that we sync in a group was okay for Category:Gitaly. We sync all projects in a group, but we process 1 project per minute. This would mean we would make up to an additional ~1,200 Gitaly calls per minute.
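The arithmetic behind these figures can be sketched as follows. The figure of 3 Gitaly calls per commit is an assumption inferred from the numbers above (~1,200 calls for 400 commits); the issue does not state it directly. The method names commits_to_sync and estimated_gitaly_calls are hypothetical and exist only for this illustration.

```ruby
# Keep only the newest commits. The default limit of 400 comes from the proposal.
# Expects the commits ordered newest first.
def commits_to_sync(commits_newest_first, limit: 400)
  commits_newest_first.first(limit)
end

# Estimate the Gitaly round trips needed to serialize the given number of commits.
# calls_per_commit: 3 is an assumption inferred from ~1,200 calls / 400 commits.
def estimated_gitaly_calls(commit_count, calls_per_commit: 3)
  commit_count * calls_per_commit
end

# A placeholder history of 1,000 commits, newest first.
history = (1..1_000).map { |i| "commit-#{i}" }

selected = commits_to_sync(history)
calls_per_project = estimated_gitaly_calls(selected.size)

# The proposal states that 1 project is processed per minute.
projects_per_minute = 1
calls_per_minute = calls_per_project * projects_per_minute

puts "commits synced:           #{selected.size}"
puts "Gitaly calls per project: #{calls_per_project}"
puts "Gitaly calls per minute:  #{calls_per_minute}"
```

These are upper bounds: a project with fewer than 400 commits would trigger proportionally fewer calls.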