Skip to content

checks: Implement infrastructure for bulk diff checks

Patrick Steinhardt requested to merge pks-bulk-diff-checks into master

What does this MR do?

The diff checks are quite expensive to compute given that we have to enumerate all new commits for each updated or new branch, which translates to one call to git rev-list $newrev --not --all. Performance linearly grows with the number of existing branches in the target repository: in the context of repos with hundreds of thousands of branches, it can easily take multiple seconds up to a few dozens of seconds. It doesn't help that we do this call once per reference.

This MR creates the infrastructure to do diff checks in bulk, so we can preleoad all new commits once and then pass these commits to the BulkCheck class. This avoids having to call the expensive ListCommits RPC multiple times.

With the bulk load, we do have one problem though: the BulkCheck class expects to get only those new commits which are in fact relevant to the change at hand. Luckily, we can rather extract all relevant commits from the set of new commits by walking through all new commits starting with the change's newrev. When we hit a commit ID which isn't part of our new commits, then we know that the commit must've been a preexisting commit in the repository.

Note that this allows the user to update a branch to a preexisting commit in the case where the diff between old and new commit would've included changes prohibited by DiffCheck. This is a preexisting twist though given that we'd previously have retrieved all new commits via doing git rev-list $newrev --not --all, which wouldn't return reachable old commits. So this is not a change in behaviour.

Part of #330327 (closed)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Patrick Steinhardt

Merge request reports