Backend: Refactor Rule::Clause::Exists to improve performance

Summary

In !148356 (merged) we introduced new subkeys to rules:exists: paths, project, and ref. This first iteration is not optimized for performance and could cause slowdowns in the future if rules:exists:project becomes widely used.

The code that supports the new subkeys introduces several new database and Gitaly calls, including ones to fetch the project and sha, and to check user permissions. If a pipeline has many nested includes using rules:exists:project, this could cause a noticeable performance degradation.

Further context:

The following discussion from !148356 (merged) should be addressed:

  • @lma-git started a discussion:

    @furkanayhan: I don't want to block this MR but I am scared that we'll face similar problems that we had in includes before with performance. At includes, we improved the performance by caching/memoizing/batch-loading project, permissions, etc. Do you think we should improve this area first before introducing this? I am asking this because we are implementing a logic that we fetch project and commit then check for permission and then create a context for each rule:exists.

    This reminded me of #351593 (closed) and #450687.

    @lmg-git: We can open a follow up issue to implement something like what you did with file.preload_context in Mapper::Verifier. I think we can adopt a similar approach in Mapper::Filter, but it definitely requires more consideration because of how the rules/clauses are loaded. I think it would also take a few iterations because I'd like to do refactoring like in https://gitlab.com/gitlab-org/gitlab/-/issues/454384 first to clean up the code.

Proposal

Refactor Rule::Clause::Exists and related classes to improve performance. Use techniques similar to the ones already employed for fetching include files, such as batch loading/preloading, memoizing, and caching.
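
As a rough illustration of the memoizing part of this proposal, here is a minimal standalone sketch. It is not GitLab code: FakeBackend, ExistsContextCache, and all of their methods are hypothetical names made up for this example, and the backend only counts calls instead of touching the database or Gitaly. The idea is that the project lookup and permission check happen once per project path, and the sha lookup once per (project, ref) pair, no matter how many rules:exists clauses reference them.

```ruby
# Hypothetical stand-in for the expensive database/Gitaly calls.
# It only counts how often each call is made.
class FakeBackend
  attr_reader :calls

  def initialize
    @calls = Hash.new(0)
  end

  def find_project(path)
    @calls[:find_project] += 1
    { path: path }
  end

  def can_read?(_user, _project)
    @calls[:can_read] += 1
    true
  end

  def resolve_sha(project, ref)
    @calls[:resolve_sha] += 1
    "sha-for-#{project[:path]}-#{ref}"
  end
end

# Hypothetical per-pipeline cache shared by all rules:exists clauses.
class ExistsContextCache
  def initialize(backend, user)
    @backend = backend
    @user = user
    @projects = {} # project path => project, or nil if missing/forbidden
    @contexts = {} # [project path, ref] => context hash, or nil
  end

  # Returns a memoized context for the given project path and ref.
  def context_for(project_path, ref)
    key = [project_path, ref]
    return @contexts[key] if @contexts.key?(key)

    project = readable_project(project_path)
    @contexts[key] = project && { project: project, sha: @backend.resolve_sha(project, ref) }
  end

  private

  # Fetches the project and checks permissions once per path.
  # key? is used so that nil results are memoized too.
  def readable_project(project_path)
    return @projects[project_path] if @projects.key?(project_path)

    project = @backend.find_project(project_path)
    allowed = project && @backend.can_read?(@user, project)
    @projects[project_path] = allowed ? project : nil
  end
end

backend = FakeBackend.new
cache = ExistsContextCache.new(backend, "some-user")

# Four clauses that reference the same project, with two distinct refs.
rules = [
  ["group/templates", "main"],
  ["group/templates", "main"],
  ["group/templates", "v2"],
  ["group/templates", "main"]
]
contexts = rules.map { |path, ref| cache.context_for(path, ref) }
```

With this shape, the four clauses above trigger one project fetch, one permission check, and two sha lookups instead of four of each. A real implementation would also have to decide where such a cache lives (for example on the pipeline context) and how it interacts with the way rules and clauses are loaded, which is the open question raised in the discussion above.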