Backend: Group files by projects in config_file_project_validate_access

Summary

  • Fix N+1 project call
  • Fix N+1 sha call

In config_file_project_validate_access, access is currently checked separately for each included file, even when several files belong to the same project. This can slow down the CI Lint API.

Tree Breakdown below:

LINT:

pipeline_creation_service ->
  yaml_process ->
    config_compose ->
      config_build_context ->
        config_build_variables
      config_expand ->
        config_yaml_load
        config_external_process ->
          config_mapper_process ->
            config_mapper_normalize ->
              config_mapper_variables
            config_mapper_rules
            config_mapper_wildcards
            config_mapper_variables
            config_mapper_select
            !verify!
              config_file_artifact_validate_context # not in ci lint
              ->>>>config_file_project_validate_access
              config_file_fetch_content ->
                config_file_fetch_local_content
                config_file_fetch_project_content
                config_file_fetch_remote_content
                config_file_fetch_template_content
              config_file_validate_content
              config_file_fetch_content_hash
              config_file_expand_content_includes ->
                config_mapper_process...
                config_external_verify
                config_external_merge
        config_yaml_extend
        config_tags_resolve
        config_stages_inject
      config_compose ->

The code we are tracking is:

def can_access_local_content?
  strong_memoize(:can_access_local_content) do
    context.logger.instrument(:config_file_project_validate_access) do
      Ability.allowed?(context.user, :download_code, project)
    end
  end
end

Proposal

Even though strong_memoize is used here, can_access_local_content? can still be improved by grouping files by project and batching the checks, similar to the work described in #382531 (closed).
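The grouping idea can be illustrated with a minimal, self-contained sketch. It is not GitLab's implementation: `ProjectFile`, `AccessChecker`, and `batch_validate_access` are hypothetical names invented for this example, and `AccessChecker#allowed?` merely stands in for `Ability.allowed?(context.user, :download_code, project)`. The sketch assumes the memoization above is scoped to each file object, so several files from the same project would each trigger their own check.

```ruby
# Hypothetical stand-in for an included file that lives in some project.
ProjectFile = Struct.new(:project_path, :location)

# Hypothetical stand-in for the permission check; it counts how often it runs.
class AccessChecker
  attr_reader :calls

  def initialize(allowed_projects)
    @allowed_projects = allowed_projects
    @calls = 0
  end

  # Plays the role of Ability.allowed?(user, :download_code, project).
  def allowed?(project_path)
    @calls += 1
    @allowed_projects.include?(project_path)
  end
end

# Run the permission check once per distinct project,
# then map the result back onto every file of that project.
def batch_validate_access(files, checker)
  access_by_project = files
    .map(&:project_path)
    .uniq
    .to_h { |path| [path, checker.allowed?(path)] }

  files.to_h { |file| [file, access_by_project.fetch(file.project_path)] }
end

files = [
  ProjectFile.new("group/templates", "build.yml"),
  ProjectFile.new("group/templates", "test.yml"),
  ProjectFile.new("group/templates", "deploy.yml"),
  ProjectFile.new("group/private", "secrets.yml")
]

checker = AccessChecker.new(["group/templates"])
results = batch_validate_access(files, checker)

results.each { |file, ok| puts "#{file.project_path}/#{file.location}: #{ok}" }
puts "permission checks performed: #{checker.calls}"
```

With four files spread over two projects, the check runs twice instead of four times. The same shape (collect distinct keys, resolve them in one pass, fan the results back out) would presumably apply to the sha lookup mentioned in the summary.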

Additional details

Some relevant technical details, if applicable, such as:

  • Does this need a feature flag?
  • Is there an example response showing the data structure that should be returned (new endpoints only)?
  • What permissions should be used?
  • Is this EE or CE?
    • EE
    • CE
  • Additional comments:

Implementation Table

NOTE: 🚨 All below issues can be done in parallel

| Work Type | Description | Issue link |
|---|---|---|
| backend | Backend: The gitlab-ci.yml is limited to 100 includes | #207270 (closed) |
| backend | Backend: Remove N+1 for Gitaly requests when fetching includes | #344829 (closed) |
| backend frontend | Improve the error messaging when fetching remote includes are timing out | #351168 (closed) |
| backend | Backend: Improve CI Linter performance through parallelizing HTTP calls | #351250 (closed) |
| backend | Backend: Caching includes to improve performance when using remote includes | #351252 |
| backend | Backend: Batch request calls to Gitaly when fetching include | #382531 (closed) |
| backend | Backend: Group files by projects in config_file_project_validate_access | 👈 You are here |

Links/References

Edited by Furkan Ayhan