Slow download_code permission check
Summary
During pipeline creation, when fetching CI/CD configuration files, we call Ability.allowed?(user, :download_code, project) to validate access to included files. This permission check is taking an unexpectedly long time (5-10+ seconds in some cases), contributing significantly to the overall configuration fetch timeout issues reported in #588313.
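To confirm how long an individual check takes, the call can be wrapped in a timer. The following is a minimal sketch, not the instrumentation used in the codebase: `timed_check`, `slow_check?`, and the 1-second threshold are hypothetical names and values chosen for illustration, and a stub block stands in for the real `Ability.allowed?` call so the snippet runs outside Rails.

```ruby
require "benchmark"

# Hypothetical threshold for flagging a check as slow; tune to taste.
SLOW_CHECK_THRESHOLD_SECONDS = 1.0

# Hypothetical helper: runs the block and returns [result, elapsed_seconds].
def timed_check
  result = nil
  elapsed = Benchmark.realtime { result = yield }
  [result, elapsed]
end

# Returns true when the elapsed time meets or exceeds the threshold.
def slow_check?(elapsed, threshold: SLOW_CHECK_THRESHOLD_SECONDS)
  elapsed >= threshold
end

# In a Rails console the block would be the real call, for example:
#   allowed, elapsed = timed_check { Ability.allowed?(user, :download_code, project) }
# Here a stub block simulates a permission check that takes about 10 ms.
allowed, elapsed = timed_check { sleep(0.01); true }
puts "allowed=#{allowed} elapsed=#{elapsed.round(4)}s slow=#{slow_check?(elapsed)}"
```

Returning the result alongside the duration keeps the helper usable in place of the original call, so timing can be added without changing control flow.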
Namespace Distribution is Interesting
https://log.gprd.gitlab.net/app/r/s/3I0gi
The logs show that this issue disproportionately affects specific namespaces rather than being evenly distributed. This suggests the slowness may be related to:
- Namespace/group complexity: Namespaces with deep group hierarchies or complex membership structures may require more database lookups to resolve permissions
- Number of members: Large organizations with many members might have slower permission resolution
- Custom roles or fine-grained permissions: More complex permission configurations could add overhead
- Group/project inheritance depth: Deep inheritance chains for permissions might cause more traversal
- Frequency of pipeline creation: High-activity namespaces naturally appear more often in the logs, so part of the skew may reflect pipeline volume rather than slower checks
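One way to separate the pipeline-volume effect from genuinely slower checks is to compare the share of slow checks per namespace rather than the raw count. The sketch below assumes log entries have already been exported into simple records; `Entry`, `slow_ratio_by_namespace`, the 5-second threshold, and the sample namespaces are all hypothetical and do not reflect the actual log schema or real data.

```ruby
# Hypothetical record for one logged permission check.
Entry = Struct.new(:namespace, :duration_s)

# Groups entries by namespace and reports the total number of checks,
# the number at or above the threshold, and the ratio between the two.
def slow_ratio_by_namespace(entries, threshold_s: 5.0)
  entries.group_by(&:namespace).transform_values do |rows|
    slow = rows.count { |row| row.duration_s >= threshold_s }
    { total: rows.size, slow: slow, slow_ratio: slow.fdiv(rows.size) }
  end
end

# Made-up sample data for illustration only.
sample = [
  Entry.new("group-a", 0.2),
  Entry.new("group-a", 0.3),
  Entry.new("group-a", 6.1),
  Entry.new("group-a", 0.1),
  Entry.new("group-b", 7.5),
  Entry.new("group-b", 9.0)
]

slow_ratio_by_namespace(sample).each do |namespace, stats|
  puts "#{namespace}: #{stats[:slow]}/#{stats[:total]} slow (ratio #{stats[:slow_ratio]})"
end
```

A namespace with many pipelines but a low slow ratio would point to volume, while a high ratio would point to something specific to that namespace's permission structure.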
Difficult to Reproduce
I attempted to reproduce the slowness in the production console using the same project and user from the logs, but the permission check completed quickly. This suggests the slowness may be:
- Intermittent / timing-dependent: Related to database load, connection pool saturation, or resource contention at specific times
- Context-dependent: Something specific to the Sidekiq worker context vs. console context (e.g., different connection pools, timeouts, or middleware)
- Cumulative: The slowness might only manifest after many sequential permission checks within the same request, possibly due to memory pressure or accumulated database connections
- Cache state dependent: The console run may have benefited from warmed caches that weren't available during the actual pipeline creation
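The cumulative and cache-state hypotheses above could be probed by timing many sequential checks and comparing the first call with later ones. This is a rough sketch under stated assumptions: `sequential_timings` and `summarize` are hypothetical helpers, and a console session may not reproduce the caching or connection behavior of a Sidekiq job, so a flat result here would not rule those hypotheses out.

```ruby
require "benchmark"

# Hypothetical helper: runs the block `iterations` times in sequence and
# returns the elapsed seconds of each run, in call order.
def sequential_timings(iterations)
  Array.new(iterations) { |i| Benchmark.realtime { yield(i) } }
end

# Summarizes timings so a slow first call (cold cache) or a slow tail
# (cumulative degradation) stands out.
def summarize(timings)
  sorted = timings.sort
  {
    first: timings.first,
    last: timings.last,
    max: sorted.last,
    median: sorted[sorted.size / 2]
  }
end

# In a Rails console the block would be the real call, for example:
#   timings = sequential_timings(50) { Ability.allowed?(user, :download_code, project) }
# Here a stub block stands in for the permission check.
timings = sequential_timings(5) { |i| i * 2 }
puts summarize(timings)
```

A first timing far above the median would be consistent with a cold cache, while timings that grow toward the end of the run would be consistent with the cumulative explanation.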