Performance improvements for gitlab:doctor:secrets
What does this MR do and why?
Performance improvements for gitlab:doctor:secrets
- Use `each_batch` instead of `find_each` for more efficient batching per https://docs.gitlab.com/development/database/iterating_tables_in_batches/.
- Only check rows with a not-null encrypted column. This allows us to skip most `Ci::Build` entries.
Credits to https://gitlab.com/mbobin for suggesting this change.
Changelog: performance
Background
The gitlab:doctor:secrets rake task becomes much slower as GitLab usage/data grows. This is mainly due to the number of Ci::Build records analysed by the task.
This MR aims to make the rake task more performant by skipping Ci::Build records with a NULL token, which make up the vast majority of records (the token is set to NULL once a CI job finishes).
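To illustrate why the NULL filter matters, here is a minimal plain-Ruby sketch of a batched scan that only visits rows with a non-nil encrypted column. It does not load Rails or GitLab and is not the rake task's actual code: `Row` and `each_encrypted_batch` are hypothetical stand-ins, and the in-memory array stands in for the `ci_builds` table.

```ruby
# Hypothetical stand-in for a database row; only the two fields we need.
Row = Struct.new(:id, :token_encrypted)

# Hypothetical helper (not GitLab's EachBatch): keeps only rows whose
# encrypted column is set, orders them by id, and yields them in slices
# of at most `of` rows.
def each_encrypted_batch(rows, of: 1000)
  rows
    .reject { |row| row.token_encrypted.nil? } # like WHERE token_encrypted IS NOT NULL
    .sort_by(&:id)
    .each_slice(of) { |batch| yield batch }
end

# 10 finished jobs (token cleared) and 3 jobs whose token is still set.
rows = (1..10).map { |i| Row.new(i, nil) } +
       (11..13).map { |i| Row.new(i, "enc-#{i}") }

checked = 0
batches = 0
each_encrypted_batch(rows, of: 2) do |batch|
  batches += 1
  checked += batch.size # the real task would try to decrypt each value here
end

puts "checked #{checked} of #{rows.size} rows in #{batches} batches"
```

In this toy data set only 3 of 13 rows are ever inspected, which mirrors the effect described above: the more finished jobs a table holds, the larger the share of rows that can be skipped.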
References
Closes Investigate gitlab:doctor:secrets rake task spe... (#518702 - closed)
Performance Evaluation
Setup
- Fresh Omnibus install.
- Ubuntu 22.04 LTS amd64, 4 vCPU, 16 GB RAM (e2-standard-4 GCP VM).
- GitLab 17.11.1.
Methodology
- Create a project and pipeline via UI.
- Check that Ci::Build `token_encrypted` is nil after a job completes.
- Duplicate and save a Ci::Build entry N times (via a rails console).
- Measure (real) time with `time sudo gitlab-rake gitlab:doctor:secrets`.
Results
| Ci::Build.count | Runtime with unpatched `gitlab:doctor:secrets` | Runtime with patched `gitlab:doctor:secrets` | Improvement |
|---|---|---|---|
| 10k | 2m18s | 1m | 1m18s (~56%) |
| 50k | 9m19s | 1m3s | 8m16s (~88%) |
| 100k | 20m14s | 1m17s | 18m57s (~93%) |
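
The Improvement column can be re-derived from the two runtime columns. The sketch below is only a convenience for checking the arithmetic; the helper names `to_seconds` and `improvement` are made up for this example, and the percentage is truncated rather than rounded, which appears to be how the table's figures were obtained.

```ruby
# Convert a duration such as "2m18s" or "1m" into seconds.
def to_seconds(text)
  match = text.match(/\A(?:(\d+)m)?(?:(\d+)s)?\z/)
  raise ArgumentError, "unrecognised duration: #{text}" unless match

  match[1].to_i * 60 + match[2].to_i # a missing part is nil, and nil.to_i is 0
end

# Time saved by the patch, in seconds and as a truncated percentage
# of the unpatched runtime.
def improvement(unpatched, patched)
  before = to_seconds(unpatched)
  after = to_seconds(patched)
  saved = before - after
  { saved_seconds: saved, percent: (saved * 100.0 / before).floor }
end

# Timings copied from the table above: row count => [unpatched, patched].
results = {
  "10k" => ["2m18s", "1m"],
  "50k" => ["9m19s", "1m3s"],
  "100k" => ["20m14s", "1m17s"]
}

results.each do |row_count, (unpatched, patched)|
  r = improvement(unpatched, patched)
  minutes, seconds = r[:saved_seconds].divmod(60)
  puts "#{row_count}: saved #{minutes}m#{seconds}s (~#{r[:percent]}%)"
end
```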
How to set up and validate locally
MR acceptance checklist
Evaluate this MR against the MR acceptance checklist. It helps you analyze changes to reduce risks in quality, performance, reliability, security, and maintainability.
Edited by Clemens Beck