Slow MySQL query on CI pipelines results in timeout on /branches page of repo
Summary
A slow SQL query on `ci_pipelines` causes the worker to time out, so visiting the `../branches` page of a specific repo returns a 502 error.
Example query
Project.find_by_full_path('root/build-test').pipelines.latest_successful_for_refs(Project.find_by_full_path('root/build-test').repository.branches.map(&:name))
D, [2017-07-03T04:41:39.078580 #12468] DEBUG -- : Ci::Pipeline Load (1.8ms) SELECT `ci_pipelines`.* FROM `ci_pipelines` WHERE `ci_pipelines`.`project_id` = 1 AND `ci_pipelines`.`status` = 'success' AND `ci_pipelines`.`ref` = 'master' AND `ci_pipelines`.`id` IN (SELECT max(`ci_pipelines`.id) FROM `ci_pipelines` WHERE `ci_pipelines`.`project_id`
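For reference, the lookup can be timed in isolation from a Rails console. This is a minimal sketch, assuming `gitlab-rails console` access on the affected instance and the same project path as above:

# Times the pipeline lookup on its own, outside the web request.
require 'benchmark'

project = Project.find_by_full_path('root/build-test')
refs    = project.repository.branches.map(&:name)

elapsed = Benchmark.realtime do
  project.pipelines.latest_successful_for_refs(refs)
end
puts "latest_successful_for_refs took #{elapsed.round(1)}s for #{refs.size} refs"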
Result from customer's instance
SELECT `ci_pipelines`.* FROM `ci_pipelines` WHERE <snip>
2526 rows in set (3 min 3.13 sec)
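The query plan can be pulled from the same console to see where MySQL spends the time. This is a hedged sketch of a simplified form of the query (status and ref filters only); the customer's query also wraps an `IN (SELECT max(id) ...)` subquery, which may need to be EXPLAINed as raw SQL to match their output exactly:

# EXPLAIN a simplified version of the slow query through ActiveRecord.
project = Project.find_by_full_path('root/build-test')
refs    = project.repository.branches.map(&:name)

puts project.pipelines.where(status: 'success', ref: refs).explain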
Troubleshooting steps taken and relevant logs and/or screenshots
Links are GitLab internal only
- Reproduced this while tailing the logs, which revealed Unicorn processes exceeding the memory limit. See output in gitlab-logs.
- We increased the worker timeout and raised the memory limits in `gitlab.rb`, with no success:
unicorn['worker_timeout'] = 120
unicorn['worker_memory_limit_min'] = "600 * 1 << 20"
unicorn['worker_memory_limit_max'] = "700 * 1 << 20"
- We profiled the `../branches` URL and saw a MySQL query taking a long time. See the ruby-prof call tree.
- We then dumped the MySQL queries to a file (one way to capture this is sketched after this list). See output in active_sql_record.log.
- This confirmed the slow SQL query shown above. We then profiled that query. See output in gitlab_sql_profile.log.
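For completeness, the sketch below shows one way the SQL dump above could be captured from a Rails console; this is an assumption about how the dump was produced, and the /tmp path is only illustrative:

# Route ActiveRecord's SQL logging to a file, then re-run the reproduction so
# every query (including the slow one) lands in the dump.
require 'logger'

ActiveRecord::Base.logger       = Logger.new('/tmp/active_sql_record.log')
ActiveRecord::Base.logger.level = Logger::DEBUG

project = Project.find_by_full_path('root/build-test')
project.pipelines.latest_successful_for_refs(project.repository.branches.map(&:name))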