Slow MySQL query on CI pipelines results in timeout on /branches page of repo

Summary

A slow SQL query results in a 502 error when visiting the ../branches page of a specific repo.

Example query

Project.find_by_full_path('root/build-test').pipelines.latest_successful_for_refs(Project.find_by_full_path('root/build-test').repository.branches.map(&:name))
D, [2017-07-03T04:41:39.078580 #12468] DEBUG -- :   Ci::Pipeline Load (1.8ms)  SELECT `ci_pipelines`.* FROM `ci_pipelines` WHERE `ci_pipelines`.`project_id` = 1 AND `ci_pipelines`.`status` = 'success' AND `ci_pipelines`.`ref` = 'master' AND `ci_pipelines`.`id` IN (SELECT max(`ci_pipelines`.id) FROM `ci_pipelines` WHERE `ci_pipelines`.`project_id` 
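
For reference, a rough way to time this relation from a Rails console is sketched below; it assumes the same root/build-test reproduction project as above and simply forces the relation to load inside a Benchmark block.

# Rough timing sketch for the relation above (assumes the root/build-test project exists).
require 'benchmark'

project      = Project.find_by_full_path('root/build-test')
branch_names = project.repository.branches.map(&:name)

elapsed = Benchmark.realtime do
  project.pipelines.latest_successful_for_refs(branch_names).to_a
end
puts "latest_successful_for_refs took #{elapsed.round(2)}s for #{branch_names.size} branches"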

Result from customer's instance

SELECT `ci_pipelines`.* FROM `ci_pipelines` WHERE <snip>
2526 rows in set (3 min 3.13 sec)
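
One way to dig into the 3-minute runtime is to wrap the captured statement in an EXPLAIN, either in the mysql client or from a Rails console. A minimal sketch follows; the SQL string is a placeholder for the full statement captured in the log.

# Sketch: run EXPLAIN on the captured statement to check whether MySQL can use an
# index for the IN (SELECT max(`ci_pipelines`.id) ...) subquery.
sql  = "SELECT `ci_pipelines`.* FROM `ci_pipelines` WHERE ..."  # placeholder: paste the full slow query here
plan = ActiveRecord::Base.connection.execute("EXPLAIN #{sql}")
plan.to_a.each { |row| puts row.inspect }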

Troubleshooting steps taken and relevant logs and/or screenshots

Links are GitLab internal only

  • Reproduced the issue while tailing the logs, which revealed Unicorn worker processes exceeding the memory limit. See output in gitlab-logs.
  • We increased the worker timeout and set memory limits in gitlab.rb, with no success:
unicorn['worker_timeout'] = 120                       # default is 60 seconds
unicorn['worker_memory_limit_min'] = "600 * 1 << 20"  # 600 MB
unicorn['worker_memory_limit_max'] = "700 * 1 << 20"  # 700 MB
  • We profiled the URL ../branches and saw a MySQL query taking a long time. See the ruby-prof-call-tree output; a sketch of how such a profile can be generated is included after this list.
  • We then dumped the MySQL queries issued by ActiveRecord to a file. See output in active_sql_record.log; a sketch of that logging setup also follows this list.
  • This showed the slow SQL query mentioned above; we then profiled that query itself. See output in gitlab_sql_profile.log.
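
For reference, the ruby-prof call-tree output mentioned above can be generated along these lines from a Rails console. This is only a rough sketch using the ruby-prof gem directly, not the exact commands that were run; the output path and profile name are arbitrary.

# Rough sketch: profile the slow relation with ruby-prof and write a
# callgrind-compatible call tree (viewable in kcachegrind/qcachegrind).
require 'ruby-prof'

project = Project.find_by_full_path('root/build-test')
refs    = project.repository.branches.map(&:name)

result = RubyProf.profile do
  project.pipelines.latest_successful_for_refs(refs).to_a
end

RubyProf::CallTreePrinter.new(result).print(path: '/tmp', profile: 'branches')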

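Similarly, the SQL dump in active_sql_record.log can be reproduced by pointing the ActiveRecord logger at a file before triggering the slow code path. A minimal sketch, with an arbitrary log path:

# Sketch: log every SQL statement ActiveRecord issues while the slow relation runs.
require 'logger'

ActiveRecord::Base.logger       = Logger.new('/tmp/active_record_sql.log')
ActiveRecord::Base.logger.level = Logger::DEBUG

project = Project.find_by_full_path('root/build-test')
project.pipelines.latest_successful_for_refs(project.repository.branches.map(&:name))
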
Links

Zendesk: https://gitlab.zendesk.com/agent/tickets/79448
