Database statement timeout loading merged branches in www-gitlab-com
When @vshumilo loaded https://gitlab.com/gitlab-com/www-gitlab-com/-/merge_requests with Merged date
sort order, she saw a 500 error. https://log.gprd.gitlab.net/goto/df54a310-307b-11ed-b86b-d963a1a6788e shows there is a database statement timeout:
/*application:web,correlation_id:01GCHT4NFJDPX31K07Z4D3QEH2,endpoint_id:Projects::MergeRequestsController#index,db_config_name:main_replica*/ SELECT "merge_requests".*, "merge_request_metrics"."latest_closed_at" AS merge_request_metrics_latest_closed_at, "merge_request_metrics"."id" AS merge_request_metrics_id FROM "merge_requests" INNER JOIN "merge_request_metrics" ON "merge_request_metrics"."merge_request_id" = "merge_requests"."id" WHERE ("merge_requests"."state_id" IN ($1)) AND "merge_request_metrics"."target_project_id" = $2 ORDER BY "merge_request_metrics"."latest_closed_at" ASC NULLS LAST, "merge_request_metrics"."id" DESC LIMIT $4 OFFSET $3
This query has to do an inner join of all merge_request_metrics
and merge_requests
, and then return the first 20. This is likely too expensive of a query.
The backtrace is:
"exception.backtrace": [
"lib/gitlab/database/load_balancing/connection_proxy.rb:104:in `block in read_using_load_balancer'",
"lib/gitlab/database/load_balancing/load_balancer.rb:55:in `read'",
"lib/gitlab/database/load_balancing/connection_proxy.rb:103:in `read_using_load_balancer'",
"lib/gitlab/database/load_balancing/connection_proxy.rb:48:in `select_all'",
"lib/gitlab/issuable_metadata.rb:67:in `map'",
"lib/gitlab/issuable_metadata.rb:67:in `block in issuable_ids'",
"lib/gitlab/utils/strong_memoize.rb:44:in `strong_memoize'",
"lib/gitlab/issuable_metadata.rb:62:in `issuable_ids'",
"lib/gitlab/issuable_metadata.rb:28:in `data'",
"app/controllers/concerns/issuable_collections.rb:54:in `set_pagination'",
"app/controllers/concerns/issuable_collections.rb:27:in `set_issuables_index'",
"ee/lib/gitlab/ip_address_state.rb:10:in `with'",
"ee/app/controllers/ee/application_controller.rb:45:in `set_current_ip_address'",
"app/controllers/application_controller.rb:531:in `set_current_admin'",
"lib/gitlab/session.rb:11:in `with_session'",
Note that even this query takes a full second:
gitlabhq_production=# explain (analyze, buffers) select * from merge_request_metrics where target_project_id = 7764 ORDER BY merged_at DESC LIMIT 20;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.57..2017.34 rows=20 width=140) (actual time=501.330..1046.771 rows=20 loops=1)
Buffers: shared hit=143 read=1351 dirtied=1018
I/O Timings: read=655.764
-> Index Scan Backward using index_merge_request_metrics_on_merged_at on merge_request_metrics (cost=0.57..9778286.96 rows=96970 width=140) (actual time=501.328..1046.743 rows=20 loops=1)
Filter: (target_project_id = 7764)
Rows Removed by Filter: 10922
Buffers: shared hit=143 read=1351 dirtied=1018
I/O Timings: read=655.764
Planning Time: 71.667 ms
Execution Time: 1046.831 ms
(10 rows)
This isn't great, and we might be able to speed it up with an index on merged_at
and target_project_id
. Then we can do the JOIN with the merge_requests
.
Edited by Stan Hu