Groups Merge Request List API seems to be parsing all Merge Requests globally, leading to notable performance inefficiencies

As part of performance testing the Groups Merge Request List API we found some discrepancies in our results data. Namely on select environments the endpoint appeared to perform notably worse.

After extensive investigation it appears that the endpoint has a notable inefficiency, with it parsing all Merge Requests on the environment - Even ones that aren't found under the group being polled.

Furthermore this inefficiently appears to be exponential with the endpoint degrading significantly as the number of MRs increase. This is illustrated in the numbers on each environment:

  • 10k AWS: Additional copy of our GitLab project - 7218 MRs total, TTFB P90 Performance result of around 11s
  • 50k GCP: A copy of the Linux project - 10519 MRs total, TTFB P90 performance result of 30s
  • Standard Data: 3609 MRs total, TTFB P90 performance result of around 300ms - 2s (Database Load Balancing contributes here to the lower end)

After removing the additional copy of the GitLab project on the 10k AWS environment, reducing the number of MRs back down to 3609, the endpoint dropped notably from 11s to around 300ms:

Additional copy of GitLab:
█ Results summary

* Environment:                10k
* Environment Version:        15.5.0-pre `e916d96e00c`
* Option:                     60s_200rps
* Date:                       2022-10-11
* Run Time:                   1m 7.35s (Start: 11:01:27 UTC, End: 11:02:34 UTC)
* GPT Version:                v2.11.0

NAME                         | RPS   | RPS RESULT         | TTFB AVG  | TTFB P90              | REQ STATUS     | RESULT
-----------------------------|-------|--------------------|-----------|-----------------------|----------------|-----------------
api_v4_groups_merge_requests | 200/s | 22.16/s (>16.00/s) | 8191.91ms | 11152.18ms (<17500ms) | 100.00% (>99%) | Passed¹

Additional copy of GitLab REMOVED:
█ Results summary

* Environment:                10k
* Environment Version:        15.5.0-pre `e916d96e00c`
* Option:                     60s_200rps
* Date:                       2022-10-11
* Run Time:                   1m 3.77s (Start: 11:17:41 UTC, End: 11:18:45 UTC)
* GPT Version:                v2.11.0

NAME                         | RPS   | RPS RESULT          | TTFB AVG | TTFB P90            | REQ STATUS     | RESULT
-----------------------------|-------|---------------------|----------|---------------------|----------------|-----------------
api_v4_groups_merge_requests | 200/s | 196.68/s (>16.00/s) | 187.21ms | 211.85ms (<17500ms) | 100.00% (>99%) | Passed¹

So a jump here from 3609 to 7218 MRs leads to an almost 11s performance hit.

Additionally, the queries that look to be the cause on both of the affected environments are:

50k GCP:
      total_hours      |   total_seconds    |       avg_millis       | num_calls |                                                  query
-----------------------+--------------------+------------------------+-----------+----------------------------------------------------------------------------------------------------------
     9.461682561586697 |  34062.05722171211 |      5.201520355898108 |      2559 | /*application:web,correlation_id:01GEPBX8ZB5YE6DBFWY3JG3M55,endpoint_id:GET /api/:version/groups/:id/mer.
                       |                    |                        |           |.ge_requests,db_config_name:main*/ SELECT "merge_requests".* FROM "merge_requests" INNER JOIN "projects" .
                       |                    |                        |           |.ON "projects"."id" = "merge_requests"."target_project_id" LEFT JOIN project_features ON projects.id = pr.
                       |                    |                        |           |.oject_features.project_id WHERE "projects"."namespace_id" IN (SELECT namespaces.traversal_ids[array_leng.
                       |                    |                        |           |.th(namespaces.traversal_ids, $1)] AS id FROM "namespaces" WHERE "namespaces"."type" = $2 AND (traversal_.
                       |                    |                        |           |.ids @> ($3))) AND (EXISTS (SELECT $4 FROM "project_authorizations" WHERE "project_authorizations"."user_.
                       |                    |                        |           |.id" = $5 AND (project_authorizations.project_id = projects.id)) OR projects.visibility_level IN ($6,$7,$.
                       |                    |                        |           |.8)) AND ("project_features"."merge_requests_access_level" > $9 OR "project_features"."merge_requests_acc.
                       |                    |                        |           |.ess_level" IS NULL) AND "projects"."archived" = $10 ORDER BY "merge_requests"."id" DESC LIMIT $11 OFFSET.
                       |                    |                        |           |. $2129838 

10k AWS:
     1.0532090130294443 |  3791.5524469059997 |   7999.055795160336 |       474 | /*application:web,correlation_id:01GF0T4368DNCKQNGGVN53YZ8F,endpoint_id:GET /api/:version/groups/:id/merg.
                        |                     |                     |           |.e_requests,db_config_name:main_replica*/ SELECT "merge_requests".* FROM "merge_requests" INNER JOIN "proj.
                        |                     |                     |           |.ects" ON "projects"."id" = "merge_requests"."target_project_id" LEFT JOIN project_features ON projects.id.
                        |                     |                     |           |. = project_features.project_id WHERE "projects"."namespace_id" IN (SELECT namespaces.traversal_ids[array_.
                        |                     |                     |           |.length(namespaces.traversal_ids, $1)] AS id FROM "namespaces" WHERE "namespaces"."type" = $2 AND (travers.
                        |                     |                     |           |.al_ids @> ($3))) AND (EXISTS (SELECT $4 FROM "project_authorizations" WHERE "project_authorizations"."use.
                        |                     |                     |           |.r_id" = $5 AND (project_authorizations.project_id = projects.id)) OR projects.visibility_level IN ($6,$7,.
                        |                     |                     |           |.$8)) AND ("project_features"."merge_requests_access_level" > $9 OR "project_features"."merge_requests_acc.
                        |                     |                     |           |.ess_level" IS NULL) AND "projects"."archived" = $10 ORDER BY "merge_requests"."id" DESC LIMIT $11 OFFSET .
                        |                     |                     |           |.$12

To reiterate the test we run here is to ask for all MRs of a specific Group that contains 1 copy of the GitLab project. Also of note this drop in performance wasn't seen on other endpoints such as Group Issues List API (where there is also additional data on the affected environments above).

Since this issue is one more of an inefficient query it doesn't fit nicely into our performance targets, but due to the exponential performance drop of the endpoint I'll raise this as a severity2 to start but happy to discuss.

Edited by 🤖 GitLab Bot 🤖