
Speed up pipelines API

Adam Hegyi requested to merge ah-fast-pipelines-api into master

What does this MR do?

Make /api/projects/:id/pipelines a bit faster. The change is behind a feature flag (fast_pipelines_api); if you want to try it out on your local GDK installation, make sure you enable it: Feature.enable(:fast_pipelines_api).

Reasoning

In production we're getting above 115 rps with an average response time of 60 ms; this is one of the most frequently called endpoints.

How:

The format of the response is not too complex, so moving the JSON serialization to PostgreSQL is doable with a little extra complexity.

Pros:

  • Less time spent serving requests for the pipelines endpoint.
  • Less memory is used, since we don't instantiate AR models and don't serialize the data in Ruby.

Cons:

  • Introduces a new way of serializing JSON, which might increase the cognitive load when making changes to the API.

Based on the results I got from the perf test, I think it's worth trying it in production to see how much difference it makes.

Move the JSON generation to PostgreSQL:

Old query:

SELECT * FROM ci_pipelines
WHERE project_id = 278964 AND ref = 'master'
ORDER BY id DESC LIMIT 20;

New query (simplified):

SELECT ARRAY_TO_JSON(ARRAY_AGG(ROW_TO_JSON(ci_pipelines))) AS result
FROM (
  SELECT "ci_pipelines"."id",
         "ci_pipelines"."sha",
         "ci_pipelines"."ref",
         "ci_pipelines"."status",
         "ci_pipelines"."created_at",
         "ci_pipelines"."updated_at",
         REPLACE('http://localhost:3001/gitlab-org/gitlab-test/pipelines/PIPELINE_ID', 'PIPELINE_ID', CAST("ci_pipelines"."id" AS VARCHAR)) AS web_url
  FROM "ci_pipelines"
  WHERE "ci_pipelines"."project_id" = 278964
  ORDER BY "ci_pipelines"."ref" DESC
  LIMIT 20
) ci_pipelines

How

We have the "base" query that takes N records from the database:

SELECT * FROM ci_pipelines
WHERE project_id = 278964 AND ref = 'master'
ORDER BY id DESC LIMIT 20;

Let's replace the * with the exact columns we need to match the current API schema:

SELECT
  "ci_pipelines"."id",
  "ci_pipelines"."sha",
  "ci_pipelines"."ref",
  "ci_pipelines"."status",
  TO_CHAR("ci_pipelines"."created_at", 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') AS created_at,
  TO_CHAR("ci_pipelines"."updated_at", 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"') AS updated_at,
  REPLACE('http://localhost/namespace160/project160/pipelines/PIPELINE_ID', 'PIPELINE_ID', CAST("ci_pipelines"."id" AS VARCHAR)) AS web_url
FROM ci_pipelines
WHERE project_id = 278964 AND ref = 'master'
ORDER BY id DESC LIMIT 20;
  • Timestamps: we need custom formatting to match Ruby's JSON date serialization format.
  • URL: we pass a dummy placeholder when generating the URL and replace it in SQL with the real pipeline id: Gitlab::Routing.url_helpers.project_pipeline_url(project, 'PIPELINE_ID') (see the sketch below).
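
Roughly, the Ruby side of this projection could look like the sketch below. This is illustrative only, not the exact diff: project and a filtered, ordered, and limited pipelines scope are assumed to be available.

# Sketch: project the exact columns the API schema needs, letting PostgreSQL
# format the timestamps and substitute the pipeline id into the URL template.
table = Ci::Pipeline.arel_table
timestamp_format = 'YYYY-MM-DD"T"HH24:MI:SS.MS"Z"'

# Generate the URL once with a placeholder; quote it before interpolating into SQL.
url_template = Gitlab::Routing.url_helpers.project_pipeline_url(project, 'PIPELINE_ID')
quoted_url = Ci::Pipeline.connection.quote(url_template)

columns = [
  table[:id],
  table[:sha],
  table[:ref],
  table[:status],
  Arel.sql("TO_CHAR(ci_pipelines.created_at, '#{timestamp_format}') AS created_at"),
  Arel.sql("TO_CHAR(ci_pipelines.updated_at, '#{timestamp_format}') AS updated_at"),
  Arel.sql("REPLACE(#{quoted_url}, 'PIPELINE_ID', CAST(ci_pipelines.id AS VARCHAR)) AS web_url")
]

relation = pipelines.select(*columns)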

Instead of selecting records from ci_pipelines, select records from the query above:

SELECT
ARRAY_TO_JSON(ARRAY_AGG(ROW_TO_JSON(ci_pipelines))) AS result 
FROM (QUERY_ABOVE) ci_pipelines
  1. Turn each row into JSON.
  2. Aggregate the JSON rows into an array (result: 1 row).
  3. Convert the array to a JSON blob (result: 1 row; see the Ruby sketch below).
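
In Ruby, this wrapping step could look roughly like this (a sketch, assuming relation is the projected scope from the previous step):

# Sketch: wrap the projected relation and fetch the single JSON result row.
inner_sql = relation.to_sql

json = Ci::Pipeline.connection.select_value(<<~SQL)
  SELECT ARRAY_TO_JSON(ARRAY_AGG(ROW_TO_JSON(ci_pipelines))) AS result
  FROM (#{inner_sql}) ci_pipelines
SQL

# ARRAY_AGG over zero rows yields NULL, so fall back to an empty JSON array.
json || '[]'

The endpoint can then return this string as the response body without instantiating AR models or Grape entities.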

Performance

I used ApacheBench (ab) to measure the API performance (RAILS_ENV=production).

ab -n 1000 -c 5 -H "PRIVATE-TOKEN: YOUR_TOKEN" "http://localhost:3001/api/v4/projects/1/pipelines?per_page=100"

API Performance with the old query:

Server Software:
Server Hostname:        localhost
Server Port:            3001

Document Path:          /api/v4/projects/1/pipelines?per_page=100
Document Length:        24479 bytes

Concurrency Level:      5
Time taken for tests:   18.047 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      25243910 bytes
HTML transferred:       24479000 bytes
Requests per second:    55.41 [#/sec] (mean)
Time per request:       90.235 [ms] (mean)
Time per request:       18.047 [ms] (mean, across all concurrent requests)
Transfer rate:          1366.01 [Kbytes/sec] received

API Performance with the new query:

Server Software:
Server Hostname:        localhost
Server Port:            3001

Document Path:          /api/v4/projects/1/pipelines?per_page=100
Document Length:        24479 bytes

Concurrency Level:      5
Time taken for tests:   8.879 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      25243906 bytes
HTML transferred:       24479000 bytes
Requests per second:    112.63 [#/sec] (mean)
Time per request:       44.394 [ms] (mean)
Time per request:       8.879 [ms] (mean, across all concurrent requests)
Transfer rate:          2776.53 [Kbytes/sec] received

Raw performance comparison without involving the Rails web stack:

RAILS_ENV=production rails runner pipelines_api_perf_check.rb

pipelines_api_perf_check.rb

              user     system      total        real
grape entity 32.409264   0.032746  32.442010 ( 32.554918)
sql entity  2.234352   0.011249   2.245601 (  2.748364)
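
The script isn't reproduced here, but the comparison is roughly of this shape (a sketch; the entity class name, project id, and iteration count are assumptions, not the actual script):

require 'benchmark'

# Assumption: the Grape entity used by the current API is API::Entities::PipelineBasic.
scope = Ci::Pipeline.where(project_id: 1).order(id: :desc).limit(100)
records = scope.to_a # load once; the goal is to compare the serialization paths

json_sql = <<~SQL
  SELECT ARRAY_TO_JSON(ARRAY_AGG(ROW_TO_JSON(ci_pipelines))) AS result
  FROM (#{scope.select(:id, :sha, :ref, :status, :created_at, :updated_at).to_sql}) ci_pipelines
SQL

Benchmark.bm(12) do |x|
  x.report('grape entity') { 1_000.times { API::Entities::PipelineBasic.represent(records).to_json } }
  x.report('sql entity')   { 1_000.times { Ci::Pipeline.connection.select_value(json_sql) } }
end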

Conformity

Availability and Testing

Security

Since we're taking data directly from the database, extra care is necessary when selecting the columns. I used Arel to build the query.
