Skip to content

Many requests take almost exactly 5 seconds in the persistent Redis

While investigating #1115 (closed) I noticed that there are a lot of requests that take almost exactly 5 seconds in the persistent Redis. This is the distribution of requests that spent over a second in that Redis: https://log.gprd.gitlab.net/goto/7c13d18e8c480d5d46240630fbfef1b9

image

There's an expected drop off over time as the durations increase, except for a large spike at 5 seconds.

If we search for requests where we take between 5 and 5.1 seconds in this Redis instance we see https://log.gprd.gitlab.net/goto/3efb6ff114d0791caf6aa7a8a5d19230:

json.meta.caller_id.keyword: Descending Count
POST /api/:version/internal/allowed 9,095
Repositories::GitHttpController#info_refs 8,578
Repositories::GitHttpController#git_upload_pack 2,340
PATCH /api/:version/jobs/:id/trace 1,089
GET /api/:version/projects/:id/pipelines 993
POST /api/:version/jobs/request 738
GET /api/:version/projects/:id/jobs 578
PUT /api/:version/jobs/:id 544
GraphqlController#execute 472
Other 6,073

This seems to be mostly reflective of (authenticated?) request rate, though.

We do not see these volumes in the slowlog - we could be in trouble if we had Redis commands taking 5 seconds of CPU time in Redis anyway! - so this may be a Ruby side artifact, either in measurement or affecting actual requests.

https://log.gprd.gitlab.net/goto/d7dca52e69873375c6ad37ab0bd0ba6c shows this happens somewhat regularly for people in the gitlab-org/gitlab group. Most of those are for API requests, which likely won't store performance bar data, but if we can grab a browser request's correlation ID quickly enough we might get lucky and be able to add it to the performance bar.

Edited by Sean McGivern