Estimate database rate requirements
Context
This is a followup from gitlab#207147 (comment 310427052).
As we are working on switching the Container Registry API metadata write and read operations from the filesystem into a PostgreSQL database, we need to estimate the rate requirements. Results should be reviewed and discussed with the Database team.
Current API Request Rate
The average request rate of the Container Registry HTTP API will serve as a baseline to extrapolate the rate requirements for the upcoming metadata database. Once determined how many requests over a given time frame the API is serving, we can extrapolate how many SQL operations would those requests require.
Methodology
-
Per-second rate metrics collected twice (averaged) at 2020/06/19 18:00 UTC and 2020/06/25 09:00 UTC from Prometheus, looking at the previous 48 hours (business days only, for a worst-case estimate);
-
Per-minute, hourly, daily, monthly and yearly rates calculated by applying a multiplication factor to the per-second rate;
-
Metrics grouped by API handler (e.g. manifests, tags, etc.) and HTTP method;
-
Handler/method pairs that do not imply any database operations were ignored;
-
Rates rounded to the nearest integer, unless less than zero, in which case two decimal places were used;
-
Requests that ended with a
500
status code (Internal Server Error) were ignored. These are negligible (~0.03% of the eligible requests), and we have no way to know if they would imply any database operations (we do not know at which point they failed and why); -
Requests that ended with a
401
status code (Unauthorized) were ignored (~16% of the eligible requests), as they are promptly rejected and do not require any database operations; -
Requests that ended with a
404
status code (Not Found) were accounted for as successful requests. Similar to500
, we have no way to determine how many database operations would be required for these, but they are not negligible (~2% of the eligible requests). This is a worst-case estimate. -
PromQL query:
sort_desc( sum by(handler, method)( rate( registry_http_requests_total{ cluster="gprd-gitlab-gke", namespace="gitlab", code!="500", code!="401" }[48h] )))
Results
Handler | Method | Description | Per-second | Per-minute | Hourly | Daily | Weekly | Monthly | Yearly |
---|---|---|---|---|---|---|---|---|---|
Manifest | GET |
Download manifest. | 141 | 8,469 | 508,112 | 12,194,681 | 85,362,768 | 2,560,883,042 | 30,730,596,502 |
Blob | GET |
Download blob. | 74 | 4,411 | 264,664 | 6,351,934 | 44,463,536 | 1,333,906,094 | 16,006,873,132 |
Tags | GET |
List tags. | 52 | 3,133 | 187,982 | 4,511,557 | 31,580,898 | 947,426,931 | 11,369,123,175 |
Blob | HEAD |
Check existence of blob. | 21 | 1,282 | 76,914 | 1,845,927 | 12,921,488 | 387,644,649 | 4,651,735,785 |
Manifest | HEAD |
Check existence of manifest. | 8 | 473 | 28,404 | 681,687 | 4,771,808 | 143,154,255 | 1,717,851,056 |
Blob Upload | PUT |
Complete blob upload. | 5 | 295 | 17,712 | 425,091 | 2,975,638 | 89,269,140 | 1,071,229,676 |
Manifest | PUT |
Upload manifest. | 2 | 102 | 6,118 | 146,824 | 1,027,767 | 30,832,997 | 369,995,959 |
Manifest | DELETE |
Delete manifest. | 0.03 | 2 | 114 | 2,741 | 19,184 | 575,521 | 6,906,256 |
Tag | DELETE |
Delete tag. | 0.03 | 2 | 113 | 2,705 | 18,934 | 568,019 | 6,816,234 |
Total | 303 | 18,169 | 1,090,131 | 26,163,146 | 183,142,022 | 5,494,260,648 | 65,931,127,774 |
Database Operations per API Request
To estimate the database operation rate, first, we need to determine the type (SELECT
, INSERT
, UPDATE
, DELETE
) and the number of database operations required to fulfil each API handler/method request.
Methodology
The worst-case scenario for each handler/method pair was assumed; in other words, it is assumed that we would need to perform the highest possible number of database operations for each request:
- Even though some write requests may target one or more existing resources, in which case an
INSERT
to create those would not be needed, we assume that we will always need to create them; - Even though some read requests may target a non-existing resource, in which case the operation would not need to perform all possible database operations, we assume that all resources exist.
Results
Handler | Method | SELECT |
INSERT |
UPDATE |
DELETE |
Execution Flow |
---|---|---|---|---|---|---|
Manifest | GET |
3 | 0 | 0 | 0 | 1. Find repository (1 SELECT );2. Find tag in repository (1 SELECT );3. Find manifest associated with tag (1 SELECT ). |
Blob | GET |
2 | 0 | 0 | 0 | 1. Find repository (1 SELECT );2. Check if blob is linked in repository (1 SELECT ); |
Tags | GET |
2 | 0 | 0 | 0 | 1. Find repository (1 SELECT );2. Find repository tags (1 SELECT ). |
Blob | HEAD |
2 | 0 | 0 | 0 | 1. Find repository (1 SELECT );2. Check if blob is linked in repository (1 SELECT ); |
Manifest | HEAD |
2 | 0 | 0 | 0 | 1. Find repository (1 SELECT );2. Check if manifest is linked in repository (1 SELECT ); |
Blob Upload | PUT |
1 | 3 | 0 | 0 | 1. Create repository. Assuming it already exists (worst-case), the insert fails, and we need to find it (1 INSERT , 1 SELECT );2. Create blob (1 INSERT );3. Link blob in repository (1 INSERT ). |
Manifest | PUT |
1 | 4 | 0 | 0 | 1. Create repository. Assuming it already exists (worst-case), the insert fails, and we need to find it (1 INSERT , 1 SELECT );2. Create blob (1 INSERT );3. Link blob in repository (1 INSERT ).4. Tag manifest (1 INSERT ). |
Catalog | GET |
1 | 0 | 0 | 0 | 1. Find repositories (1 SELECT ). |
Manifest | DELETE |
1 | 0 | 0 | 1 | 1. Find target repository: 1 SELECT; 2. Unlink manifest from repository: 1 DELETE. |
Tag | DELETE |
1 | 0 | 0 | 1 | 1. Find repository (1 SELECT );2. Untag manifest (1 DELETE ). |
Estimated Database Rate
After collecting the current production API request rate and estimating the required database operations per request, we can finally estimate the database operation rate requirements.
Methodology
The API request rate was multiplied by the estimated number of database operations for each handler/method pair, across all ranges.
Results
Detailed Breakdown
Per-second
Handler | Method | SELECT |
INSERT |
UPDATE |
DELETE |
---|---|---|---|---|---|
Manifest | GET |
423 | 0 | 0 | 0 |
Blob | GET |
147 | 0 | 0 | 0 |
Tags | GET |
104 | 0 | 0 | 0 |
Blob | HEAD |
43 | 0 | 0 | 0 |
Manifest | HEAD |
16 | 0 | 0 | 0 |
Blob Upload | PUT |
5 | 15 | 0 | 0 |
Manifest | PUT |
2 | 7 | 0 | 0 |
Catalog | GET |
0 | 0 | 0 | 0 |
Manifest | DELETE |
0 | 0 | 0 | 0 |
Tag | DELETE |
0 | 0 | 0 | 0 |
Total | 740 | 22 | 0 | 0 |
Per-minute
Handler | Method | SELECT |
INSERT |
UPDATE |
DELETE |
---|---|---|---|---|---|
Manifest | GET |
25,406 | 0 | 0 | 0 |
Blob | GET |
8,822 | 0 | 0 | 0 |
Tags | GET |
6,266 | 0 | 0 | 0 |
Blob | HEAD |
2,564 | 0 | 0 | 0 |
Manifest | HEAD |
947 | 0 | 0 | 0 |
Blob Upload | PUT |
295 | 886 | 0 | 0 |
Manifest | PUT |
102 | 408 | 0 | 0 |
Catalog | GET |
0 | 0 | 0 | 0 |
Manifest | DELETE |
2 | 0 | 0 | 2 |
Tag | DELETE |
2 | 0 | 0 | 2 |
Total | 44,405 | 1,293 | 0 | 4 |
Hourly
Handler | Method | SELECT |
INSERT |
UPDATE |
DELETE |
---|---|---|---|---|---|
Manifest | GET |
1,524,335 | 0 | 0 | 0 |
Blob | GET |
529,328 | 0 | 0 | 0 |
Tags | GET |
375,963 | 0 | 0 | 0 |
Blob | HEAD |
153,827 | 0 | 0 | 0 |
Manifest | HEAD |
56,807 | 0 | 0 | 0 |
Blob Upload | PUT |
17,712 | 53,136 | 0 | 0 |
Manifest | PUT |
6,118 | 24,471 | 0 | 0 |
Catalog | GET |
0 | 0 | 0 | 0 |
Manifest | DELETE |
114 | 0 | 0 | 114 |
Tag | DELETE |
113 | 0 | 0 | 113 |
Total | 2,664,317 | 77,607 | 0 | 227 |
Daily
Handler | Method | SELECT |
INSERT |
UPDATE |
DELETE |
---|---|---|---|---|---|
Manifest | GET |
36,584,043 | 0 | 0 | 0 |
Blob | GET |
12,703,868 | 0 | 0 | 0 |
Tags | GET |
9,023,114 | 0 | 0 | 0 |
Blob | HEAD |
3,691,854 | 0 | 0 | 0 |
Manifest | HEAD |
1,363,374 | 0 | 0 | 0 |
Blob Upload | PUT |
425,091 | 1,275,273 | 0 | 0 |
Manifest | PUT |
146,824 | 587,295 | 0 | 0 |
Catalog | GET |
0 | 0 | 0 | 0 |
Manifest | DELETE |
2,741 | 0 | 0 | 2,741 |
Tag | DELETE |
2,705 | 0 | 0 | 2,705 |
Total | 63,943,613 | 1,862,569 | 0 | 5,445 |
Weekly
Handler | Method | SELECT |
INSERT |
UPDATE |
DELETE |
---|---|---|---|---|---|
Manifest | GET |
256,088,304 | 0 | 0 | 0 |
Blob | GET |
88,927,073 | 0 | 0 | 0 |
Tags | GET |
63,161,795 | 0 | 0 | 0 |
Blob | HEAD |
25,842,977 | 0 | 0 | 0 |
Manifest | HEAD |
9,543,617 | 0 | 0 | 0 |
Blob Upload | PUT |
2,975,638 | 8,926,914 | 0 | 0 |
Manifest | PUT |
1,027,767 | 4,111,066 | 0 | 0 |
Catalog | GET |
0 | 0 | 0 | 0 |
Manifest | DELETE |
19,184 | 0 | 0 | 19,184 |
Tag | DELETE |
18,934 | 0 | 0 | 18,934 |
Total | 447,605,289 | 13,037,980 | 0 | 38,118 |
Monthly
Handler | Method | SELECT |
INSERT |
UPDATE |
DELETE |
---|---|---|---|---|---|
Manifest | GET |
7,682,649,125 | 0 | 0 | 0 |
Blob | GET |
2,667,812,189 | 0 | 0 | 0 |
Tags | GET |
1,894,853,862 | 0 | 0 | 0 |
Blob | HEAD |
775,289,298 | 0 | 0 | 0 |
Manifest | HEAD |
286,308,509 | 0 | 0 | 0 |
Blob Upload | PUT |
89,269,140 | 267,807,419 | 0 | 0 |
Manifest | PUT |
30,832,997 | 123,331,986 | 0 | 0 |
Catalog | GET |
0 | 0 | 0 | 0 |
Manifest | DELETE |
575,521 | 0 | 0 | 575,521 |
Tag | DELETE |
568,019 | 0 | 0 | 568,019 |
Total | 13,428,158,661 | 391,139,405 | 0 | 1,143,541 |
Yearly
Handler | Method | SELECT |
INSERT |
UPDATE |
DELETE |
---|---|---|---|---|---|
Manifest | GET |
92,191,789,505 | 0 | 0 | 0 |
Blob | GET |
32,013,746,263 | 0 | 0 | 0 |
Tags | GET |
22,738,246,350 | 0 | 0 | 0 |
Blob | HEAD |
9,303,471,570 | 0 | 0 | 0 |
Manifest | HEAD |
3,435,702,113 | 0 | 0 | 0 |
Blob Upload | PUT |
1,071,229,676 | 3,213,689,027 | 0 | 0 |
Manifest | PUT |
369,995,959 | 1,479,983,836 | 0 | 0 |
Catalog | GET |
0 | 0 | 0 | 0 |
Manifest | DELETE |
6,906,256 | 0 | 0 | 6,906,256 |
Tag | DELETE |
6,816,234 | 0 | 0 | 6,816,234 |
Total | 161,137,903,926 | 4,693,672,863 | 0 | 13,722,490 |
Aggregated Total
Period | SELECT |
INSERT |
UPDATE |
DELETE |
Total |
---|---|---|---|---|---|
Per-second | 740 | 22 | 0 | 0 | 762 |
Per-minute | 44,405 | 1,293 | 0 | 4 | 45,703 |
Hourly | 2,664,317 | 77,607 | 0 | 227 | 2,742,151 |
Daily | 63,943,613 | 1,862,569 | 0 | 5,445 | 65,811,627 |
Weekly | 447,605,289 | 13,037,980 | 0 | 38,118 | 460,681,387 |
Monthly | 13,428,158,661 | 391,139,405 | 0 | 1,143,541 | 13,820,441,607 |
Yearly | 161,137,903,926 | 4,693,672,863 | 0 | 13,722,490 | 165,845,299,279 |
Notes
-
We only accounted for API requests in this estimate. Apart from the API, the garbage collector will also need to talk with the database. Although it's difficult to quantify the rate arising from that interaction at this point, it should be negligible when compared with the API requests. Given this is a relatively pessimistic worst-case estimate, it should be more than enough to account for the requirements of any other background job.
-
The writes workload is very light when compared to reads. We thought we could be missing something here, but results were consistent across runs, and it matches what we can see in the Grafana dashboard. This may show how much users rely on the registry to store base images that are heavily reused (read) across CI runs.
-
Although the
DELETE
load is low at the moment, we expect this to increase substantially once the tag expiration policy feature becomes available for all existing projects on GitLab.com (gitlab#196124 (closed)).