# Optimize the garbage collector sweep phase for GCS
This is related to gitlab#201963 (closed).
The garbage collection sweep phase has two main tasks: blob deletion and manifest deletion.
Here we leverage the work done in !24 (merged), where we not only improved performance for S3 but also laid a solid foundation for improvements in other storage drivers.
Due to the above, for GCS we only need to change a single driver function: `DeleteFiles`. The existing driver-agnostic `DriverSuite.TestDeleteFiles*` tests (also from !24 (merged)) are reused.
## Blobs

### Context
Blobs are stored in a single file named `data`, and its full path looks like the following (hex digest truncated):

```
docker/registry/v2/blobs/sha256/ff/fff95...2d823/data
```
You can read more about the path specification here.
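To make the layout concrete, here is a minimal, illustrative helper that builds a blob's `data` path following the scheme above (a sketch for this write-up, not the registry's actual path-building code; the function name is hypothetical):

```go
package main

import "fmt"

// blobDataPath builds the storage path for a blob's data file. Per the
// path spec, the intermediate directory is the first two characters of
// the hex digest. Illustrative helper, not the registry implementation.
func blobDataPath(algorithm, hexDigest string) string {
	return fmt.Sprintf("docker/registry/v2/blobs/%s/%s/%s/data",
		algorithm, hexDigest[:2], hexDigest)
}

func main() {
	// Truncated digest used purely for illustration.
	fmt.Println(blobDataPath("sha256", "fff95abc"))
}
```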
In the current implementation, the `storage.driver.gcs.Delete` method is invoked for each blob to delete. In this method, a List request is sent to GCS to list the contents of a given path, and then a Delete request is sent to delete each object under that path. `storage.driver.gcs.Delete` is a generic method designed to support both direct and recursive deletion, which is why it always lists the objects in a given path before actually deleting them.
Considering the above, for each blob to delete, the garbage collector always sends two sequential HTTP requests to GCS: a List followed by a Delete. That's `2N` network requests, where `N` is the number of blobs to delete.

The worst-case scenario can, therefore, be described as `2N` requests for `N` blobs to delete.
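The request pattern can be sketched with an in-memory stand-in for GCS that counts API calls (an illustrative sketch only; `fakeGCS` and `genericDelete` are hypothetical names, not the driver's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// fakeGCS stands in for a GCS bucket; requests counts every API call.
type fakeGCS struct {
	objects  map[string]bool
	requests int
}

// list returns all object names under a prefix (one List request).
func (g *fakeGCS) list(prefix string) []string {
	g.requests++
	var names []string
	for name := range g.objects {
		if strings.HasPrefix(name, prefix) {
			names = append(names, name)
		}
	}
	return names
}

// deleteObject removes a single object (one Delete request).
func (g *fakeGCS) deleteObject(name string) {
	g.requests++
	delete(g.objects, name)
}

// genericDelete mirrors the generic list-then-delete approach: even
// though a blob is exactly one data file, it still costs two requests.
func genericDelete(g *fakeGCS, path string) {
	for _, name := range g.list(path) {
		g.deleteObject(name)
	}
}

func main() {
	g := &fakeGCS{objects: map[string]bool{
		"docker/registry/v2/blobs/sha256/ff/aaa/data": true,
		"docker/registry/v2/blobs/sha256/ff/bbb/data": true,
		"docker/registry/v2/blobs/sha256/ff/ccc/data": true,
	}}
	for _, p := range []string{
		"docker/registry/v2/blobs/sha256/ff/aaa",
		"docker/registry/v2/blobs/sha256/ff/bbb",
		"docker/registry/v2/blobs/sha256/ff/ccc",
	} {
		genericDelete(g, p)
	}
	fmt.Printf("deleted N=3 blobs with %d requests\n", g.requests) // 2N = 6
}
```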
### Proposal
Blobs are stored in a single file named `data`, so there is no need to list the contents of their path before deleting them. We can remove the `data` files directly, cutting network requests in half, from `2N` to `N`.
Instead of deleting blobs sequentially, we use goroutines to spawn concurrent Delete requests. To avoid an unbounded number of goroutines (which can lead to high resource usage or to being rate limited by GCS), we limit the number of concurrent requests.

From the GCS documentation: "There is no limit to writes across multiple objects, which include uploading, updating, and deleting objects. Buckets initially support roughly 1000 writes per second and then scale as needed" (source). Considering this, we limit the number of active concurrent Delete requests to 1000. We have tested this limit by deleting 10000 objects with zero issues.

The GCS storage driver currently relies on a custom regulator to enforce a concurrency limit across all requests, but it has some flaws (#33). Considering this, here we decided to bypass the regulator and use a local semaphore in the `DeleteFiles` method.
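The bounded-concurrency pattern can be sketched as follows. This is a minimal standalone sketch, not the MR's actual `DeleteFiles` implementation: `deleteFilesConcurrently` and its delete callback are hypothetical names, and the semaphore is a plain buffered channel:

```go
package main

import (
	"fmt"
	"sync"
)

// deleteFilesConcurrently issues one delete per path, with at most
// maxConcurrency deletes in flight, using a buffered channel as a
// local semaphore. It returns the number of successful deletions and
// any errors collected along the way.
func deleteFilesConcurrently(paths []string, maxConcurrency int, deleteFn func(string) error) (int, []error) {
	sem := make(chan struct{}, maxConcurrency)
	var wg sync.WaitGroup
	var mu sync.Mutex
	deleted := 0
	var errs []error

	for _, p := range paths {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot; blocks once the limit is reached
		go func(path string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			if err := deleteFn(path); err != nil {
				mu.Lock()
				errs = append(errs, err)
				mu.Unlock()
				return
			}
			mu.Lock()
			deleted++
			mu.Unlock()
		}(p)
	}
	wg.Wait()
	return deleted, errs
}

func main() {
	// Stand-in for a GCS Delete request: just record the path.
	var mu sync.Mutex
	removed := map[string]bool{}
	deleteFn := func(path string) error {
		mu.Lock()
		defer mu.Unlock()
		removed[path] = true
		return nil
	}

	paths := make([]string, 100)
	for i := range paths {
		paths[i] = fmt.Sprintf("blobs/sha256/%02x/data", i)
	}
	// The MR caps concurrency at 1000; 8 keeps this example small.
	n, errs := deleteFilesConcurrently(paths, 8, deleteFn)
	fmt.Printf("deleted %d paths, %d errors\n", n, len(errs))
}
```

A buffered channel was chosen over the driver's shared regulator precisely because the limit here is local to one method call.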
### Benchmarks
We ran a series of benchmarks to validate the expected improvements.
#### Methodology
- We tested the deletion of 1, 10, 100, and 1000 blobs;
- We compared `v2.8.1-gitlab` against this MR;
- For each test we recorded the elapsed time (ns/op);
- For each blob to delete, the benchmark creates and uploads a random image layer (~20MB) to GCS.
#### Setup
Benchmarks ran against a Standard GCS bucket in the `europe-west1` region. The median latency between the development machine and the GCP region was 42ms.
Benchmarks were run using the following command:
```shell
go test -v -tags=include_gcs github.com/docker/distribution/registry/storage/driver/gcs -args -check.b -check.f "DriverSuite.BenchmarkRemoveBlobs*" -gocheck.btime 1ns
```
#### Results
| # of blobs (`N`) | v2.8.1 (ns/op) | This MR (ns/op) | Improvement |
|---|---|---|---|
| 1 | 474220379 | 339191277 | 28% |
| 10 | 2011813669 | 946950968 | 53% |
| 100 | 18148727744 | 7152357503 | 61% |
| 1000 | 193753453367 | 64053449582 | 67% |
These results show that, for our test scenario, the run time can be up to 67% faster. The improvement should continue to increase as `N` grows.
## Manifests

### Context
Manifest revisions and tag references are stored in a single file. While for blobs this file is named `data`, for manifest revisions and tag references it's called `link`.

Sample manifest revision full path (hex digest truncated):

```
docker/registry/v2/repositories/myorg/myproj/myimg/_manifests/revisions/sha256/82d...057/link
```

Sample manifest tag reference full path:

```
docker/registry/v2/repositories/myorg/myproj/myimg/_manifests/tags/index/sha256/82d...057/link
```
You can read more about the path specification here.
As with blobs, in the current implementation the `storage.driver.gcs.Delete` method is invoked for each manifest to delete. An unnecessary List request is sent to GCS, followed by a Delete request, for each manifest revision and tag reference to delete. That's `2MT` network requests, where `M` is the number of manifests to delete and `T` is the number of tag references per manifest.

The worst-case scenario can, therefore, be described as `2MT` requests.
### Proposal
The improvements proposed for blob deletion also apply to manifests, as they rely on the same `DeleteFiles` method, which we're changing with this MR. This allows us to go from `2MT` to `MT` requests.
### Benchmarks
We ran a series of benchmarks to validate the expected improvements.
#### Methodology
The following test matrix was used, allowing us to simulate the deletion of `M` manifests with `T` tag references each:

| # manifests (`M`) | # tags per manifest (`T`) |
|---|---|
| 1 | 0 |
| 1 | 1 |
| 10 | 0 |
| 10 | 1 |
| 100 | 0 |
| 100 | 1 |
| 100 | 20 |
- We compared `v2.8.1-gitlab` against this MR;
- For each test we recorded the elapsed time (ns/op);
- For each manifest, the benchmark creates and uploads a random test image (~40MB) to GCS.
#### Setup
The setup is the same as the one used for blobs. Benchmarks were run using the following command:

```shell
go test -v -tags=include_gcs github.com/docker/distribution/registry/storage/driver/gcs -args -check.b -check.f "DriverSuite.BenchmarkRemoveManifests*" -gocheck.btime 1ns
```
#### Results
| # manifests (`M`) | # tags per manifest (`T`) | v2.8.1 (ns/op) | This MR (ns/op) | Improvement |
|---|---|---|---|---|
| 1 | 0 | 775504398 | 609772013 | 21% |
| 1 | 1 | 1039227908 | 783854142 | 25% |
| 10 | 0 | 5289274326 | 4087819040 | 23% |
| 10 | 1 | 7661681882 | 4516032792 | 41% |
| 100 | 0 | 64343605206 | 35716427792 | 44% |
| 100 | 1 | 89696142906 | 45270206885 | 50% |
| 100 | 20 | 437739607411 | 162902443774 | 63% |
These results show that, for our test scenario, the run time can be up to 63% faster. The improvement increases along with `M` and `T`.
## Overall Performance Improvement
To test the overall performance improvement, we created a sample repository, in the same way we did for S3 (please see https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/9033#note_285751258). We then ran garbage collection, comparing `v2.8.1-gitlab` against this MR.
### Results
- Logs from `v2.8.1-gitlab`: before.log
- Logs from this MR: after.log
Both runs started with the same object count: 1079 objects, 6.771 GBytes in size, and ended with the same object count as well: 688 objects, 2.528 GBytes in size.
The sweep stage took 49s for `v2.8.1-gitlab` and only 0.62s for this MR, with 351 blobs and 20 manifests (including one tag per manifest) successfully deleted. This means it's now 79 times faster, which translates to a performance improvement of 98.73%.
## Conclusions
Unfortunately, we could not rely on bulk delete requests for GCS as we did for S3 (see gitlab#201963 (comment 285361834)), but the achieved performance improvements are still significant. These were only possible thanks to the foundations created in !24 (merged).