Reduce sweep memory footprint

Hayley Swimelar requested to merge sweep-mem-usage into release/2.8-gitlab

Context

This MR seeks to reduce the memory usage of the sweep stage in very large registries.

While testing the GC performance improvements for GCS against a copy of a very large repository (10TB, with over 100k blobs and 20k manifests eligible for deletion, some of them with thousands of tags), we found that the GC process started consuming large amounts of RAM (>15GB) and was eventually killed with an OOM error while sweeping manifests (thanks for raising this, @hswimelar).

In this particular test scenario, some of the manifests eligible for deletion have hundreds or even thousands of tags that need to be validated, so the total amount of work is M*T, where M is the number of manifests to delete and T is the number of tags to delete per manifest.

For manifests, the GC sweep stage (both before and after the improvements) loops over the list of manifests eligible for deletion and deletes each one, with a nested loop over the tags to delete for that manifest. With the number of files being deleted in this test case, the GC builds a huge list of file paths to delete, which causes memory usage to climb drastically.
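To make the shape of the problem concrete, here is a minimal Go sketch of that loop structure (illustrative only: the types, helper names, and path layout are stand-ins, not the registry's actual code):

```go
package gc

// Manifest is a hypothetical stand-in for the registry's manifest metadata.
type Manifest struct {
	Digest string
	Tags   []string
}

// sweepNaive mirrors the shape of the original sweep: it accumulates the
// path of every file to delete before handing the whole list to the driver,
// so peak memory grows with M (manifests) × T (tags per manifest).
func sweepNaive(manifests []Manifest, deleteFiles func([]string) error) error {
	var paths []string // grows to roughly M*T entries before the first delete
	for _, m := range manifests {
		paths = append(paths, "/manifests/"+m.Digest) // illustrative layout
		for _, tag := range m.Tags {
			paths = append(paths, "/tags/"+tag+"/"+m.Digest)
		}
	}
	// The whole slice stays reachable until this call returns, so none of it
	// can be reclaimed by the Go garbage collector mid-sweep.
	return deleteFiles(paths)
}
```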

Solution

We need to:

  • Batch manifests eligible for deletion, using a safe max batch size that doesn't put the host memory under severe pressure (see the sketch after this list).
  • Profile other parts of the code, as there might be additional performance issues in the upstream implementation.
  • Drop the default max delete concurrency to a lower value, creating a safe margin for really large repositories.
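A minimal sketch of the batching idea, reusing the stand-ins from the sketch above (the batch size and helper names are illustrative, not the actual implementation):

```go
package gc

// sweepBatched processes manifests in fixed-size batches, so the paths built
// for one batch become unreachable before the next batch is assembled.
func sweepBatched(manifests []Manifest, batchSize int, deleteFiles func([]string) error) error {
	for start := 0; start < len(manifests); start += batchSize {
		end := start + batchSize
		if end > len(manifests) {
			end = len(manifests)
		}
		var paths []string // bounded by batchSize × tags-per-manifest
		for _, m := range manifests[start:end] {
			paths = append(paths, "/manifests/"+m.Digest) // illustrative layout
			for _, tag := range m.Tags {
				paths = append(paths, "/tags/"+tag+"/"+m.Digest)
			}
		}
		if err := deleteFiles(paths); err != nil {
			return err
		}
		// paths goes out of scope here, so this batch's allocations become
		// eligible for garbage collection before the next batch starts.
	}
	return nil
}
```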

Profiling

1. As-is (b563cd09)

I did some profiling using a temporary test named DriverSuite.TestRemoveManifestsPathBuildLargeScale (!101 (b563cd09)).

I've run this test using the filesystem storage driver and skipped the underlying delete calls. The test simulates the deletion of 100k manifests with 1k tags each. So that's a total of 100M files to delete in a single garbage collection run...
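Skipping the underlying delete calls amounts to swapping in a no-op stub for the driver's delete path, so the profile isolates the sweep's own CPU and memory cost. A hypothetical sketch (not the actual test or driver API):

```go
package gc

import "sync/atomic"

// noopDriver records how many paths it was asked to delete but never touches
// the filesystem. The name and method signature are illustrative, not the
// real storage driver interface.
type noopDriver struct {
	deleted int64
}

func (d *noopDriver) Delete(paths []string) error {
	atomic.AddInt64(&d.deleted, int64(len(paths)))
	return nil
}
```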

It took 591s to run and here's how the CPU and memory usage looked during execution:

activity_before

So memory usage climbed to 20GB before the run completed successfully. Building the path of every file to delete and holding the full list in memory before passing it to the driver's delete method puts a lot of pressure on RAM for such large repositories.

2. Batching manifest deletes (48895274)

Here I've batched manifests in groups of 100. The default max delete concurrency has also been dropped to 150. We can raise these later, but at this point we should strive for stability, so they will be the starting default values (we'll look at making them configurable later on, see #47 (closed)).
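For reference, a delete-concurrency cap like this can be sketched with a buffered channel acting as a semaphore (illustrative only; the real driver's delete path is structured differently):

```go
package gc

import "sync"

// deleteWithLimit issues deletes with at most maxConcurrency in flight.
func deleteWithLimit(paths []string, maxConcurrency int, deleteOne func(string) error) error {
	sem := make(chan struct{}, maxConcurrency) // e.g. 150, the new default
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		firstErr error
	)
	for _, p := range paths {
		sem <- struct{}{} // blocks while maxConcurrency deletes are in flight
		wg.Add(1)
		go func(path string) {
			defer wg.Done()
			defer func() { <-sem }()
			if err := deleteOne(path); err != nil {
				mu.Lock()
				if firstErr == nil {
					firstErr = err
				}
				mu.Unlock()
			}
		}(p)
	}
	wg.Wait()
	return firstErr
}
```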

This time it took 590s to complete and the memory usage stayed at around 2.5GB:

activity_1kbatch

Batching deletes drastically alleviated memory pressure. It works because, once a batch has been processed, the Go garbage collector can kick in and free the space allocated for that batch's file paths.

However, the execution time was practically the same.

3. Skipping storage.pathFor (92f7ca45)

While investigating what could be causing the high memory usage (with @hswimelar's invaluable help), we found the storage.pathFor method to be surprisingly inefficient for what it does.

This method (unchanged in our fork) is responsible for assembling the path of every file in the registry storage. It works by concatenating a series of path segments and returning the resulting string. The concatenation is done recursively in some situations, and there are a lot of memory allocations going on.

Given that we're talking about a test case where we'll be building 100M file paths, the performance of this function is crucial for the execution time (more so than for memory usage, as its allocations are well bounded and don't seem to pollute the heap).

So I replaced the calls to the pathFor method with a simple string concatenation (there are many ways of joining strings in Go, but the one-line concatenation with + is the fastest for this scenario; see https://gist.github.com/dtjm/c6ebc86abe7515c988ec for a good reference).
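To illustrate the difference, here are three ways of building the same path in Go (the layout shown is illustrative, not the registry's exact on-disk layout). The one-line + concatenation compiles down to a single runtime concatenation with one allocation, whereas fmt.Sprintf has to inspect its arguments at run time and path.Join builds and re-scans intermediate buffers:

```go
package gc

import (
	"fmt"
	"path"
)

// Fastest for this scenario: one allocation, no runtime argument inspection.
func tagPathConcat(repo, tag, digest string) string {
	return "/v2/repositories/" + repo + "/_manifests/tags/" + tag + "/index/" + digest
}

// fmt.Sprintf examines each argument at run time.
func tagPathSprintf(repo, tag, digest string) string {
	return fmt.Sprintf("/v2/repositories/%s/_manifests/tags/%s/index/%s", repo, tag, digest)
}

// path.Join allocates an intermediate buffer and cleans the result.
func tagPathJoin(repo, tag, digest string) string {
	return path.Join("/v2/repositories", repo, "_manifests/tags", tag, "index", digest)
}
```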

With this change, the test took only 153s to execute, which is a drastic reduction. Memory usage remained stable at 2.5GB, as expected:

activity_without_path_for

This shows how inefficient the pathFor function is. It is used almost everywhere in the source code, but unless it's exercised under extreme load (like here), its impact is not that noticeable.

We probably won't refactor this function. It is only used to address the metadata saved on the filesystem, and once we have a metadata database (gitlab#207147 (closed)) we'll no longer need it (except for layer blobs, which can be handled separately).

4. Skipping verbose log calls (82912a22)

I found one more problem. Although the Debug log entries that I removed in !101 (82912a22) were not being executed (the tests were run with the log level set to info), I found the context.GetLoggerWithFields method (unchanged in our fork) to be inefficient as well.

Although the log message is never printed (the Debug method returns immediately because the log level is info), the GetLoggerWithFields wrapper is always evaluated, converting the input arguments before passing them to the underlying logging library. Like pathFor, this is costly under extreme load.
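This is the classic eager-evaluation cost of logging wrappers. A sketch of the pattern using logrus (which the upstream registry's logging context wraps); the function names are hypothetical:

```go
package gc

import "github.com/sirupsen/logrus"

// Even when the Debug line is filtered out by the log level, the arguments
// are fully evaluated at every call site: the fields map is allocated and
// each value boxed before the level check ever runs.
func logManifestDelete(logger *logrus.Logger, repo, digest string, tagCount int) {
	logger.WithFields(logrus.Fields{ // map allocated on every call
		"repository": repo,
		"digest":     digest,
		"tag_count":  tagCount,
	}).Debug("deleting manifest") // no-op at info level, but the work is done
}

// Guarding the call (or removing it outright, as done here) avoids the
// allocations entirely on the hot path.
func logManifestDeleteGuarded(logger *logrus.Logger, repo, digest string, tagCount int) {
	if logger.IsLevelEnabled(logrus.DebugLevel) {
		logManifestDelete(logger, repo, digest, tagCount)
	}
}
```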

Fortunately, we found these log entries to be too verbose, so we would have removed them regardless, but I was not expecting to see such a performance impact.

Without these log entries, the test took only 13.4s to complete. Memory usage remained unchanged:

activity_without_path_for_or_verbose_logs

These log entries are certainly not worth such a performance penalty. Like pathFor, this is used everywhere across the source code. This is something we'll look at refactoring later (#31 (closed)).

Conclusions

All of the improvements described above have been applied. Together they mitigate the concerns found during testing.
