Generating a repository backup is too slow
Testing server-side backups using a GitLab sandbox on the GCP 1k reference architecture shows that generating bundles is much slower than importing by URL:
{"command.count":3,"command.cpu_time_ms":477924,"command.inblock":3374832,"command.majflt":1605,"command.maxrss":919384,"command.minflt":522792,"command.oublock":4156736,"command.real_time_ms":467274,"command.spawn_token_fork_ms":95,"command.spawn_token_wait_ms":4,"command.system_time_ms":39115,"command.user_time_ms":438809,"component":"gitaly.UnaryServerInterceptor","correlation_id":"01H8ZH7JEYTPPPAMJAVP4YMTE0","grpc.code":"OK","grpc.meta.auth_version":"v2","grpc.meta.client_name":"gitlab-sidekiq","grpc.meta.deadline_type":"unknown","grpc.meta.method_operation":"mutator","grpc.meta.method_scope":"repository","grpc.meta.method_type":"unary","grpc.method":"CreateRepositoryFromURL","grpc.request.deadline":"2023-08-29T08:16:19.053","grpc.request.fullMethod":"/gitaly.RepositoryService/CreateRepositoryFromURL","grpc.request.glProjectPath":"root/gitlab","grpc.request.glRepository":"project-1","grpc.request.payload_bytes":176,"grpc.request.repoPath":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git","grpc.request.repoStorage":"default","grpc.response.payload_bytes":0,"grpc.service":"gitaly.RepositoryService","grpc.start_time":"2023-08-29T02:16:19.053","grpc.time_ms":468432.47,"level":"info","msg":"finished unary call with code OK","peer.address":"@","pid":30447,"remote_ip":"10.128.20.2","span.kind":"server","system":"grpc","time":"2023-08-29T02:24:07.517Z","user_id":"1","username":"root"}
{"command.count":3,"command.cpu_time_ms":352434,"command.inblock":2391872,"command.majflt":1040,"command.maxrss":571816,"command.minflt":297003,"command.oublock":15229760,"command.real_time_ms":423036,"command.spawn_token_fork_ms":84,"command.spawn_token_wait_ms":79,"command.system_time_ms":52580,"command.user_time_ms":299854,"component":"gitaly.UnaryServerInterceptor","correlation_id":"01H8ZHE59P972AG7K2ZAT2G1R7","grpc.code":"OK","grpc.meta.auth_version":"v2","grpc.meta.client_name":"gitlab-sidekiq","grpc.meta.deadline_type":"unknown","grpc.meta.method_operation":"mutator","grpc.meta.method_scope":"repository","grpc.meta.method_type":"unary","grpc.method":"CreateRepositoryFromURL","grpc.request.deadline":"2023-08-29T08:19:54.935","grpc.request.fullMethod":"/gitaly.RepositoryService/CreateRepositoryFromURL","grpc.request.glProjectPath":"root/www-gitlab-com","grpc.request.glRepository":"project-2","grpc.request.payload_bytes":192,"grpc.request.repoPath":"@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35.git","grpc.request.repoStorage":"default","grpc.response.payload_bytes":0,"grpc.service":"gitaly.RepositoryService","grpc.start_time":"2023-08-29T02:19:54.936","grpc.time_ms":424723.5,"level":"info","msg":"finished unary call with code OK","peer.address":"@","pid":30447,"remote_ip":"10.128.20.2","span.kind":"server","system":"grpc","time":"2023-08-29T02:26:59.668Z","user_id":"1","username":"root"}
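To make the two import log entries above easier to compare, here is a minimal sketch that pulls the timing fields out of each line. The sample strings are trimmed copies of the entries above (only the fields used are kept), and the `summarize` helper and its output keys are names invented for illustration, not part of Gitaly.

```python
import json

def summarize(line: str) -> dict:
    """Extract RPC duration and CPU-per-wall-clock ratio from one Gitaly log line."""
    entry = json.loads(line)
    return {
        "project": entry["grpc.request.glProjectPath"],
        # Total RPC duration in minutes.
        "rpc_minutes": round(entry["grpc.time_ms"] / 60000, 1),
        # CPU time of spawned commands divided by their wall-clock time.
        "cpu_per_wall": round(
            entry["command.cpu_time_ms"] / entry["command.real_time_ms"], 2
        ),
    }

# Trimmed copies of the two CreateRepositoryFromURL entries (assumption: the
# remaining fields are irrelevant for this comparison).
samples = [
    '{"command.cpu_time_ms":477924,"command.real_time_ms":467274,'
    '"grpc.request.glProjectPath":"root/gitlab","grpc.time_ms":468432.47}',
    '{"command.cpu_time_ms":352434,"command.real_time_ms":423036,'
    '"grpc.request.glProjectPath":"root/www-gitlab-com","grpc.time_ms":424723.5}',
]

for sample in samples:
    print(summarize(sample))
```

Both imports finish in roughly 7 to 8 minutes and keep about one CPU core busy for the whole call.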
The test was backing up two repositories, gitlab (1.8GB) and www-gitlab-com (7.2GB), simultaneously:
{"command":"create","gl_project_path":"root/gitlab","level":"info","msg":"started create","pid":31563,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git","storage_name":"default","time":"2023-08-29T02:30:20.784Z"}
{"command":"create","gl_project_path":"root/www-gitlab-com","level":"info","msg":"started create","pid":31563,"relative_path":"@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35.git","storage_name":"default","time":"2023-08-29T02:30:23.964Z"}
{"command":"create","error":"server-side create: rpc error: code = Unavailable desc = keepalive ping failed to receive ACK within timeout","gl_project_path":"root/gitlab","level":"error","msg":"create failed","pid":31563,"relative_path":"@hashed/6b/86/6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b.git","storage_name":"default","time":"2023-08-29T03:43:20.740Z"}
{"command":"create","error":"server-side create: rpc error: code = Unavailable desc = keepalive ping failed to receive ACK within timeout","gl_project_path":"root/www-gitlab-com","level":"error","msg":"create failed","pid":31563,"relative_path":"@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35.git","storage_name":"default","time":"2023-08-29T03:43:20.740Z"}
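As a rough check of how long each backup ran before it failed, the timestamps from the "started create" and "create failed" entries above can be subtracted. This is only a sketch: the `elapsed_seconds` helper and the `runs` mapping are illustrative names, and the timestamps are copied from the log lines.

```python
from datetime import datetime

# Timestamp layout used in the log lines above (UTC, millisecond precision).
FMT = "%Y-%m-%dT%H:%M:%S.%fZ"

def elapsed_seconds(started: str, finished: str) -> float:
    """Return the number of seconds between two log timestamps."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()

# (started, failed) timestamps per project, copied from the log entries.
runs = {
    "root/gitlab": ("2023-08-29T02:30:20.784Z", "2023-08-29T03:43:20.740Z"),
    "root/www-gitlab-com": ("2023-08-29T02:30:23.964Z", "2023-08-29T03:43:20.740Z"),
}

for project, (started, failed) in runs.items():
    minutes = elapsed_seconds(started, failed) / 60
    print(f"{project}: {minutes:.1f} min before failure")
```

Both backups ran for about 73 minutes and failed at the same instant, compared with 7 to 8 minutes for the imports by URL.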
Even if these backups had succeeded, taking more than an hour is already much longer than a clone (i.e. cloning would be a better backup strategy at this point).
- Generating the bundle on the server from the filesystem for gitlab takes around 5 minutes.
- The server was neither CPU-bound (avg 3%) nor I/O-bound (avg 5.8 MiB/s, 100 reads/s). Network shows no more than baseline. Using git manually uses similar CPU and 3 times the disk throughput.
- Server-side backup does not transfer any bundles through gRPC.
- There's no Praefect in the 1k reference arch.
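For anyone wanting to reproduce the manual measurement, one possible approach is to time `git bundle create --all` directly against the repository on disk. This is a sketch under assumptions: the exact command used in the test above is not stated, and the `time_bundle` helper and the paths passed to it are hypothetical.

```python
import os
import subprocess
import sys
import time

def time_bundle(repo_path: str, bundle_path: str) -> tuple[float, int]:
    """Bundle all refs of repo_path and return (elapsed seconds, bundle size in bytes)."""
    # Use an absolute output path because `git -C` changes the working directory.
    bundle_path = os.path.abspath(bundle_path)
    start = time.monotonic()
    subprocess.run(
        ["git", "-C", repo_path, "bundle", "create", bundle_path, "--all"],
        check=True,
        capture_output=True,
    )
    elapsed = time.monotonic() - start
    return elapsed, os.path.getsize(bundle_path)

if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python time_bundle.py <path-to-repo.git> <output.bundle>
    seconds, size = time_bundle(sys.argv[1], sys.argv[2])
    print(f"bundled {size / 2**20:.1f} MiB in {seconds:.1f} s")
```

Running this on the Gitaly node while watching CPU and disk throughput gives a baseline to compare against the server-side backup.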
Edited by James Fargher