Currently, Geo only supports replicating Docker registries from a primary to a secondary node via object storage. When distributed object storage (e.g. S3) backs the Docker Registry, the primary and secondary Geo nodes can share the same storage, so this approach does not rely on Geo's native replication ability at all.
Docker registries on local storage, however, are not yet supported. We are aiming to provide a storage-agnostic, Geo-native way of setting up a Registry on a secondary Geo node.
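For illustration, a minimal /etc/gitlab/gitlab.rb sketch of the shared object storage approach, assuming an S3 bucket; the bucket name, region, and credentials are placeholders:

```ruby
# /etc/gitlab/gitlab.rb -- the same block on both the primary and the
# secondary node, so both registries read and write the same bucket.
registry['storage'] = {
  's3' => {
    'accesskey' => 'AKIA...',              # placeholder credentials
    'secretkey' => 'secret',
    'bucket'    => 'gitlab-registry-storage',
    'region'    => 'us-east-1'
  }
}
```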
Intended users
Large enterprise customers
Further details
Proposal
Create a storage-agnostic way to set up the Registry on a secondary Geo node. There are two steps here:
A customer mentioned they are looking to use Ceph as an on-premises S3 solution, and that they will likely use the federated replication that Ceph provides. We may still need to synchronize metadata.
So we have two options: synchronization using pull/push, or synchronization at the storage level (rsync, s3sync, or a dedicated replicated object store like Ceph).
Some of these options imply a delay until an image is propagated to the secondary registry, so the regular CI/CD pipeline, where we usually pull an image just after it was pushed, has to be modified in some way, for example by adding a delay before trying to pull the image.
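As a sketch of the kind of modification meant here, a hypothetical helper that polls the secondary registry until the manifest is visible before pulling; the registry host, repository name, and timing values are all assumptions:

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: poll the secondary registry until the manifest for
# repo:tag is visible, after which a `docker pull` should succeed.
# Authentication is omitted for brevity.
def wait_for_manifest(registry_host, repo, tag, timeout: 300)
  uri = URI("https://#{registry_host}/v2/#{repo}/manifests/#{tag}")
  deadline = Time.now + timeout

  until Time.now > deadline
    response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
      request = Net::HTTP::Head.new(uri)
      request['Accept'] = 'application/vnd.docker.distribution.manifest.v2+json'
      http.request(request)
    end

    return true if response.is_a?(Net::HTTPSuccess)

    sleep 10 # propagation delay; tune to the replication mechanism
  end

  false
end
```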
This is proxying. There is no concept of mirroring in the Registry. The Registry works well in a distributed architecture as long as its storage is replicated. So a read-only registry is fully possible as long as we have e.g. s3sync.
@ayufan Right, I just used the wrong term (they sometimes call it "mirror" in the page above). The thing is that if we use rsync we introduce some delay; if we use AWS S3 or any compatible service, we don't. For Ceph it depends on how we set up the replication, AFAIK.
We already cover setups where distributed storage, like S3 or any other, is used. This would require writing a document on how to set up the registry for a secondary node in read-only mode and how S3 should be configured. The Omnibus package already allows us to set up a custom configuration for the Docker registry.
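A minimal sketch of such a configuration, based on the registry's standard maintenance.readonly option; the S3 values are placeholders:

```ruby
# /etc/gitlab/gitlab.rb -- secondary node only.
# The registry serves the replicated data but rejects pushes.
registry['storage'] = {
  's3' => {
    'accesskey' => 'AKIA...',
    'secretkey' => 'secret',
    'bucket'    => 'gitlab-registry-storage',
    'region'    => 'us-east-1'
  },
  'maintenance' => {
    'readonly' => { 'enabled' => true }
  }
}
```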
Implement a system which replicates events from a primary node. This step does not require using a distributed file system.
@jramsay I was not able to work on this much; I mostly learned how the registry works and how it can be configured. I'm working on the first iteration.
@vsizov The notification idea is actually a very good approach. We started evaluating it recently as a next step towards having everything tracked in the database. We could then relay the notifications/events to pull images into the Geo registry.
Using file system replication is painful because it does not guarantee the consistency of highly cross-referenced data when it is updated often. For example, a manifest file contains only text data and can be replicated much faster than the blobs it references.
Implementing replication at the user-space level (doing pull/push) guarantees consistency. This is why replicating registry events from a primary node will be reliable and convenient.
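A sketch of the ordering a user-space replicator can enforce, assuming a hypothetical client object wrapping the Registry API calls noted in the comments; the helper names are illustrative, not real code from this issue:

```ruby
require 'json'

# Replicate one tag in an order that keeps the secondary consistent.
# `client` is assumed to respond to fetch_manifest, blob_exists?,
# copy_blob and put_manifest.
def replicate_tag(client, repo, tag)
  raw = client.fetch_manifest(repo, tag)   # GET /v2/<repo>/manifests/<tag>
  manifest = JSON.parse(raw)

  digests = manifest.fetch('layers', []).map { |layer| layer['digest'] }
  digests << manifest.dig('config', 'digest')

  # 1. Copy every referenced blob first ...
  digests.compact.each do |digest|
    client.copy_blob(repo, digest) unless client.blob_exists?(repo, digest)
  end

  # 2. ... and only then publish the manifest, so readers on the
  # secondary never see a manifest that points at missing layers.
  client.put_manifest(repo, tag, raw)      # PUT /v2/<repo>/manifests/<tag>
end
```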
There is some intersection between this task and gitlab-ce#29639, so I will focus my efforts on transporting data from one registry to another, and then we will integrate it with the CI team's implementation of registry events (or I will implement that myself if it's not ready when I finish my part of the work).
We don't have a bundled Docker daemon in Omnibus, right? So we can't just use the Docker client to pull and push images (via the Docker daemon) to replicate them. I don't really like the idea of using their Registry API directly to download and push images. Any ideas? What if we say you need to install a Docker daemon on the secondary node to be able to pull/push images? It's a lot easier for us to talk to an internal daemon and say "please download this image for us" than to talk directly to an external registry via the API.
Kamil:
I think that we might actually reimplement the registry API client ourselves; it will be easier and less error-prone than a Docker-based approach.
I think I need to describe my status. On Friday, I started using our ContainerRegistry::Client, but then I realized that we don't want the Rails environment to be loaded into a process that copies images from the Docker registry (the GeoCursorDaemon doesn't use it either), so I moved my code to a separate service that doesn't require the Rails environment. The second problem was that none of the HTTP libraries we use support both download and upload streaming at once, and using two different libraries in the same service does not look nice. Today I found that RestClient is capable of doing both, so I rewrote the code I already have with RestClient. BTW, it's already on our dependency list, which is great.
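A minimal sketch of that RestClient approach; the URLs, auth handling, and single-PUT upload are simplifications (the real Registry blob upload is a multi-step protocol):

```ruby
require 'rest-client'

# The blob is spooled to a tempfile on download and streamed from it on
# upload, so it is never held in memory in full.
def copy_blob(source_url, target_url, token)
  raw = RestClient::Request.execute(
    method: :get,
    url: source_url,
    headers: { Authorization: "Bearer #{token}" },
    raw_response: true # stream the body straight to a tempfile
  )

  File.open(raw.file.path, 'rb') do |io|
    RestClient.put(target_url, io, content_type: 'application/octet-stream')
  end
ensure
  raw&.file&.unlink
end
```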
Today I finished the base skeleton of the image transfer, which is capable of transferring images. It does not yet take into account some edge cases, like deprecated manifest versions, and it also needs refactoring. That said, the first part of this issue is close to being finished.
Is there a reason we can't use some form of docker pull to just download the images that way?
We can use it, and that would be the easiest way, but it adds a dependency: we don't have the Docker engine bundled with our Omnibus package. The Docker client only tells the Docker daemon to pull an image from a particular registry, which means we would need to add both the Docker daemon and the Docker client to Omnibus and maintain them. So we just use the Docker Registry API directly, like the Docker daemon does, but we don't store any state; we just copy images and that's it.
What happens if we just did a straight S3 sync between buckets? (e.g. do we miss metadata that is needed)?
The Registry's storage contains two types of objects: manifests and layers. A manifest (metadata) contains references to layers. There are a few types of manifests, but that's not important right now. What Kamil is saying is that big blobs (layers) will be synced more slowly than manifests (text files). In this case, if a Docker registry repository is updated often, we may never reach a consistent state when we use filesystem-level sync.
Just for Geo's needs, we need a simple version of events, not what is described there. But we can't afford two types of event system (it's costly), so we can reuse the event system developed for CI/CD needs. At this time I'm working on the image transfer system, so we can reconsider which event system we use. In other words, right now I'm creating a service like Geo::ContainerRegistryImageReplicator.new('gitlab-org/gitlab-ce', 'latest').transfer_image, which is called on a secondary node to replicate an image.
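A hypothetical skeleton matching that call shape; only the interface is implied by the discussion, the method bodies are assumptions:

```ruby
module Geo
  class ContainerRegistryImageReplicator
    def initialize(repository_path, tag)
      @repository_path = repository_path
      @tag = tag
    end

    # Pull the manifest and blobs from the primary registry, then push
    # them to the secondary's own registry.
    def transfer_image
      manifest = pull_manifest_from_primary
      push_missing_blobs(manifest)
      push_manifest_to_secondary(manifest)
    end

    private

    # Registry API plumbing elided in this sketch.
    def pull_manifest_from_primary
      raise NotImplementedError
    end

    def push_missing_blobs(_manifest)
      raise NotImplementedError
    end

    def push_manifest_to_secondary(_manifest)
      raise NotImplementedError
    end
  end
end
```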
I'm working on the authentication process for the image replicator, and I found two options to do that:
1. When an event is generated on the primary side, we can put a JWT token (pull access only) into the event payload (which will most probably live in a database table). The secondary will then use this token to pull the image from the primary registry. Downside: sensitive information in the database, but with attr_encrypted I don't think that's a problem; moreover, db_secret_key is already synced between all the nodes by default.
2. Since the Omnibus package allows us to use a custom registry secret key (see GitLab Container Registry administration - GitLab Documentation), we can use the same registry secret key on all the nodes. This allows us to generate a single JWT token that is valid for all the nodes. Downsides: a) the Geo setup instructions will become even more complex; b) a distributed secret key broadens the attack surface. Taking into account that the key can be used to generate a token with “push” access, it’s even worse.
@vsizov the secondary should be able to generate Geo-specific JWTs as and when it wants to use them. They're short-lived, so putting them into the database suggests a bad architecture (but may be acceptable). Consider how HTTP repository cloning makes use of JWTs.
Note that we're going to modify the setup instructions to replicate gitlab-secrets.json across all Geo nodes anyway: https://gitlab.com/gitlab-org/gitlab-ee/issues/4253 so that might answer the question for you.
@nick.thomas I think what you are saying is OK when the service provider that consumes the JWT token is under our control, but the Docker registry is a third-party service provider for us. See Token Authentication Implementation. UPDATE: Maybe I didn't understand you correctly, though...
@nick.thomas I just checked how HTTP clone works for Geo: we generate secret keys when the node is created and save them to the database, and because the GeoNode record is available on both sides, we can create and use JWT tokens. Do you propose making a request to some endpoint on the primary to get a valid JWT token for the Registry? Is that what you suggest?
UPDATE: Not sure if this is what you proposed but I like it :)
@vsizov my point was that since both the primary and the secondary will be using the same registry: http_secret once #4253 (closed) is completed, the secondary should be able to generate JWTs that the primary will accept.
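A sketch of what that could look like, assuming the shared key material from #4253 and the claim layout of the Docker token authentication spec; the issuer/audience values are placeholders and the kid header (derived from the certificate) is omitted for brevity:

```ruby
require 'jwt'
require 'openssl'

# Mint a short-lived, pull-only registry token on the secondary. Because
# both registries trust the same key pair, the primary's registry will
# accept it.
def registry_pull_token(key_path, repository)
  key = OpenSSL::PKey::RSA.new(File.read(key_path))

  payload = {
    iss: 'gitlab-issuer',            # must match the registry's auth config
    aud: 'container_registry',
    sub: 'geo-secondary',
    exp: Time.now.to_i + 5 * 60,     # short-lived, per the comment above
    access: [
      { type: 'repository', name: repository, actions: ['pull'] }
    ]
  }

  JWT.encode(payload, key, 'RS256')
end
```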
OK, so by 10.4 we can rely on a single certificate for both registries; therefore, we can use the same secret key to produce a universal JWT token valid for both registries. This makes things a lot easier. Today I'm going to start implementing the event system, the second and last phase of this issue. I will discuss it with Kamil, as the CI team won't work on this right now, but we need to find an optimal solution that meets the requirements for all of us.
1. An Omnibus MR that adds a .container_registry_secret file, similar to .gitlab_shell_secret. This secret should be used to authenticate notification calls from the registry to the GitLab API. The MR should also take care of setting up notifications with the generated token.
2. The same for our GDK.
3. Create an API endpoint in the GitLab app to process notifications. It has been created already, but we had a call with @ayufan and he pointed out that we need a more advanced data structure with an explicit tag version (see the sketch after this list).
4. Implement the initial data backfill from the primary registry for existing systems. Ideally, it should look like our static assets sync.
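A sketch of what that notification processing could look like; the envelope format follows the Docker registry notification documentation, while record_tag_push is a hypothetical persistence helper, not an existing method:

```ruby
require 'json'

# Handle one notification POST from the registry. Each envelope carries
# a list of events; only manifest pushes carry a tag.
def process_registry_notification(body)
  events = JSON.parse(body).fetch('events', [])

  events.each do |event|
    next unless event['action'] == 'push'

    target = event['target']
    next unless target['tag'] # plain blob uploads have no tag

    # Hypothetical persistence helper storing one row per tag version.
    record_tag_push(
      repository: target['repository'],
      tag:        target['tag'],
      digest:     target['digest'] # the explicit tag version @ayufan asked for
    )
  end
end
```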
The thing is that currently the Docker Registry API does not provide any endpoints to track tag versions, so if a tag has pointed at several digests, we don't know that. By listening to registry events, we can create records in a database, which gives us a number of benefits: quick access to actual registry statistics, the ability to replicate images to secondary nodes as-is, and some progress information.
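A hypothetical migration for those records; the table and column names are assumptions, not a shipped schema:

```ruby
# One row per (repository, tag, digest), so we keep every digest a tag
# has ever pointed at, which the Registry API alone cannot tell us.
class CreateContainerRepositoryTagVersions < ActiveRecord::Migration[5.0]
  def change
    create_table :container_repository_tag_versions do |t|
      t.string :repository_path, null: false
      t.string :tag, null: false
      t.string :digest, null: false
      t.datetime :pushed_at, null: false
    end

    add_index :container_repository_tag_versions,
              [:repository_path, :tag, :digest],
              unique: true,
              name: 'idx_tag_versions_on_repo_tag_digest'
  end
end
```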
The second thing we concluded is that we cannot reliably enough validate the consistency of a Geo secondary's registry against the primary's... I don't think it should stop us, but this problem exists.