
WIP: Geo self-service (Po)Concept

Toon Claes requested to merge tc-geo-poc into master

This is my concept (not much proof yet) of how I think Geo self-service could work.

Notes

An event comes in carrying a blob of JSON data. I'm using JSON because it's easier to extend in the future, and because events are processed one by one, deserialization is not costly. Using JSON should also make it easier to switch to any other event or pub/sub system in the future.
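To make this concrete, here is a rough sketch of what such an event could look like and how it round-trips through JSON. The exact keys and values are assumptions for illustration only; the MR does not fix the payload format.

```ruby
require 'json'

# Hypothetical event payload; key names and values are illustrative assumptions.
payload = {
  'workable_type'  => 'git_repository',
  'workable_event' => 'updated',
  'model_name'     => 'Project',
  'model_id'       => 42,
  'handle'         => 'repository'
}

raw   = JSON.generate(payload) # what the primary would write to the event log
event = JSON.parse(raw)        # what a secondary would read back

# Adding a key later does not break consumers that only read the keys they know.
extended = JSON.parse(JSON.generate(payload.merge('extra' => 'ignored by old readers')))
```

Because each event is parsed on its own, the cost is one small `JSON.parse` call per event.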

Initially I wanted to get rid of the log cursor, but so far I haven't found a viable alternative, so I'm keeping it for now. We should still be able to swap it out later and pull it outside the scope of Geo self-service.

When an event comes in, a worker is scheduled, depending on the workable_type and workable_event. That worker triggers a service, as usual.
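A minimal sketch of that dispatch step, assuming a simple lookup table keyed on `workable_type` and `workable_event`. The worker class names, the `dispatch` helper and the `WORKERS` table are hypothetical, and the workers here are plain Ruby stand-ins rather than real background jobs.

```ruby
require 'json'

# Hypothetical stand-ins for background workers; real ones would enqueue a job
# that in turn calls a service.
class RepositoryUpdatedWorker
  def self.perform(event)
    "sync #{event['model_name']}/#{event['model_id']} (#{event['handle']})"
  end
end

class RepositoryDeletedWorker
  def self.perform(event)
    "remove #{event['model_name']}/#{event['model_id']} (#{event['handle']})"
  end
end

# Lookup table keyed on [workable_type, workable_event].
WORKERS = {
  %w[git_repository updated] => RepositoryUpdatedWorker,
  %w[git_repository deleted] => RepositoryDeletedWorker
}.freeze

def dispatch(raw_event)
  event  = JSON.parse(raw_event)
  key    = [event['workable_type'], event['workable_event']]
  worker = WORKERS.fetch(key) { raise ArgumentError, "no worker for #{key.inspect}" }
  worker.perform(event)
end
```

Supporting a new kind of replicable would then only mean adding an entry to the table.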

If a service needs to replicate a git repo, there are basically 2 things you need to know:

  • path
  • shard

To find this info, some glue needs to be written.

Using the Geo::Replicable concern, developers can define Blueprints (not too fond of the name at this point, but whatever) for their models. To support multiple git repos in a single model, each Blueprint gets assigned a handle. So far a Blueprint only sets a Trackable. This can be the default Geo::Trackable::GitRepository, or a custom one that needs to be written for legacy or deviating models (e.g. for the project repo and wiki repo).
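Roughly, the Blueprint idea could look like the sketch below. Only `Geo::Replicable` and `Geo::Trackable::GitRepository` are names from this MR; the `geo_blueprint` DSL, `geo_trackable`, `LegacyWikiTrackable` and `ExampleModel` are hypothetical, and a plain `included` hook stands in for a Rails concern so the snippet runs on its own.

```ruby
module Geo
  module Trackable
    # Default trackable: delegates to the model's own path and shard methods.
    class GitRepository
      attr_reader :model, :handle

      def initialize(model, handle)
        @model  = model
        @handle = handle
      end

      def repository_path
        model.repository_path
      end

      def repository_shard
        model.repository_shard
      end
    end
  end

  module Replicable
    Blueprint = Struct.new(:handle, :trackable_class)

    def self.included(base)
      base.extend(ClassMethods)
    end

    module ClassMethods
      def geo_blueprints
        @geo_blueprints ||= {}
      end

      # Hypothetical DSL: register a Blueprint under a handle.
      def geo_blueprint(handle, trackable: Geo::Trackable::GitRepository)
        geo_blueprints[handle] = Blueprint.new(handle, trackable)
      end
    end

    # Build the Trackable that the Blueprint for this handle points at.
    def geo_trackable(handle)
      self.class.geo_blueprints.fetch(handle).trackable_class.new(self, handle)
    end
  end
end

# Hypothetical custom trackable for a deviating repo layout.
class LegacyWikiTrackable < Geo::Trackable::GitRepository
  def repository_path
    "#{model.repository_path}.wiki"
  end
end

# Hypothetical model with two git repos, told apart by handle.
class ExampleModel
  include Geo::Replicable

  geo_blueprint :repository
  geo_blueprint :wiki, trackable: LegacyWikiTrackable

  attr_reader :id

  def initialize(id)
    @id = id
  end

  def repository_path
    "examples/#{id}"
  end

  def repository_shard
    'default'
  end
end
```

With this in place, `ExampleModel.new(7).geo_trackable(:wiki)` would return a `LegacyWikiTrackable`, while the `:repository` handle falls back to the default trackable.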

In the JSON event there should be enough info to instantiate a Trackable. This Trackable should allow the service to read/write everything it needs:

  • repository_path
  • repository_shard
  • Set registry state, etc.

One exception: for a deleted event, we include some extra attributes in the event itself, because the original db model might be deleted before the secondary gets to handle the event.
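A sketch of how a Trackable could deal with that exception: prefer attributes embedded in the event and fall back to the record otherwise. The `EventTrackable` class and the in-memory `RECORDS` hash (a stand-in for the database) are hypothetical, as are the embedded key names.

```ruby
require 'json'

# Hypothetical in-memory stand-in for the database.
RECORDS = {
  ['ExampleModel', 1] => { 'repository_path' => 'examples/1', 'repository_shard' => 'default' }
}.freeze

class EventTrackable
  def initialize(event)
    @event = event
  end

  def repository_path
    attribute('repository_path')
  end

  def repository_shard
    attribute('repository_shard')
  end

  private

  # Use the value embedded in the event when present (deleted events),
  # otherwise look it up on the record.
  def attribute(name)
    @event.fetch(name) { record.fetch(name) }
  end

  def record
    RECORDS.fetch([@event['model_name'], @event['model_id']])
  end
end

# The record for id 2 no longer exists, but the event carries what is needed.
deleted_event = JSON.parse(JSON.generate(
  'workable_event'   => 'deleted',
  'model_name'       => 'ExampleModel',
  'model_id'         => 2,
  'repository_path'  => 'examples/2',
  'repository_shard' => 'shard-b'
))
deleted = EventTrackable.new(deleted_event)
```

The service using the Trackable would not need to know which of the two sources the values came from.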

There is also a generic Geo::Trackable::GitRepository. When a model responds to repository_path and repository_shard (exact method names open for discussion), this should work out of the box. All the desired metadata (model name, model id, handle) is included in the JSON event by default.

Generating events

Including the Geo::Replicable concern also defines some methods to generate events. When a developer wants to generate an updated event, they can call object.geo_updated!(:handle) on the model instance. The rest is handled automatically.
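A minimal sketch of what `geo_updated!` could do under the hood. Only `Geo::Replicable` and `geo_updated!` come from this MR; the `geo_publish` helper, the in-memory `Geo::EVENT_LOG`, the payload keys and `ExampleModel` are assumptions for illustration.

```ruby
require 'json'

module Geo
  # Hypothetical in-memory stand-in for the event log table.
  EVENT_LOG = []

  module Replicable
    def geo_updated!(handle)
      geo_publish('updated', handle)
    end

    private

    # Collect the default metadata (model name, model id, handle) and append
    # the serialized event to the log.
    def geo_publish(event_name, handle)
      payload = {
        'workable_type'  => 'git_repository',
        'workable_event' => event_name,
        'model_name'     => self.class.name,
        'model_id'       => id,
        'handle'         => handle.to_s
      }
      Geo::EVENT_LOG << JSON.generate(payload)
      payload
    end
  end
end

# Hypothetical model; a real one would be an ActiveRecord class.
class ExampleModel
  include Geo::Replicable

  attr_reader :id

  def initialize(id)
    @id = id
  end
end

ExampleModel.new(7).geo_updated!(:repository)
```

The caller only passes the handle; everything else in the event is derived from the model instance.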

Migration path

The idea was to also generate JSON events for project repository updates, but still write the registry metadata to the existing project_registry table in the tracking database. I'm not entirely sure that will be possible, because the new generic git repository registry will use the state machine, and I'm not sure that will be compatible with the existing columns of the project registry.

Problems

  • Counting things: With every git repo thrown together in one Registry table, it's harder to count the number of project repos, wiki repos... But is that really a problem? We're counting repos. And they all should be
  • Prioritization: Also by throwing all git repos together it's harder to give project repos a higher priority than wiki repos.

Resources to review

Related issues

#35540 (closed)

Edited by Toon Claes
