# Alternative Geo scheduler
This approach describes an architecture where Geo uses the simplest replication strategy I can think of. Migration would not require much effort, and it does not require changing everything. The main benefit here is that we don't need to compare huge data sets.
## Primary data design
The geo_event_log table stores all the events, including those for uploaded files, LFS objects, and so on. It should not include any assets stored on external object storage.
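For illustration, a minimal sketch of what such a table could look like, assuming a simple generic layout (the column names and types here are hypothetical, not the actual GitLab schema):

```sql
-- Hypothetical, simplified layout for geo_event_log (not the actual GitLab schema).
-- Each row records one replication-relevant change on the primary.
CREATE TABLE geo_event_log (
  id          bigserial   PRIMARY KEY,
  event_type  text        NOT NULL, -- e.g. 'repository_updated', 'upload_created', 'lfs_object_created'
  subject_id  bigint      NOT NULL, -- ID of the project, upload, or LFS object the event refers to
  created_at  timestamptz NOT NULL DEFAULT now()
);
```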
## Secondary data design
Table sync_registry:
| id | last_processed_event_id | last_processed_lfs_object_id | last_processed_upload_id | last_processed_project_id |
|---|---|---|---|---|
| 1 | 456 | 23 | 56 | 23 |
Failures table: the exact design isn't important for this document; something like the existing registry table.
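A minimal sketch of these tracking-database tables, with hypothetical column types and a hypothetical sync_failures name for the failures table:

```sql
-- Sketch of the tracking tables on a secondary (names and types are assumptions).
CREATE TABLE sync_registry (
  id                           bigserial PRIMARY KEY,
  last_processed_event_id      bigint NOT NULL DEFAULT 0,
  last_processed_lfs_object_id bigint NOT NULL DEFAULT 0,
  last_processed_upload_id     bigint NOT NULL DEFAULT 0,
  last_processed_project_id    bigint NOT NULL DEFAULT 0
);

-- Failures table, roughly analogous to the existing registry table.
CREATE TABLE sync_failures (
  id          bigserial PRIMARY KEY,
  event_id    bigint NOT NULL,    -- ID of the failed geo_event_log event
  event_type  text   NOT NULL,
  subject_id  bigint NOT NULL,
  retry_count int    NOT NULL DEFAULT 0,
  last_error  text,
  UNIQUE (event_type, subject_id) -- lets duplicate failures collapse into one row
);
```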
## Streaming replication
As we have access to the geo_event_log table on each secondary server, we can replay new events. The position is tracked using the sync_registry.last_processed_event_id column.
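A sketch of one replay iteration, assuming the simplified schema above. The event log lives in the replicated (read-only) primary database while sync_registry lives in the local tracking database, so the cursor is read and written separately:

```sql
-- 1. Read the cursor from the tracking database on the secondary.
SELECT last_processed_event_id FROM sync_registry WHERE id = 1;

-- 2. Fetch the next batch of events from the replicated geo_event_log
--    (456 stands in for the cursor value read in step 1).
SELECT id, event_type, subject_id
FROM geo_event_log
WHERE id > 456
ORDER BY id
LIMIT 1000;

-- 3. After the batch has been replayed, advance the cursor in the tracking database.
UPDATE sync_registry SET last_processed_event_id = 1456 WHERE id = 1;
```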
## Retry
All event IDs that have failed and have to be retried are stored locally in the tracking database. Some events can also be collapsed; for example, a repository_update event does not have to be put into the failures table twice for the same repository, as retrying it more than once makes no sense.
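One way to record a failure while collapsing duplicates, assuming the unique constraint on (event_type, subject_id) from the hypothetical sync_failures sketch above:

```sql
-- Record a failed repository_updated event for project 23; if the same failure
-- is already pending, bump its retry counter instead of adding another row.
INSERT INTO sync_failures (event_id, event_type, subject_id, last_error)
VALUES (789, 'repository_updated', 23, 'git fetch timed out')
ON CONFLICT (event_type, subject_id)
DO UPDATE SET event_id    = EXCLUDED.event_id,
              retry_count = sync_failures.retry_count + 1,
              last_error  = EXCLUDED.last_error;
```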
## Backfill strategy
Every secondary node has a created_at field, so we know which items have to be backfilled by comparing their own created_at column against it; anything created after the node was added is already covered by the event log. For example, to retrieve all projects that have to be backfilled, we request them using the clause WHERE projects.created_at <= #{current_node_created_at}. We need to fetch the list ordered by ID so that we can use sync_registry.last_processed_project_id as a cursor.

So the clause above becomes WHERE projects.created_at <= #{current_node_created_at} AND projects.id > #{last_processed_project_id}.
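Put together, a backfill batch for projects could be fetched like this (the literal values stand in for the node's created_at and the stored cursor):

```sql
-- Fetch the next batch of projects to backfill: only projects that already
-- existed before this secondary was added, ordered by ID so the cursor can advance.
SELECT id
FROM projects
WHERE projects.created_at <= '2017-09-01 00:00:00+00' -- current_node_created_at
  AND projects.id > 23                                -- sync_registry.last_processed_project_id
ORDER BY projects.id
LIMIT 1000;

-- After the batch is scheduled, advance the cursor in the tracking database.
UPDATE sync_registry SET last_processed_project_id = 1023 WHERE id = 1;
```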
All failures should be treated the same way as for streaming replication.
## Pruning the geo_event_log table
All old events that are not referenced by any configured Geo node have to be pruned. We can request the last processed event ID from every secondary node via the API; this is actually how it works now.
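A sketch of the pruning step on the primary, assuming the minimum last_processed_event_id across all secondaries has already been collected over the API (the literal value below is a placeholder for that minimum):

```sql
-- Delete events that every configured secondary has already processed.
-- 456 stands in for MIN(last_processed_event_id) across all secondary nodes.
DELETE FROM geo_event_log
WHERE id <= 456;
```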
## Event collapsing
Events can be grouped by type so that we don't replay several updates when we really only need one of them. Only repositories can benefit from this, as file events are idempotent.
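As an illustration, repository update events for the same project could be collapsed to one row per project when reading a batch, using the simplified schema sketched earlier:

```sql
-- Keep only the newest repository_updated event per project within the batch;
-- replaying the latest one is enough to bring the repository up to date.
SELECT DISTINCT ON (subject_id) id, subject_id
FROM geo_event_log
WHERE event_type = 'repository_updated'
  AND id > 456 -- last_processed_event_id
ORDER BY subject_id, id DESC;
```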
## Problems
- We need to implement some approach to prevent the situation where a sync has failed but has not been put into the failures table. I think there are many ways to do that.