Data retention for Geo deleted events
The primary prunes the Geo event log on a regular basis (every 2h). However, a bug caused this pruning to delete only the
geo_event_log rows, not the associated rows in
!6175 (merged) will fix that issue. However, while investigating incorrect sync numbers (in gitlab-com/migration#295 (closed)), it proved very useful to still have the deleted events, even after all the nodes had handled them.
Looking at the current numbers:
So far, about 500k deleted events have been generated in total, compared with more than 60M updated events.
So maybe we can make the pruning less aggressive:
- prune the geo_repository_updated_events like we have been doing. These are the majority of all events (over 95%), but are not very helpful for troubleshooting
- keep the geo_repository_deleted_events for x months after they're created and handled by all secondaries
- repository_renamed_events might also be useful
- repository_created_events aren't critical, because the ProjectSyncWorker will pick up new projects anyway
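
The policy proposed above could be sketched roughly as follows. This is only an illustration in Python, not the actual Geo pruning code: the `Event` class, the `is_prunable` function, the event kind labels and the `min_handled_log_id` parameter are all hypothetical names, and the retention window (the "x months") is deliberately left as a caller-supplied value. It reads the proposal as: an event may be pruned only once every secondary has handled it, and event kinds with a retention window must also be older than that window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict


@dataclass
class Event:
    """Hypothetical stand-in for one event-log row plus its associated event row."""
    kind: str             # assumed labels: "updated", "deleted", "renamed", "created"
    log_id: int           # id of the event-log row
    created_at: datetime


def is_prunable(event: Event,
                min_handled_log_id: int,
                now: datetime,
                retention: Dict[str, timedelta]) -> bool:
    """Return True if `event` may be pruned under the proposed policy.

    `min_handled_log_id` is assumed to be the lowest event-log id that
    every secondary has already processed.
    `retention` maps an event kind to how long it is kept after creation;
    kinds that are not listed are pruned as soon as they are handled.
    """
    # Never prune an event that some secondary has not handled yet.
    if event.log_id > min_handled_log_id:
        return False

    # Kinds without a retention window are pruned right away.
    window = retention.get(event.kind)
    if window is None:
        return True

    # Kinds with a retention window are kept until the window has passed.
    return now - event.created_at >= window
```

With `retention = {"deleted": <some timedelta>}`, updated and created events are pruned as before while deleted events stay around for troubleshooting. Keeping renamed events as well would only mean adding a `"renamed"` entry to the mapping.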