Disable Geo on gprd
At the moment we still have Geo enabled on gprd:
- gprd is the primary node
- azure.gitlab.com is a phony secondary node
So we keep collecting Geo event logs on gprd, but no one is consuming them, so the rows keep piling up in the database.
Now, we can easily delete all the Geo nodes. Events will then no longer be generated, and the existing ones will eventually be deleted by the Geo::PruneEventLogWorker. But I'm a bit worried that the cleanup will cause performance degradation, which already happened in the past: gitlab-com/infrastructure#4231
In gitlab-org/gitlab-ee!5835 we made the Geo::PruneEventLogWorker delete the rows in batches. But I've learned from gitlab-org/gitlab-ee!6175 that this is not enough, because we need to give the database time to do some vacuuming between batches. See some of the discussions about it here and here.
Proposed solution
In gitlab-org/gitlab-ee!6175 I've made the migration delete the created/updated/deleted/etc. event rows in batches and reschedule itself every 5 minutes to do the next batch. Something similar also needs to be done for the "regular" pruning of events.
After that is merged and deployed, we can disable Geo on gprd (remove the Geo nodes).
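
For illustration, the batch-and-reschedule idea from the proposed solution could look roughly like the sketch below. This is not the code from either merge request: FakeEventLog, FakeScheduler, prune_in_batches and the batch size are hypothetical names and values, and an in-memory array stands in for the event log table so the example runs on its own. Only the 5 minute delay comes from the description above.

```ruby
# Hypothetical stand-in for the event log table: an in-memory list of row ids.
class FakeEventLog
  attr_reader :rows

  def initialize(count)
    @rows = (1..count).to_a
  end

  # Delete up to `limit` rows with id <= max_id and return how many were deleted.
  def delete_batch(max_id, limit)
    doomed = @rows.select { |id| id <= max_id }.first(limit)
    @rows -= doomed
    doomed.size
  end
end

# Hypothetical scheduler that records delayed jobs instead of actually waiting.
class FakeScheduler
  attr_reader :scheduled

  def initialize
    @scheduled = []
  end

  def perform_in(delay_seconds, &job)
    @scheduled << [delay_seconds, job]
  end

  # Run queued jobs in order until none are left.
  # A real job queue would wait for the delay before each run.
  def drain
    until @scheduled.empty?
      _delay, job = @scheduled.shift
      job.call
    end
  end
end

# Delete one batch, then reschedule instead of looping, so the database
# gets idle time (for example for vacuuming) between batches.
def prune_in_batches(log, scheduler, max_id:, batch_size:, delay:, stats:)
  deleted = log.delete_batch(max_id, batch_size)
  stats[:runs] += 1
  stats[:deleted] += deleted

  # A full batch means more prunable rows may remain; a short batch means we are done.
  return unless deleted == batch_size

  scheduler.perform_in(delay) do
    prune_in_batches(log, scheduler, max_id: max_id, batch_size: batch_size,
                                     delay: delay, stats: stats)
  end
end

log = FakeEventLog.new(25)
scheduler = FakeScheduler.new
stats = { runs: 0, deleted: 0 }

# Assumed values: batches of 10 rows, next batch 5 minutes (300 s) later.
prune_in_batches(log, scheduler, max_id: 22, batch_size: 10, delay: 300, stats: stats)
scheduler.drain
```

The point of rescheduling rather than looping inside a single job is that each run stays short and the gap between runs is explicit, which is the part that batching alone did not give us.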