Partition the events table by year
We have a prune events worker that prunes user activity older than 2 years.
This retention period was extended from 12 months in https://gitlab.com/gitlab-org/gitlab-ce/issues/52246, a short-term fix to give us additional time to consider a scaling strategy for the related table (the events table). The fix was merged in 11.4, which means a more permanent solution must be in place by October 2019.
This data is very useful, and we should never prune it unless an instance administrator explicitly does so.
Further details
See context from @yorickpeterse in https://gitlab.com/gitlab-org/gitlab-ce/issues/24244#note_60995986 on the DB challenges.
Proposal
- Partition the `events` table by range on the `created_at` column.
- Tables should be created for each year with a schema like `events-yyyy` specifying the year of the relevant records.
- Once `events` is no longer being pruned, we should remove `prune_old_events_worker.rb`.
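
To make the first two points concrete, here is a rough sketch of what the yearly partitions could look like. It assumes PostgreSQL declarative partitioning (PostgreSQL 10 or later) and a parent `events` table already declared with `PARTITION BY RANGE (created_at)`. Neither assumption is confirmed in this issue. The helper name `yearly_partition_ddl` is hypothetical, and the script only prints DDL statements; it does not touch a database.

```python
from datetime import date
from typing import List


def yearly_partition_ddl(parent: str, first_year: int, last_year: int) -> List[str]:
    """Build one CREATE TABLE ... PARTITION OF statement per calendar year.

    Assumes `parent` is already a table partitioned by range on created_at.
    """
    statements = []
    for year in range(first_year, last_year + 1):
        lower = date(year, 1, 1)        # inclusive lower bound
        upper = date(year + 1, 1, 1)    # exclusive upper bound
        partition = f"{parent}-{year}"  # naming pattern from the proposal: events-yyyy
        # Identifiers are double-quoted because the hyphen is not valid
        # in an unquoted PostgreSQL identifier.
        statements.append(
            f'CREATE TABLE "{partition}" PARTITION OF "{parent}" '
            f"FOR VALUES FROM ('{lower.isoformat()}') TO ('{upper.isoformat()}');"
        )
    return statements


if __name__ == "__main__":
    for statement in yearly_partition_ddl("events", 2017, 2019):
        print(statement)
```

Because the upper bound of a range partition is exclusive, each year's table holds rows from January 1 of that year up to, but not including, January 1 of the next. Using an underscore instead (for example `events_2018`) would avoid the need to quote the table names.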
Links / references
Edited by Jeremy Watson (ex-GitLab)