Partition the events table by year
We have a prune events worker which prunes user activity older than 2 years.
This was extended from 12 months in https://gitlab.com/gitlab-org/gitlab-ce/issues/52246, where we implemented this short-term fix to give us some additional time to consider a scaling strategy for the related table (the `events` table). This fix was merged in 11.4, which means a more permanent solution must be in place by October 2019.
This data is very useful, and we should never prune it unless an instance administrator explicitly chooses to do so.
Further details
See context from @yorickpeterse in https://gitlab.com/gitlab-org/gitlab-ce/issues/24244#note_60995986 on the DB challenges.
Proposal
- Partition the `events` table by range on the `created_at` column.
- Tables should be created for each year with a schema like `events-yyyy` specifying the year of the relevant records.
- Once `events` is no longer being pruned, we should remove `prune_old_events_worker.rb`.
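
To make the proposal concrete, here is a minimal sketch of what the yearly partitions could look like. It assumes PostgreSQL 10+ declarative partitioning and a parent `events` table declared with `PARTITION BY RANGE (created_at)`. The helper name `yearly_partition_ddl` is hypothetical and not part of GitLab. Note that a name like `events-yyyy` contains a hyphen, so it has to be double-quoted in PostgreSQL.

```ruby
# Hypothetical helper (not part of GitLab): builds the PostgreSQL DDL for one
# yearly partition of a parent table that is range-partitioned on `created_at`.
def yearly_partition_ddl(year, parent: "events")
  # Follows the `events-yyyy` naming from the proposal; the hyphen means the
  # identifier must be double-quoted.
  partition = "#{parent}-#{year}"

  # Range partition bounds are inclusive at the lower end and exclusive at the
  # upper end, so each year runs from 1 January up to (not including) the
  # following 1 January.
  from = format("%04d-01-01", year)
  to = format("%04d-01-01", year + 1)

  "CREATE TABLE \"#{partition}\" PARTITION OF #{parent} " \
    "FOR VALUES FROM ('#{from}') TO ('#{to}');"
end

# Print the statements for a few example years.
(2018..2020).each { |year| puts yearly_partition_ddl(year) }
```

In practice this would probably live in a migration rather than a standalone script, and something would need to create the next year's partition before any records for that year arrive.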