Allow Sidekiq jobs to use read-only database replicas

Currently, Sidekiq always uses the primary, even when a job does not need it. This means that all of Sidekiq's database traffic hits the primary, whereas only some web database traffic does. From https://dashboards.gitlab.net/d/000000144/postgresql-overview, we can see that none of the replicas is used as much as the primary, but all of them are in the same ballpark.

Overall, our metrics suggest that we spend more database time in web transactions (green line) than Sidekiq jobs (orange line), but Sidekiq is still a significant percentage:

[image: database time spent in web transactions (green line) vs. Sidekiq jobs (orange line)]

We currently have no way to tell whether a given worker requires read-only or read-write access to data. If we started annotating workers, we could send most of that traffic to the replicas instead, for operations that do not need read-write access or fully up-to-date data, like:

  • all notifications
  • all webhooks
  • all ...

This would allow us to move a number of SELECT statements off the primary.
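A minimal sketch of what such an annotation could look like, in plain Ruby with no Sidekiq dependency. Everything here is hypothetical and only meant to illustrate the idea: the `DatabaseAccessAnnotation` module, the `database_access` class method, the `database_role_for` helper, and the two example workers are invented names, not existing Sidekiq or GitLab APIs. Unannotated workers fall back to the primary, so current behaviour would be preserved until a worker opts in.

```ruby
# Hypothetical sketch: none of these names exist in Sidekiq or GitLab.
module DatabaseAccessAnnotation
  VALID_MODES = %i[read_write read_only].freeze

  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Declare what kind of database access this worker needs.
    def database_access(mode)
      unless VALID_MODES.include?(mode)
        raise ArgumentError, "unknown database access mode: #{mode.inspect}"
      end

      @database_access = mode
    end

    # Workers without an annotation default to read-write,
    # which matches today's "always use the primary" behaviour.
    def database_access_mode
      @database_access || :read_write
    end
  end
end

# Decide which database role a job for the given worker class should use.
# Classes that were never annotated are routed to the primary.
def database_role_for(worker_class)
  if worker_class.respond_to?(:database_access_mode) &&
     worker_class.database_access_mode == :read_only
    :replica
  else
    :primary
  end
end

# Example worker that tolerates slightly stale data (assumed for illustration).
class WebhookWorker
  include DatabaseAccessAnnotation
  database_access :read_only

  def perform(hook_id); end
end

# Example worker with no annotation: stays on the primary.
class MergeWorker
  include DatabaseAccessAnnotation

  def perform(merge_request_id); end
end
```

With these definitions, `database_role_for(WebhookWorker)` returns `:replica` and `database_role_for(MergeWorker)` returns `:primary`. In a real setup the lookup would presumably live in Sidekiq server middleware that switches the connection before the job runs; that part is left out here because it depends on how load balancing is wired up.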

group::scalability is already spending a lot of effort on annotating workers; maybe we could follow the same pattern here.

Edited by Sean McGivern