Recording of used SSH keys using Sidekiq produces hundreds of thousands of jobs
MR https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/8113 added tracking of when SSH keys are used via SSH or HTTPS. This is implemented by scheduling a Sidekiq job for every push/pull, which then performs the actual update.
For GitLab.com this results in 400 000 Sidekiq jobs at peak, with the lowest amount being at least 10 000 jobs (all per minute or so). This in turn means that at peak we run:
- 400 000 database updates, creating potentially 400 000 dead tuples in the process (increasing vacuum load)
- 400 000 queries to find a key
So in the worst case that's 800 000 database queries, just to update a timestamp.
Since this data is not crucial, we should use an exclusive lease. This lease should ensure that per user we update the row at most once a day. If this has to be more accurate that's also possible, but updates should happen no more often than once an hour.
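
To illustrate the idea, here is a minimal sketch of lease-guarded updating. It is not the actual implementation: `InMemoryLease` and `KeyUsageTracker` are hypothetical names, the lease is kept in a Ruby hash rather than Redis, and the counter stands in for scheduling the Sidekiq job or running the `UPDATE`. The lease is keyed per user as proposed above; keying it per SSH key would work the same way.

```ruby
# Hypothetical in-memory stand-in for a Redis-backed exclusive lease.
# With Redis this would roughly correspond to: SET key value NX EX timeout.
class InMemoryLease
  # clock is injectable so the expiry logic can be exercised without sleeping.
  def initialize(clock: -> { Time.now })
    @expires_at = {}
    @clock = clock
  end

  # Returns true if the lease was obtained, false if it is still held.
  def try_obtain(key, timeout)
    now = @clock.call
    expiry = @expires_at[key]
    return false if expiry && expiry > now

    @expires_at[key] = now + timeout
    true
  end
end

# Hypothetical tracker: performs the expensive update at most once per
# lease window for a given user.
class KeyUsageTracker
  LEASE_TIMEOUT = 24 * 60 * 60 # one day, in seconds (assumed default)

  attr_reader :updates

  def initialize(lease)
    @lease = lease
    @updates = 0
  end

  # Returns true if an update was performed, false if it was skipped.
  def record_use(user_id)
    return false unless @lease.try_obtain("key_last_used:#{user_id}", LEASE_TIMEOUT)

    # Placeholder for the real work (schedule the job / update the timestamp).
    @updates += 1
    true
  end
end
```

With this in place, repeated pushes or pulls by the same user within the lease window cost one lease check each instead of a Sidekiq job plus two database queries. Lowering `LEASE_TIMEOUT` to one hour trades more writes for a more accurate timestamp.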